I don't think NeatVideo gets any benefit from double-precision calculation,
so unless there are further suggestions I will go with a 590.
CUDA-Z Report
=============
Version: 0.5.95
http://cuda-z.sourceforge.net/
OS Version: Windows AMD64 6.1.7601 Service Pack 1
Core Information
----------------
Name: GeForce GTS 450
Compute Capability: 2.1
Clock Rate: 1566 MHz
Multiprocessors: 4
Warp Size: 32
Regs Per Block: 32768
Threads Per Block: 1024
Watchdog Enabled: Yes
Threads Dimentions: 1024 x 1024 x 64
Grid Dimentions: 65535 x 65535 x 65535
Memory Information
------------------
Total Global: -2048 MB
Shared Per Block: 48 KB
Pitch: 2.09715e+06 KB
Total Constant: 64 KB
Texture Alignment: 512
GPU Overlap: Yes
Performance Information
-----------------------
Memory Copy
Host Pinned to Device: 5646.34 MB/s
Host Pageable to Device: 2622.25 MB/s
Device to Host Pinned: 5682.86 MB/s
Device to Host Pageable: 3574.45 MB/s
Device to Device: 7036.69 MB/s
GPU Core Performance
Single-precision Float: 398876 Mflop/s
Double-precision Float: 50088.9 Mflop/s
32-bit Integer: 200119 Miop/s
24-bit Integer: 198782 Miop/s
Generated: Wed Mar 14 20:05:38 2012
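A note on the negative "Total Global" figure here (and in the Tesla C2050 report): it looks like a wrap-around in the tool rather than a real reading. The sketch below assumes CUDA-Z 0.5.95 keeps the byte count in a signed 32-bit integer, which I have not verified; `unwrap_signed_mb` is a hypothetical helper of mine, not part of CUDA-Z.

```python
MIB = 1024 * 1024  # bytes per MB as CUDA-Z appears to report it (assumption)

def unwrap_signed_mb(reported_mb, bits=32):
    """Reinterpret a possibly negative MB figure as an unsigned byte count.

    Assumes the value wrapped in a signed integer of the given width.
    """
    reported_bytes = reported_mb * MIB
    if reported_bytes < 0:
        # Add 2**bits to recover the unsigned value.
        reported_bytes += 1 << bits
    return reported_bytes // MIB

gts450_mb = unwrap_signed_mb(-2048)   # GTS 450 report
c2050_mb = unwrap_signed_mb(-1408)    # Tesla C2050 report
gtx550_mb = unwrap_signed_mb(1024)    # GTX 550 Ti report, already positive

print(gts450_mb, c2050_mb, gtx550_mb)
```

Under that assumption the GTS 450 would have 2048 MB and the Tesla C2050 2688 MB of global memory, while the 1024 MB of the GTX 550 Ti is unaffected.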
CUDA-Z Report
=============
Version: 0.5.95
http://cuda-z.sourceforge.net/
OS Version: Windows AMD64 6.1.7601 Service Pack 1
Core Information
----------------
Name: GeForce GTX 550 Ti
Compute Capability: 2.1
Clock Rate: 1820 MHz
Multiprocessors: 4
Warp Size: 32
Regs Per Block: 32768
Threads Per Block: 1024
Watchdog Enabled: Yes
Threads Dimentions: 1024 x 1024 x 64
Grid Dimentions: 65535 x 65535 x 65535
Memory Information
------------------
Total Global: 1024 MB
Shared Per Block: 48 KB
Pitch: 2.09715e+06 KB
Total Constant: 64 KB
Texture Alignment: 512
GPU Overlap: Yes
Performance Information
-----------------------
Memory Copy
Host Pinned to Device: 5690.54 MB/s
Host Pageable to Device: 2580.25 MB/s
Device to Host Pinned: 5694.54 MB/s
Device to Host Pageable: 3559.29 MB/s
Device to Device: 35197.9 MB/s
GPU Core Performance
Single-precision Float: 464256 Mflop/s
Double-precision Float: 58291.8 Mflop/s
32-bit Integer: 232898 Miop/s
24-bit Integer: 231269 Miop/s
Generated: Wed Mar 14 20:40:30 2012
CUDA-Z Report
=============
Version: 0.5.95
http://cuda-z.sourceforge.net/
OS Version: Windows AMD64 6.1.7601 Service Pack 1
Core Information
----------------
Name: Tesla C2050
Compute Capability: 2.0
Clock Rate: 1147 MHz
Multiprocessors: 14
Warp Size: 32
Regs Per Block: 32768
Threads Per Block: 1024
Watchdog Enabled: Yes
Threads Dimentions: 1024 x 1024 x 64
Grid Dimentions: 65535 x 65535 x 65535
Memory Information
------------------
Total Global: -1408 MB
Shared Per Block: 48 KB
Pitch: 2.09715e+06 KB
Total Constant: 64 KB
Texture Alignment: 512
GPU Overlap: Yes
Performance Information
-----------------------
Memory Copy
Host Pinned to Device: 5759.45 MB/s
Host Pageable to Device: 2524.57 MB/s
Device to Host Pinned: 5689.37 MB/s
Device to Host Pageable: 3452.75 MB/s
Device to Device: 50980.1 MB/s
GPU Core Performance
Single-precision Float: 1.02294e+06 Mflop/s
Double-precision Float: 512848 Mflop/s
32-bit Integer: 513210 Miop/s
24-bit Integer: 512631 Miop/s
Generated: Wed Mar 14 20:53:44 2012
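To put the double-precision point in numbers, here is a small sketch that compares the single- and double-precision figures from the three reports above. The throughput values are copied from the reports; the dictionary layout and the `sp_dp_ratio` helper are just my own illustration.

```python
# Throughput figures copied from the CUDA-Z reports above, in Mflop/s.
reports = {
    "GeForce GTS 450": {"single": 398876.0, "double": 50088.9},
    "GeForce GTX 550 Ti": {"single": 464256.0, "double": 58291.8},
    "Tesla C2050": {"single": 1.02294e6, "double": 512848.0},
}

def sp_dp_ratio(entry):
    """Return how many times faster single precision is than double."""
    return entry["single"] / entry["double"]

ratios = {name: sp_dp_ratio(entry) for name, entry in reports.items()}

# Single-precision speed of each card relative to the GTS 450.
baseline = reports["GeForce GTS 450"]["single"]
relative_sp = {name: entry["single"] / baseline for name, entry in reports.items()}

for name in reports:
    print(f"{name}: SP/DP = {ratios[name]:.2f}, "
          f"SP vs GTS 450 = {relative_sp[name]:.2f}x")
```

The two GeForce cards come out at roughly 8:1 single to double, the Tesla at roughly 2:1. If NeatVideo really only uses single precision, the Tesla's main advantage in these numbers is its raw single-precision throughput, not its double-precision rate.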