Increasing Performance for 4K video
Posted: Wed Jul 19, 2017 9:52 pm
I'd like to know which parts of my system I can upgrade to increase noise reduction performance for 4K video.
Which has the greatest impact?
1. CPU Cores?
2. CPU Speed?
3. CPU Memory Bandwidth?
4. CPU Memory Amount?
5. CPU L1 cache size?
6. CPU L2 cache size?
7. CPU L3 cache size?
8. CPU Instruction Set (SSEx, AVX, FMA, etc.)?
9. GPU computational units (shading unit quantity)?
10. GPU Memory Bandwidth?
11. GPU Memory Amount?
12. GPU Instruction Set (OpenCL or CUDA version)?
13. GPU Quantity?
14. Storage Speed in MB/sec?
I am planning a system build, and after looking at the performance gains from adding multiple GPUs, I see little performance difference between one and four GPUs in a system. More system memory seems to be of little help, as does having dozens of CPU cores. To sum it up, I don't want to end up with a $6000 computer that runs Neat Video only as fast as a $1000 PC.
Is this an architecture limitation? Is there a plan to increase the noise reduction scaling when multiple cores and multiple GPUs are available?
On a more technical note, does your algorithm include any O(N^2) or higher-order operations, particularly when using temporal filtering? Are there dependencies in the calculations that prevent parallel processing?
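To put a rough number on the scaling question, here is a minimal sketch based on Amdahl's law. It is a generic model of parallel speedup and says nothing about how Neat Video is actually implemented. The function names and the 1.2x speedup figure are hypothetical and chosen only for illustration; they are not measurements.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup when `parallel_fraction` of the work scales across `workers`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_fraction_from_speedup(speedup: float, workers: int) -> float:
    """Invert Amdahl's law: estimate the parallel fraction from an observed speedup."""
    if workers < 2:
        raise ValueError("need at least 2 workers to estimate the fraction")
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


if __name__ == "__main__":
    # Hypothetical figure: suppose four GPUs were only 1.2x faster than one.
    observed_speedup = 1.2
    p = parallel_fraction_from_speedup(observed_speedup, workers=4)
    print(f"Implied parallel fraction: {p:.2f}")

    # Under the same model, what would more GPUs buy?
    for n in (1, 2, 4, 8):
        print(f"{n} GPU(s): {amdahl_speedup(p, n):.2f}x")
```

If the implied parallel fraction turns out to be small, adding more GPUs or cores cannot help much under this model, which is why I am asking whether the limit is in the algorithm itself or elsewhere (for example, data transfer or the host application).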