v3 Performance?

patrick
Posts: 8
Joined: Tue Dec 04, 2007 10:56 pm

v3 Performance?

Post by patrick » Thu Sep 15, 2011 10:25 am

I'm interested in the performance gains of v3. Did anyone benchmark it? I read the promo, but it doesn't make clear how much of a gain I should expect when cleaning 1080p video (max temporal resolution, non-adaptive).

I can benchmark the demo using a lower-res video, but does processing time scale linearly with resolution? So double the resolution = double the processing time?

Thanks in advance.

NVTeam
Posts: 2261
Joined: Thu Sep 01, 2005 4:12 pm

Post by NVTeam » Thu Sep 15, 2011 9:20 pm

Yes, it is quite linear: the larger the frame, the longer it takes.
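
As a rough illustration of that linear scaling, here is a minimal sketch that extrapolates a low-resolution demo benchmark to 1080p. It assumes "linear" means processing time per frame is proportional to the number of pixels, which is an interpretation and not something stated by the Neat Video team. The function name `estimate_fps` and the 25 fps sample measurement are hypothetical.

```python
def estimate_fps(measured_fps, measured_size, target_size):
    """Scale a measured frame rate to another frame size.

    Assumption: time per frame is proportional to the pixel count
    (width * height), so frames/sec is inversely proportional to it.
    """
    measured_pixels = measured_size[0] * measured_size[1]
    target_pixels = target_size[0] * target_size[1]
    return measured_fps * measured_pixels / target_pixels


# Hypothetical demo measurement: 25 fps on a 720x576 clip.
fps_1080p = estimate_fps(25.0, (720, 576), (1920, 1080))
print(f"Estimated 1080p speed: {fps_1080p:.1f} fps")
```

Note that under this assumption, doubling both width and height quadruples the pixel count, so the processing time would be about four times longer, not two.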

Vlad
Neat Video team
noise reduction for video and photos

Pyramid Pyro
Posts: 8
Joined: Wed Dec 08, 2010 6:32 pm

Definite Gains

Post by Pyramid Pyro » Tue Oct 11, 2011 12:09 am

I'm a long-time (about 1 year or so) NEAT video user and fan. I've been very impressed with the v2 quality and results. The big complaint was always the slow processing times. v2 never took advantage of CUDA. We spoke up. The NEAT team answered.

I just now (wish I had known about the release sooner) downloaded the v2 to v3 upgrade. After updating the settings I ran a quick test on my machine, equipped with PPro 5.0.3 and a GTX 470. The processor is a stock i7 930 with 12GB of RAM. I'm using 4-5 discs, if anyone cares.

Where the GPU never kicked in on export with v2, I'm now seeing it shoot up to 50% usage with v3. This is great. As a result, my export times seem to have dropped by 50% or more. I will do more in-depth testing later, but the CUDA feature is a definite winner. Hands down, there is a huge improvement on export time with NEAT applied.

I would say if you like the export quality and hate the export times, then the upgrade is worth it.

I have not done side-by-side quality comparisons.

Good job guys,

Erik

Edit: my source footage is regular ol' HDV.

jpsdr
Posts: 196
Joined: Mon Aug 11, 2008 7:33 am

Post by jpsdr » Tue Oct 11, 2011 7:52 am

Personally, as I have a very good CPU (i7 980), I have until now, with the few cards I've tested, always had better results with CPU only.
On the other hand, I haven't tested with high-end cards.
From my testing, CUDA is not worthwhile if you already have a very good CPU.

NVTeam
Posts: 2261
Joined: Thu Sep 01, 2005 4:12 pm

Post by NVTeam » Tue Oct 11, 2011 2:05 pm

Here are some absolute figures from Neat Video's Optimize (frame: 1920x1080p, 8 bits per channel, radius: 1 frame) running on somewhat faster cards:

- GPU only (GeForce GTX 470): 6.41 frames/sec

- GPU only (GeForce GTX 580): 10.1 frames/sec

- GPU only (GeForce GTX 590 #1, GeForce GTX 590 #2): 13.5 frames/sec

For comparison, running on i7 2600, 3.4GHz, fast RAM:
- CPU only (4 cores): 6.41 frames/sec

Hope this helps,
Vlad
Last edited by NVTeam on Tue Oct 11, 2011 5:31 pm, edited 1 time in total.

Pyramid Pyro
Posts: 8
Joined: Wed Dec 08, 2010 6:32 pm

More Accurate Results

Post by Pyramid Pyro » Tue Oct 11, 2011 4:13 pm

jpsdr: indeed that is a very good CPU. Do you have any timing numbers and the GPU model?

For all to review,

I ran some more detailed tests today. Well, as detailed as I could get. Here are some findings (same machine as mentioned before).

v2: I can't find any specifics but if I remember correctly the export times with NEAT applied all the way through would yield 12x the sequence length. So with that estimation, a :30 would take 6:00 to export. I should have run a test prior to the v3 upgrade to get more accurate figures. Maybe someone else can point out my old performance post.

v3 CPU/GPU: :30 clip with NEAT applied all the way through took 3:04.

v3 CPU only: :30 clip with NEAT applied all the way through took 4:52.

The way I read this is that v3 is a great improvement over v2 for my machine. Export times are reduced by roughly 50% (from an estimated 6:00 to 3:04) thanks to improved CPU utilization plus the new CUDA features. I may try some CPU overclocking and maybe even OC the GPU (maybe).
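
For anyone who wants to redo the arithmetic with their own timings, here is a small sketch using the three figures from this post. The v2 time of 6:00 is only the from-memory estimate given above, not a measurement, and the helper names `to_seconds` and `summarize` are made up for this example.

```python
def to_seconds(mmss):
    """Convert a 'M:SS' string such as '3:04' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


CLIP_SECONDS = 30  # length of the test clip

exports = {
    "v2 (estimate from memory)": to_seconds("6:00"),
    "v3 CPU only": to_seconds("4:52"),
    "v3 CPU/GPU": to_seconds("3:04"),
}
baseline = exports["v2 (estimate from memory)"]


def summarize(export_seconds):
    """Return (export time as a multiple of clip length, % reduction vs v2)."""
    ratio = export_seconds / CLIP_SECONDS
    reduction = 100 * (baseline - export_seconds) / baseline
    return ratio, reduction


for name, secs in exports.items():
    ratio, reduction = summarize(secs)
    print(f"{name}: {ratio:.1f}x clip length, {reduction:.0f}% shorter than v2")
```

With these inputs the CPU/GPU export comes out about 49% shorter than the v2 estimate and the CPU-only export about 19% shorter.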

Your mileage may vary...

Best of luck to all :)

Erik

jpsdr
Posts: 196
Joined: Mon Aug 11, 2008 7:33 am

Post by jpsdr » Wed Oct 12, 2011 8:28 am

I've built a PC dedicated to video processing, so I'm absolutely not interested in the 3D capabilities of the video card.
My standard video card is an ASUS ENGT520 SILENT/DI/1GD3(LP) 1 GB.
Poor bandwidth as far as I know, but I don't mind; what interests me is that this card has (and for now only the 520 has) the new VP5 engine, which gives excellent results when decoding h264 video (I'm using neuron2's dgdecodenv for working with Blu-ray material). So the result with this card is really poor.

I've tried adding a second card: MSI N450GTS-MD2GD3.
My motherboard allows up to 3 cards; the 1st card is always x16, and the configuration of the 2 others can be x8/x8 or x16/x1. Of course, I chose x16/x1, which allowed me to have both cards at x16.
The best result was again with CPU only.
All configurations were tested (card1 only, card2 only, card1 & card2).

If I remember correctly, under Windows XP, on my i7 980, with the VD plugin:
- Best results were with around 7 or 8 CPU cores, not 12.
- 576p, radius 2, around 25-30fps (don't remember exactly).
- 1080p, radius 2, around 4-5fps (don't remember exactly).

Vlad, can you provide, for reference, a benchmark with a radius of 2 for the VD plugin under Windows XP?

NVTeam
Posts: 2261
Joined: Thu Sep 01, 2005 4:12 pm

Post by NVTeam » Wed Oct 12, 2011 10:41 am

Cannot measure them directly right now.
My estimate is that for radius 2 the figures would be about 20% lower, like 8fps (1080p) and 30fps (576p) for the GTX 580 (GPU only).

Vlad

jpsdr
Posts: 196
Joined: Mon Aug 11, 2008 7:33 am

Post by jpsdr » Wed Oct 12, 2011 5:39 pm

OK, I've redone the tests:
Windows XP64 SP2, with VDub64 and the 64-bit plugin, using the benchmark tool provided in NV. Core i7 980, no overclocking, P6T Deluxe/OC Palm motherboard, DDR3 Kingston XMP FSB1800 memory.
radius 2 : 720x576p : best result 8 cores (of 12) with 22.2fps
radius 1 : 720x576p : best result 8 cores (of 12) with 25fps
radius 2 : 1920x1080p : best result 8 cores (of 12) with 4.48fps
radius 1 : 1920x1080p : best result 8 cores (of 12) with 5.1fps
... What surprises me is that with a 4-core i7 2600, Vlad gets better results with CPU only...
Vlad, can you somehow run the same tests with exactly the same configuration as mine (except for the CPU)? => XP64 SP2 + VDub64 + 64-bit plugin?
Unless it's a known fact that an i7 2600 at 3.4GHz is really better than an i7 980 at 3.33GHz, there is really something odd.
Could the problem be:
- Caused by the OS?
- Caused by the plugin version (VDub)?
- Something else... "bad" handling of hyperthreading?

As far as I know, from the benchmarks I see on Doom9, I have excellent x264 speed, so for now I would rule out the hardware.

NVTeam
Posts: 2261
Joined: Thu Sep 01, 2005 4:12 pm

Post by NVTeam » Wed Oct 12, 2011 6:29 pm

Sorry, cannot check in XP-64. In Win7-64, on i7 2600 3.4GHz using the current build of the v3 plug-in from the website:

1920x1080p:
radius 1: 7 cores, 5.85 frames/sec
radius 2: 7 cores, 5.26 frames/sec

720x576p
radius 1: 5 cores, 28.6 frames/sec
radius 2: 6 cores, 25.6 frames/sec

I double-checked the Max Turbo Frequency setup, and it turns out that earlier the i7 2600 automatically switched to 3.8GHz under load, which explains the somewhat higher figure earlier (6.41 fps earlier vs 5.85 fps now). I have now fixed its speed at 3.4GHz to do these measurements for you.
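
As a quick sanity check on that explanation, the sketch below compares the clock ratio with the ratio of the two quoted 1080p, radius 1 figures. It assumes throughput scales in direct proportion to CPU clock, which is a simplification, and the two runs may have differed in other ways too. The variable names are made up for this example.

```python
# Figures quoted in this thread for the i7 2600 (1920x1080p, radius 1, CPU only).
FPS_AT_TURBO = 6.41  # earlier run, CPU boosting to about 3.8 GHz under load
FPS_AT_FIXED = 5.85  # later run, clock fixed at 3.4 GHz

TURBO_GHZ = 3.8
FIXED_GHZ = 3.4

clock_ratio = TURBO_GHZ / FIXED_GHZ
fps_ratio = FPS_AT_TURBO / FPS_AT_FIXED

# Assumption: frames/sec is proportional to clock speed.
predicted_turbo_fps = FPS_AT_FIXED * clock_ratio

print(f"Clock ratio:      {clock_ratio:.3f}")
print(f"Measured ratio:   {fps_ratio:.3f}")
print(f"Predicted at 3.8: {predicted_turbo_fps:.2f} fps (measured {FPS_AT_TURBO})")
```

The measured speed-up (about 10%) is slightly below the clock increase (about 12%), which is at least consistent with the turbo clock accounting for most of the difference.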

Hope this helps,
Vlad

jpsdr
Posts: 196
Joined: Mon Aug 11, 2008 7:33 am

Post by jpsdr » Wed Oct 12, 2011 7:20 pm

I've found a test (in French) which compares the i7 2600 and the i7 980X. It shows slightly better results for the i7 980X in the multi-threading tests, and similar results otherwise. Since I don't have the X version, my results may be normal.
Still curious to know whether the OS has an effect...

patrick
Posts: 8
Joined: Tue Dec 04, 2007 10:56 pm

Re: More Accurate Results

Post by patrick » Fri Dec 02, 2011 7:54 pm

Pyramid Pyro wrote: with that estimation, a :30 would take 6:00 to export.

v3 CPU only: :30 clip with NEAT applied all the way through took 4:52.
So there's a 19% improvement for ATI users. Thanks for benchmarking.

I really hope for OpenCL support. I have an i7 920 (12GB) with an HD6950 (2GB). I'm sure GPU acceleration would help quite a bit.

mathewlisett
Posts: 56
Joined: Fri May 28, 2010 8:51 pm

Post by mathewlisett » Sat Mar 31, 2012 12:58 pm

Since this topic seems to be literally on topic for my question, I'll ask here.

With NV v3 in Sony Vegas Pro 11, what card specs should I be looking at for real-time playback?

NVTeam
Posts: 2261
Joined: Thu Sep 01, 2005 4:12 pm

Post by NVTeam » Sat Mar 31, 2012 1:14 pm

Speed depends on frame size (and filter settings), and for certain sizes realtime may be achievable only with multiple cards.

Anyway, I recommend taking a look at this topic for additional information about the performance of different cards.
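
To put the dependence on frame size in numbers, here is a minimal sketch that compares processing speed with a playback rate. The GTX 580 figures are the ones quoted earlier in this thread (two of them are estimates, and all come from the Optimize benchmark, not from Vegas playback). The 25 fps playback rate and the function name `realtime_factor` are assumptions made for this example.

```python
def realtime_factor(processing_fps, playback_fps):
    """Return how many times faster than playback the filter runs.

    A value of 1.0 or more means the filter can keep up with playback.
    """
    return processing_fps / playback_fps


PLAYBACK_FPS = 25.0  # assumed playback rate, not taken from the thread

# GTX 580, GPU only, as quoted earlier in this thread.
measurements = {
    "1920x1080, radius 1": 10.1,
    "1920x1080, radius 2 (estimate)": 8.0,
    "720x576, radius 2 (estimate)": 30.0,
}

for label, fps in measurements.items():
    factor = realtime_factor(fps, PLAYBACK_FPS)
    verdict = "realtime" if factor >= 1.0 else "slower than realtime"
    print(f"{label}: {factor:.2f}x ({verdict})")
```

On these numbers a single GTX 580 would keep up at 576p but not at 1080p, which matches the point that larger frames may need more than one card.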

Vlad
