Random thoughts, trials & tribulations: rendering speed

Zach
Posts: 38
Joined: Sat Jun 01, 2013 12:37 pm

Post by Zach »

This initially started as a question about how the benchmark stats are calculated.
But I expanded on it by sharing my own experiences, and there may be a question or two scattered amongst the wall of text this post turned into...


I'm not necessarily complaining or anything, as I'm just kind of curious.

But I have noticed somewhat larger than expected differences in benchmark vs real world results. I'm not expecting the same numbers, as real-world usage is never so easy to quantify, but still.

Running the optimizer gives me benchmarks of around 22 FPS with GPU + 4 cores @ 1920x1080 (radius 1 or disabled, I don't recall atm)

In the real world, though, I've always maxed out at around 10 fps. I've done what tweaking I could to try and improve it, but nothing seems to improve the speed.

Based on some quick math, I average around 6 - 7 fps piping straight through to x264 (fed by Avisynth) with real-time filtering in the chain, which is not bad at all (when working on Anime at least, and compared to CPU-heavy AVS scripts that crawl at fractions of a frame per second).

I have noticed some apparent differences when using different source filters (DirectShow vs FFmpeg input drivers for Virtualdub), possibly even with x64 being slightly slower (but that is somewhat anecdotal).

I think, for whatever reason, I saw the best performance when using the DirectShow input plugin for Virtualdub (x32 at the time) and saving out to a lossless UT Codec intermediate to be encoded later. I think I may have tested it fed directly to x264, but I can't remember. Under Dshow I think I saw performance as high as 12 - 13 fps.

But I got paranoid about colorspace conversions (the Dshow input flags the colorspace of BluRay source files as RGB888; incidentally, so does opening a yuv420 ULH0 AVI directly - no idea why), and seeking is possible but may or may not be frame accurate (which also makes me paranoid). FFinput is supposed to be better with frame accuracy, but seeking is a nightmare (you can go forward, but if you jump it will lock up Vdub). But it DOES correctly import YV12 AVC streams...

So I made a trade-off of rendering to lossless and then encoding. But given the amount of time it takes for both steps (minus profiling), I did a test, and even at a lower sustained FPS it's easier to just load NV with an Avisynth script, with the appropriate profiles, and encode directly with x264.

So interestingly enough, you can't always assume that rendering to a lossless file will get you the fastest encoding times, even though you would think freeing up more of the CPU would help the end-of-chain encoder!

So now I have a brand new workflow, it would seem!

Using Virtualdub x64 with ffinput driver, I open up the transport streams previously saved to my HDD. ->

Then I render them out to an AVI with UT Codec yuv420/bt.709 - which becomes insignificant in terms of time spent because I do it in batches. It's super fast at around 60 - 70 fps anyway, so a whole season (12 - 15 episodes) takes less than 2 hours. ->

Open those files and build profiles for them with Neatvideo, since Vdub can freely seek without issues (primary reason for the intermediate at this step) ->

Create & queue up Avisynth scripts in MeGUI, modify them to add stuff, and send them off to the encoding queue to feed the intermediate AVI through Avisynth -> Neatvideo -> x264. Wait a little while & prosper with nice clean video.
NVTeam
Posts: 2745
Joined: Thu Sep 01, 2005 4:12 pm

Post by NVTeam »

Zach wrote: differences in benchmark vs real world results.
We should not forget that Neat Video's built-in benchmark measures the speed of the filter itself, meaning this one component:
->[Neat Video]->

while the real world rendering in VD or any other host application involves the following sequence of components:
[Input Clip on Disk]->[Input Codec]->[Host Application]->[Neat Video]->[Any Other Filters]->[Output Codec]->[Output Clip on Disk]

Since rendering in VD involves many more components than Neat Video alone, such real world rendering is always slower than any single component from the above chain. That is normal and to be expected.

Yet using Neat Video's benchmark is useful to optimize that one specific component to the hardware it runs on. That is the purpose of Neat Video's benchmark.

Vlad
Zach
Posts: 38
Joined: Sat Jun 01, 2013 12:37 pm

Post by Zach »

I can understand this to a certain extent.

But at the same time, I still have a hard time "accepting" such low numbers compared to what NV reports.

It's interesting you bring up the filter chain. I can save out with no filters to a lossless file (I decided it was better to do this than to try profiling with FFinput, as I can just profile off a seekable AVI); so with no filters in the chain, I can save my source stream as fast as 60+ FPS.

So I know it's not a bottleneck with my hardware per se. But simply adding Neatvideo to that chain will again get me around 10 FPS max. I just don't see how a single filter can bring the chain to such a slow crawl, especially if the only thing after it is a YV12 conversion.
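
For a rough sense of scale, the two rates quoted here can be turned around to estimate how fast the added stage itself is running. This is a minimal sketch, assuming the stages run strictly one after another so that their per-frame times add up (a real host may overlap decoding, filtering and encoding, so treat the result as an approximation); `implied_stage_fps` is a hypothetical helper name, not something from Neat Video or Virtualdub.

```python
def implied_stage_fps(total_fps, rest_fps):
    """Back out the speed of one added stage from two measured rates.

    total_fps: rate of the whole chain with the extra stage included.
    rest_fps:  rate of the chain without that stage.
    Assumption: per-frame times add (no overlap between stages).
    """
    if total_fps <= 0 or rest_fps <= 0:
        raise ValueError("rates must be positive")
    if total_fps >= rest_fps:
        raise ValueError("adding a stage cannot make the chain faster")
    extra_seconds_per_frame = 1.0 / total_fps - 1.0 / rest_fps
    return 1.0 / extra_seconds_per_frame


# Figures quoted above: about 60 fps with no filters, about 10 fps with the filter.
print(f"{implied_stage_fps(10, 60):.1f} fps")
```

Under that assumption the added stage (the filter plus whatever conversions the host wraps around it) would be running at roughly 12 fps inside the chain, and the 60 fps passthrough accounts for only about a sixth of the time spent on each frame.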

Surely the colorspace conversion (YV12 -> RGB32 / Filter / RGB32 -> YV12) can't be putting that much of a hit on the system? Speaking of which, are there any plans to add support for YV12 or other planar formats? I mean, YV12 just kind of makes sense because it is the most popular consumer format.

I guess that's just how things are. Like I said, I'm not really dissatisfied; I've been used to slow render times for years, especially from Avisynth filters. So for the job it does, NV is pretty acceptable to me.
NVTeam
Posts: 2745
Joined: Thu Sep 01, 2005 4:12 pm

Post by NVTeam »

I have just done a quick test:
- running a small (720x576) clip in VD, applying the null transform and exporting to uncompressed file (the whole rendering workflow except Neat Video): 118 fps
- running Neat Video's Optimize (radius 5) gives the best result of 23.3 fps (the speed of Neat Video alone);

Calculating the theoretical overall speed of the whole rendering workflow with Neat Video included: 1/(1/118+1/23.3) = 19.5 fps

Running the actual render gives: 18.7 fps

The difference between the theoretical value of 19.5 fps and the actually measured value of 18.7 fps is not really large. It could be explained, for example, by those additional color space transformations done by VD. In any case, I do not see any significant discrepancy. The measurements provided by Optimize seem to be adequate.
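
The serial estimate used in this test can be sketched in a few lines of Python. This is a minimal sketch, assuming the stages run strictly one after another with no overlap, so the per-frame times simply add; the function name `combined_fps` is made up for illustration and is not part of Neat Video or VD.

```python
def combined_fps(stage_fps):
    """Estimate overall fps for stages that process each frame one after another.

    Assumption: no stage overlaps with another, so the time per frame is the
    sum of each stage's time per frame (1 / fps).
    """
    if not stage_fps or any(fps <= 0 for fps in stage_fps):
        raise ValueError("every stage needs a positive fps value")
    seconds_per_frame = sum(1.0 / fps for fps in stage_fps)
    return 1.0 / seconds_per_frame


# Figures from the test above: 118 fps without the filter, 23.3 fps filter alone.
estimate = combined_fps([118, 23.3])
measured = 18.7
print(f"estimated: {estimate:.1f} fps")
print(f"shortfall vs measured: {(estimate - measured) / estimate:.1%}")
```

With the numbers from this test the estimate comes out at roughly 19.5 fps, and the measured 18.7 fps falls a few percent short of it. Any extra stage, such as a color space conversion, can be added to the list as another fps value if its standalone speed is known.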

Regarding transformations between YV12 and RGB, they would have to be done anyway, because the filter needs some data in RGB.

Vlad
Zach
Posts: 38
Joined: Sat Jun 01, 2013 12:37 pm

Post by Zach »

I can only say that I look forward to the continued development of NeatVideo, and hope that the future will bring some speed improvements to high resolution content.