Hello.
I've recently found a big performance drop when using NV12 video output with the EVR/EVR-CP renderers on Intel GPUs.
I ran some performance tests with DXVA Checker v3.9.0 on a Win 10 x64 system with a Core i7-4790 and Intel's HD 4600 iGPU, using the latest drivers (v.4279).
As the video decoder I used LAV Filters 0.66.31, which is the default video decoder for many video players, such as MPC-HC.
The test clip is here, a small video sample from YouTube:
https://www.sendspace.com/file/o0oe5t
It's a 4K 60 fps VP9 clip.
Results:
Decode - Renderless mode (no renderer)
Video output: min fps / average fps / max fps, CPU usage
NV12/YV12: 83 fps / 112 fps / 175 fps, CPU usage 72%
Both NV12 and YV12 have the same performance.
Playback - Vanilla EVR (scaled to 1280 x 720)
Video output: min fps / average fps / max fps, CPU usage
YV12: 79 fps / 100 fps / 139 fps, CPU usage 73%
NV12: 65 fps / 74 fps / 86 fps, CPU usage 52%
The numbers are the min fps / average fps / max fps over the whole run.
It is clear that in renderless mode, i.e. pure decoding, NV12 and YV12 have exactly the same performance.
BUT during real playback with the EVR renderer, NV12 is clearly slower than YV12, and its CPU usage is lower as well.
There is a bottleneck somewhere in the driver that prevents the GPU and CPU from reaching maximum speed.
And because NV12 is the default, high-quality video output format for many video decoders such as LAV Filters, and the preferred output even for hardware acceleration through Microsoft's DXVA, I think it is urgent to do something about it and improve the performance of that output format.
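To put a number on the gap, here is a minimal Python sketch that computes the relative drop from the average fps figures of the EVR playback test above. The function name `relative_drop` is only illustrative and not part of any tool used in the test.

```python
# Average fps figures copied from the EVR playback results above
# (Core i7-4790, HD 4600, scaled to 1280 x 720).
yv12_avg_fps = 100
nv12_avg_fps = 74


def relative_drop(baseline, measured):
    """Return how far `measured` falls below `baseline`, as a percentage."""
    return (baseline - measured) / baseline * 100


drop = relative_drop(yv12_avg_fps, nv12_avg_fps)
print(f"NV12 average fps is {drop:.0f}% below YV12")
```

With these figures the average throughput with NV12 comes out about a quarter lower than with YV12.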
Waiting for your feedback.
I would just like to mention that I previously confirmed this performance issue for NikosD using the iGPU of my own Pentium G3258 (overclocked to 4.6 GHz) on Windows 7 64-bit.
For me it was a difference of 42 fps vs 56 fps (both at 100% CPU utilization); as someone who watches a great deal of 50 fps content (yes, 50, not a typo), that is the difference between perfectly fine and completely unacceptable performance.
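Why 42 fps vs 56 fps matters for 50 fps content can be sketched in a few lines of Python. This is a simplified check that assumes playback keeps up only when the measured throughput is at least the content frame rate, and it ignores buffering; the helper names are hypothetical.

```python
def frame_time_ms(fps):
    """Time per frame at a given frame rate, in milliseconds."""
    return 1000.0 / fps


def keeps_up(measured_fps, content_fps):
    """True if the measured throughput is at least the content frame rate."""
    return measured_fps >= content_fps


content_fps = 50       # frame rate of the content mentioned above
measured = [42, 56]    # the two throughput figures reported above

for fps in measured:
    verdict = "keeps up" if keeps_up(fps, content_fps) else "falls behind"
    print(f"{fps} fps = {frame_time_ms(fps):.1f} ms per frame "
          f"(budget {frame_time_ms(content_fps):.1f} ms): {verdict}")
```

At 50 fps each frame has a 20 ms budget, so 56 fps leaves some headroom while 42 fps does not.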
UPDATE: I can also confirm that, on my old Intel 965GMA integrated graphics, the performance of NV12 matches the performance of YV12.
UPDATE 1:
I tried an ancient laptop with Win 10 x64 and a 945GM (GMA 950) iGPU, and the results were the same as on the 965:
same performance for NV12 and YV12.
UPDATE 2:
I tried my SandyBridge system: Win 10 x64, Core i5 2400, iGPU HD 2000 (driver v4229) and a Radeon 5750 (Catalyst 15.10).
The system has two graphics cards, so I tested both with the 4K VP9 clip above.
Decode (renderless)
The performance is the same using Intel or ATI and exactly the same using NV12/YV12.
In all cases the average fps is 84fps.
Playback (vanilla EVR renderer, scaled to 1280x720)
Radeon 5750
NV12/YUY2: same performance as in decode (renderless) mode, ~84 fps
Intel HD 2000 (min fps / average fps / max fps)
YV12: 50 / 60 / 66, CPU usage 69%
NV12: 42 / 51 / 64, CPU usage 86%
Even though NV12 on the SandyBridge HD 2000 uses a lot more CPU than YV12, it still doesn't match YV12's performance.
From the above tests it is obvious that older Intel GPUs (like the 945 and 965) don't suffer a performance hit with NV12, but from SandyBridge onwards the performance hit is clear when NV12 is used instead of YV12.
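For the HD 2000 figures, a small Python sketch can split the min/average/max triples and show how far NV12 trails YV12 on each statistic. The `parse_triple` helper is hypothetical and written only for this illustration.

```python
def parse_triple(text):
    """Split a 'min/avg/max' string such as '50/60/66' into a dict of floats."""
    low, avg, high = (float(part) for part in text.split("/"))
    return {"min": low, "avg": avg, "max": high}


# Figures copied from the Intel HD 2000 EVR playback results above.
yv12 = parse_triple("50/60/66")
nv12 = parse_triple("42/51/64")

# Percentage by which NV12 trails YV12 for each statistic.
drops = {key: (yv12[key] - nv12[key]) / yv12[key] * 100 for key in yv12}

for key, value in drops.items():
    print(f"{key}: NV12 is {value:.1f}% below YV12")
```

On these numbers the gap is largest for the minimum and average fps and nearly disappears for the maximum.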
Still, waiting for your feedback.
Hello NikosD,
Thank you for sharing your testing results.
Regards,
Amy.
I can confirm.
Hi Nikos,
I'll get the investigation started on this one. I'll be in touch as needed for updates/questions. Thanks for raising this.
.:Bryce:.
Bug submitted and in queue for investigation. Will follow up as updates come. Thanks.
Bug# 10003541
Thank you very much for your prompt actions!
Hope we have good news soon.
Correction...
Update:
Still work-in-progress on debugging root cause. Stay tuned...
Hello.
I have a few indications that the problem has been fixed somehow, but not completely.
For Haswell owners, the latest drivers for Win 7/8/8.1 and Win 10 seem to give the same performance with NV12 and YV12.
I think the same goes for Broadwell and Skylake, but I don't know about Ivy Bridge and SandyBridge, which I think don't have updated drivers.
BUT even on my Haswell with the latest 4326 driver for Win 10, I've seen some "ugly" scaling algorithms with NV12, like those of YV12.
It seems that the new drivers somehow "cheat" on quality (scaling algorithms and other processing) in order to match the performance of YV12, which already had lower quality, and that was possibly why it was faster.
We, as users, want both: speed and the previous quality when using NV12.
I hope this helps with further investigation of the issue.