Quick Sync performance on P580


I use Supermicro hardware for live transcoding (with Quick Sync acceleration). In my test environment I use an Intel Xeon E3-1275 v5 with a P530 GPU (MBI-6119G-C2), and for production an Intel Xeon E3-1275 v5 with a P580 (MBI-6119G-T8HX) is planned.

During my tests I found that the P530 performs better than the P580.

I use the latest ffmpeg v4.1 for Windows with a simple command:

ffmpeg.exe -i input.mp4 -c:v h264_qsv -vf "scale=640x360" output.mp4 

On the P530 this encodes at 500 fps (ffmpeg reports speed 20x).

On the P580 this encodes at 160 fps (ffmpeg reports speed 6x).

When scaling is not configured, I get the same speed on the P530, while the P580 does somewhat better at 300 fps (ffmpeg reports speed 12x), which still falls behind the P530.
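
To put the gap in numbers, here is a small Python sketch that uses only the fps figures reported above. The dictionary layout and the `relative` helper are made up for illustration and are not part of ffmpeg or any Intel tool.

```python
# Throughput figures as reported in this post (frames per second).
# "scale" is the command with -vf "scale=640x360"; "no scale" is without it.
fps = {
    ("P530", "scale"): 500,
    ("P580", "scale"): 160,
    ("P530", "no scale"): 500,
    ("P580", "no scale"): 300,
}


def relative(case):
    """Return P580 throughput as a fraction of P530 throughput for one case."""
    return fps[("P580", case)] / fps[("P530", case)]


for case in ("scale", "no scale"):
    print(f"{case}: P580 runs at {relative(case):.0%} of P530")
```

So the P580 reaches roughly a third of the P530's throughput with scaling enabled and a little over half without it.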

OS: Windows Server 2016 (patch levels on both machines are identical)
Video driver version: the same on both (also tried the latest)

I checked the BIOS settings: they are more or less identical.

Any ideas why I’m getting these kinds of results?

