I have a 1.5 GB image that I would like to scale down using Super interpolation on an Intel(R) Xeon(R) CPU E5-2460 v3 @ 2.66 GHz (2 processors, 32 cores) with 128 GB of memory.
With Intel IPP 2019 and 2020, the more threads I use, the longer the Super-interpolation scale-down takes. When I test with ippiu8-5.2.dll (I don't know which Intel release it belongs to), performance improves as I add threads. The problem does not occur with Cubic interpolation, which works as expected.
The results below show the time to scale down the 1.5 GB image by a factor of 0.27 with Super interpolation, using different numbers of threads. Each case was run 3 times:
Using Intel 2019 and 2020 (times in ms):
| Threads | Run 1 | Run 2 | Run 3 |
|---|---|---|---|
| 4 | 842 | 670 | 655 |
| 8 | 718 | 718 | 749 |
| 16 | 967 | 920 | 921 |
| 24 | 1201 | 1092 | 1170 |
Using the old version of Intel (ippiu8-5.2.dll) (times in ms):
| Threads | Run 1 | Run 2 | Run 3 |
|---|---|---|---|
| 4 | 1092 | 1123 | 1092 |
| 8 | 577 | 562 | 562 |
| 16 | 375 | 375 | 374 |
| 24 | 249 | 249 | 265 |
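To make the scaling in the timings above easier to compare, here is a minimal sketch that takes the median of the three runs for each thread count and computes the speedup relative to the 4-thread case. The names `median3`, `speedup`, `TimingRow`, and `print_scaling` are illustrative helpers, not part of Intel IPP, and the choice of the median as the summary statistic is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdio>

// Median of three timings in ms. The array is taken by value, so sorting
// it here does not modify the caller's copy.
double median3(std::array<double, 3> t) {
    std::sort(t.begin(), t.end());
    return t[1];
}

// Speedup relative to a baseline: above 1 is faster, below 1 is slower.
double speedup(double baseline_ms, double run_ms) {
    assert(run_ms > 0.0);
    return baseline_ms / run_ms;
}

// One row of the timing data: thread count plus three measured runs.
struct TimingRow {
    int threads;
    std::array<double, 3> ms;
};

// Prints the median of every row and its speedup relative to the first row.
void print_scaling(const char* label, const std::array<TimingRow, 4>& rows) {
    const double base = median3(rows[0].ms);
    std::printf("%s\n", label);
    for (const TimingRow& r : rows) {
        const double m = median3(r.ms);
        std::printf("  threads=%2d  median=%5.0f ms  speedup=%.2f\n",
                    r.threads, m, speedup(base, m));
    }
}
```

Applied to the 4-thread and 24-thread measurements above, the medians give a speedup of roughly 0.57 for the 2019/2020 libraries (that is, a slowdown) and roughly 4.39 for the old DLL.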
Is there any way to get better results with more threads in Intel 2019 and 2020 for Resize with Super interpolation?
Thanks for your efforts.
It seems much better and acceptable.
It seems that there is not much difference in processing time when the number of threads is greater than 8 (see table below). I also added a test for 1 thread.
Nevertheless, it seems fine. I will run more tests and let you know if I find anything odd.
I tested your modification (which splits the image into parts) on a different image size and found something odd.
The table below shows the processing time in milliseconds for 1, 4, 8, 16, and 24 threads on two images with different numbers of rows.
I expected the smaller image to be processed faster than the bigger one, but the results are the opposite.
What could be the reason, and how can it be solved?
| Image size | 1 thread | 4 threads | 8 threads | 16 threads | 24 threads |
|---|---|---|---|---|---|
| 1.62 GB (30720x18924) | 656 | 188 | 94 | 93 | 109 |
| 1.09 GB (30720x12682) | 500 | 234 | 188 | 218 | 360 |
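As a quick sanity check on the figures above, the sketch below recomputes the raw buffer sizes and compares the amount of work with the measured times. It assumes 3 bytes per pixel and a tightly packed buffer; the pixel format is not stated in this thread, but that assumption reproduces the quoted 1.62 GB and 1.09 GB when sizes are counted in units of 1024^3 bytes. All function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Raw size in bytes of a tightly packed image (no row padding assumed).
std::uint64_t image_bytes(std::uint64_t width, std::uint64_t height,
                          std::uint64_t bytes_per_pixel) {
    return width * height * bytes_per_pixel;
}

// Converts bytes to units of 1024^3 bytes.
double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}

// Ratio of two quantities (big / small). Used both for the amount of work
// (row counts) and for the measured times, so the two can be compared.
double ratio(double big, double small) {
    assert(small > 0.0);
    return big / small;
}
```

With equal widths, the bigger image has about 1.49 times as many rows (18924 / 12682), so one would expect it to take longer. At 8 threads, however, the measured time ratio is 94 / 188 = 0.5, which is the anomaly reported here.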
Unfortunately, I haven't reproduced your issue yet. In my environment, the smaller image is processed slightly faster than the bigger one with the new parallelization scheme, but I saw the same behavior (bigger is faster) with the initial scheme. Could you please double-check that both images were processed with the same parameters under the new scheme?
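For readers following along: the modified code itself is not shown in this thread, so the following is only a sketch of the general idea of splitting an image into parts, here as contiguous bands of rows with one band per thread. `RowBand` and `split_rows` are hypothetical names, and the even split is an assumption about how the work might be divided.

```cpp
#include <cassert>
#include <vector>

// A contiguous band of image rows: [first, first + count).
struct RowBand {
    int first;
    int count;
};

// Splits total_rows into at most `parts` contiguous bands whose sizes
// differ by no more than one row. Bands that would be empty are omitted.
std::vector<RowBand> split_rows(int total_rows, int parts) {
    assert(total_rows >= 0 && parts > 0);
    std::vector<RowBand> bands;
    const int base = total_rows / parts;   // rows every band gets
    const int extra = total_rows % parts;  // first `extra` bands get one more
    int first = 0;
    for (int i = 0; i < parts; ++i) {
        const int count = base + (i < extra ? 1 : 0);
        if (count > 0) {
            bands.push_back({first, count});
        }
        first += count;
    }
    return bands;
}
```

Each band would then be handed to one worker thread. A real tiled resize also has to map every destination band back to the source rows it depends on, which this sketch leaves out.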
All tests ran your CImage::ResizeMod(...) code. I always get faster performance for the bigger image. Is there any way for you to connect to my computer via TeamViewer so that I can show you the problem? Or where can I upload the images so that you can test them?
If you would like official support and you have a valid license, you may submit the issue to the official Intel Support Center.
There you may open tickets and upload all your images. If I am not mistaken, this system has a 2 GB size limit.
Or you may upload your images to an external service (Dropbox, for example) and share the link.
See the shared link to download the 2 images that I tested.
The scale-down of the bigger one is faster than that of the smaller one. I expected the opposite.
It is highly recommended to transfer data through the Intel Support Center https://supporttickets.intel.com/servicecenter?lang=en-US or through open channels that don't require sharing confidential information.
Request Support -> Check "A product or service I already own or use" + "Search for a product or service by name" -> type "Integrated" and choose Integrated Performance Primitives -> Answer the questions and briefly describe your question, then push the "Next: Details" button -> Fill in the form and click "Submit Request". After submitting the request, you will be able to upload large files of up to 2 GB.
I was able to reproduce the behavior you describe, where the big image is processed faster than the small one, using your images of sizes 30720x18284 and 30720x12682. I am now looking into what could cause this behavior.