Hi,
I have a 1.5 GB image that I would like to scale down with Super interpolation on an Intel(R) Xeon(R) CPU E5-2460 v3 @ 2.66 GHz (2 processors, 32 cores) with 128 GB of memory.
With Intel 2019 and 2020, the more threads I use, the longer the Super interpolation scale-down takes. With ippiu8-5.2.dll (I don't know which Intel release it belongs to), performance improves as I add threads. The problem does not occur with Cubic interpolation, which scales as expected.
The results below show the time to scale down the 1.5 GB image with Super interpolation by a factor of 0.27 using different numbers of threads. Each case was run 3 times:
Using Intel 2019 and 2020:
Threads | Run 1 (ms) | Run 2 (ms) | Run 3 (ms) |
4 | 842 | 670 | 655 |
8 | 718 | 718 | 749 |
16 | 967 | 920 | 921 |
24 | 1201 | 1092 | 1170 |
Using the old version of Intel (ippiu8-5.2.dll):
Threads | Run 1 (ms) | Run 2 (ms) | Run 3 (ms) |
4 | 1092 | 1123 | 1092 |
8 | 577 | 562 | 562 |
16 | 375 | 375 | 374 |
24 | 249 | 249 | 265 |
Is there any way to get better results with more threads in Intel 2019 and 2020 for Resize with Super interpolation?
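Since each case above was run 3 times, one way to compare thread counts is by the median of the runs. The following is a minimal sketch in standard C++ of such a timing harness; the names `timeRunsMs` and `medianMs` are hypothetical and not taken from the original test code, and `work` stands for whatever resize call is being measured.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Runs `work` `runs` times and returns each wall-clock duration in milliseconds.
// `work` is a placeholder for the operation under test (e.g. one resize call).
std::vector<double> timeRunsMs(const std::function<void()>& work, int runs) {
    assert(runs > 0);
    std::vector<double> ms;
    ms.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return ms;
}

// Median of a non-empty sample; less sensitive to a single slow run than the mean.
double medianMs(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    return (n % 2 == 1) ? samples[n / 2]
                        : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}
```

For the 4-thread runs reported with Intel 2019 and 2020 (842, 670, 655 ms), `medianMs` gives 670 ms, which discounts the slower first run.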
Hi Vlad,
I tested your modification (which splits the image into parts) on different image sizes and found something odd.
The table below shows the processing time in milliseconds for 1, 4, 8, 16, and 24 threads on two images with different numbers of rows.
I expected the smaller image to be processed faster than the bigger one, but the results are the opposite.
What could be the reason, and how can it be solved?
Image size | 1 thread | 4 threads | 8 threads | 16 threads | 24 threads |
1.62 GB (30720x18924) | 656 | 188 | 94 | 93 | 109 |
1.09 GB (30720x12682) | 500 | 234 | 188 | 218 | 360 |
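To make the difference in scaling explicit, the timings in the table can be turned into speedup and parallel efficiency. This is a small sketch in standard C++; `speedup` and `efficiency` are hypothetical helper names, and the single-thread column is taken as the baseline.

```cpp
#include <cassert>

// Speedup of an n-thread run relative to the single-thread run.
double speedup(double singleThreadMs, double multiThreadMs) {
    assert(singleThreadMs > 0.0 && multiThreadMs > 0.0);
    return singleThreadMs / multiThreadMs;
}

// Parallel efficiency: speedup divided by the thread count (1.0 is ideal scaling).
double efficiency(double singleThreadMs, double multiThreadMs, int threads) {
    assert(threads > 0);
    return speedup(singleThreadMs, multiThreadMs) / threads;
}
```

With the numbers above, at 8 threads the larger image reaches a speedup of about 7.0 (656 / 94) while the smaller image reaches only about 2.7 (500 / 188), and at 24 threads the smaller image drops to about 1.4 (500 / 360).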
Hi Dudi,
I'm looking into this case. Will notify you when results are available.
Best regards,
Vlad V.
Hi Vlad,
Did you manage to reproduce the problem?
Hi Dudi,
Unfortunately, I haven't reproduced your issue yet. In my environment, the smaller image is processed slightly faster than the bigger one with the new parallelization scheme, although I did see the same behavior (bigger is faster) with the initial scheme. Could you please double-check that both images were processed with the same parameters under the new scheme?
Best regards,
Vlad V.
Hi Vlad,
All tests ran your CImage::ResizeMod(...) code, and I always get faster performance for the bigger image. Could you connect to my computer through TeamViewer so I can show you the problem? Or where can I upload the images so that you can test them?
Dudi,
If you would like official support and you have a valid license, you may submit the issue to the official Intel Support Center:
https://supporttickets.intel.com/servicecenter?lang=en-US
There you may open tickets and upload all your images. If I am not mistaken, this system has a 2 GB size limit.
Alternatively, you may upload your images to an external service (Dropbox, for example) and share the link.
Hi Vlad,
Please see the shared link to download the 2 images that I tested.
The bigger one is scaled down faster than the smaller one. I expected the opposite.
Hi Dudi,
Unfortunately, the link you shared points to a private SharePoint site that requires authentication, and we have no access to it.
Best regards,
Vlad V.
Vlad,
Could you please give me your e-mail address so that I can send the images via WeTransfer?
Dudi,
It is highly recommended to transfer data through Intel Support Center https://supporttickets.intel.com/servicecenter?lang=en-US or through open channels that don't require sharing of confidential information.
Best regards,
Vlad V.
Even after signing in, I don't understand how to transfer data through the Intel Support Center...
Request Support -> check "A product or service I already own or use" + "Search for a product or service by name" -> type "Integrated" and choose Integrated Performance Primitives -> answer the questions and briefly describe the issue, then push the "Next: Details" button -> fill in the form and "Submit Request". After submitting the request, you'll be able to upload large files of up to 2 GB.
Hi Vlad,
I uploaded the 2 images as you advised.
You can find them under support request number 04986331.
Dudi,
Could you please try to upload these images once again? We see no images uploaded to that Online Service Center thread.
Hi Vlad,
I uploaded the images again. You can find them under support request number 04986331.
Hi Dudi,
I was able to reproduce the behavior in which the big image is processed faster than the small one, using your images of sizes 30720x18284 and 30720x12682. I'm now looking into what causes it.
Best regards,
Vlad V.
Hi Vlad,
Thanks for the information. Reproducing the problem is 50% of the solution.
Regards,
Dudi
Hi Vlad,
I hope you were able to find the reason for the slowdown.
Do you have any update?
Hi Dudi,
Unfortunately, the reason is still not clear. In your example, the smaller picture incurs many more L3 cache misses within the 'parallel_for' loop, and I'm investigating what causes this.
Best regards,
Vlad V.
Hi Vlad,
Do you have a solution in mind for this issue?
Hi Dudi,
Yes, the problem is again the size of the additional buffer. Because of how the number of rows to process at a time is calculated, the smaller image requires more memory, so threads compete for the L3 cache as the thread count grows. A possible solution is to split the image not only by rows but also by columns, which reduces the memory load for processing each piece of the picture. I'm now testing this on your example code and will notify you when the results are ready.
Best regards,
Vlad V.
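The idea of splitting by both rows and columns can be illustrated with a tile grid. The sketch below is standard C++ only and is not the code being tested in this thread: `Tile` and `makeTiles` are hypothetical names, the tile dimensions are free parameters, and the IPP resize calls are deliberately left out. Each tile would be handed to one iteration of a parallel loop, so the per-thread working set is bounded by the tile size rather than by the full image width.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One rectangular piece of the destination image, in pixels.
struct Tile {
    int x;
    int y;
    int width;
    int height;
};

// Splits an imageWidth x imageHeight area into tiles of at most
// tileWidth x tileHeight. Tiles on the right and bottom edges are clipped,
// so the tiles cover the image exactly once without overlap.
std::vector<Tile> makeTiles(int imageWidth, int imageHeight,
                            int tileWidth, int tileHeight) {
    assert(imageWidth > 0 && imageHeight > 0);
    assert(tileWidth > 0 && tileHeight > 0);
    std::vector<Tile> tiles;
    for (int y = 0; y < imageHeight; y += tileHeight) {
        const int h = std::min(tileHeight, imageHeight - y);
        for (int x = 0; x < imageWidth; x += tileWidth) {
            const int w = std::min(tileWidth, imageWidth - x);
            tiles.push_back(Tile{x, y, w, h});
        }
    }
    return tiles;
}
```

For example, `makeTiles(1000, 700, 256, 256)` yields a 4 x 3 grid of 12 tiles, the last of which is clipped to 232 x 188 pixels. Choosing the tile size so that the source region and work buffer for one tile fit comfortably in each thread's share of the cache is the tuning knob; the right values are an assumption that would need to be measured.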