We recently noticed a common forum topic regarding the ippiResize* functions. I would like to share the following information with forum participants; please reply to this thread if you have any additional comments.
First, starting with Intel IPP v6.0, several APIs are deprecated, including ippiResize(); please see this article in the Intel IPP Knowledge Base for more details.
Second, there is a known issue with the new image resize function ippiResizeSqrPixel() in the current Intel IPP v6.0; please see the article "Resize function ippiResizeSqrPixel() crashed for small image" for reference. It also includes a C code sample showing ippiResizeSqrPixel() usage.
Additionally, take care with the [src/dst]Step parameters. Unexpected errors may come from a wrong step value. For example, the step in bytes is not always equal to channels*width.
It may be ((nChannel*srcWidth+3)>>2)<<2 for a BMP image with 4-byte-aligned rows, or a multiple of 32 when ippiMalloc is used.
Best Regards,
Ying
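The step calculation in the post above can be sketched in plain C with no IPP dependency. The helper names align_up and row_step_bytes are hypothetical (they are not IPP functions), and the alignment value is whatever the image source guarantees, for example 4 for BMP rows:

```c
#include <assert.h>
#include <stddef.h>

/* Round a row length in bytes up to the next multiple of `alignment`.
   `alignment` must be a power of two (e.g. 4 for BMP rows). */
size_t align_up(size_t bytes, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (bytes + alignment - 1) & ~(alignment - 1);
}

/* Step (stride) in bytes of one image row.
   width:            pixels per row
   channels:         samples per pixel
   bytes_per_sample: 1 for 8-bit data, 2 for 16-bit data */
size_t row_step_bytes(size_t width, size_t channels,
                      size_t bytes_per_sample, size_t alignment)
{
    return align_up(width * channels * bytes_per_sample, alignment);
}
```

For an 8-bit, 3-channel row of 7 pixels with 4-byte alignment, the 21 payload bytes round up to a step of 24, which matches ((nChannel*srcWidth+3)>>2)<<2.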
The deprecated ippiResize did not require one; did it allocate internally, or was it more efficient in this regard?
Each language (C++, C#, Delphi, and others) and each platform has its own memory manager. With a function that takes an external buffer (ippiResizeSqrPixel), the user controls memory allocation in the application itself. Such functions are therefore preferable to the old ippiResize, which allocates and releases memory internally.
Thanks,
Beg
I am new to the IPP community. I was using OpenCV for image-processing software development, but because of its 8-bit limitation (my image data is 16-bit), I started trying IPP last week with the trial version. I am using the function
ippiFilterMedian_16u_C1R for median filtering.
Regarding your suggestion,
"Additionally, take care with the [src/dst]Step parameters. Unexpected errors may come from a wrong step value. For example, the step in bytes is not always equal to channels*width.
It may be ((nChannel*srcWidth+3)>>2)<<2 for a BMP image with 4-byte-aligned rows, or a multiple of 32 when ippiMalloc is used.",
I have a question about my 16-bit single-channel data, which is set in my program as follows:
int dstStep = dstWidth * 2; // 1 WORD = 2 bytes, where dstWidth = srcWidth - (nKernelWidth - 1)
Am I doing this correctly? Please advise.
The function builds without errors and is working fine.
Thanks in advance.
John
Hi John,
Could you tell us how you create, load, or store your dst image data?
For example, if you use the OpenCV API cvCreateImage() to create a dst image,
int dstWidth = 7;
int dstHeight = 1;
CvSize tempSize = {dstWidth, dstHeight};
dst = cvCreateImage(tempSize, 16, 1);
then each row of dst is 4-byte aligned, so dst->widthStep = 16, not 7x2 = 14. Using dst->widthStep as the step is the safe choice here.
If you are using malloc(),
dst = (short *) malloc(dstWidth*dstHeight*sizeof(short));
then dstStep is dstWidth*2.
Best Regards,
Ying
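To illustrate why the step must be given in bytes rather than in pixels, here is a minimal plain-C sketch of addressing rows in a 16-bit single-channel image. The function names row_ptr_16u and fill_16u are hypothetical helpers, not IPP or OpenCV calls:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pointer to the start of row y in a 16-bit single-channel image.
   step_bytes is the distance between consecutive rows in BYTES,
   so the offset is applied to a byte pointer before casting back. */
uint16_t *row_ptr_16u(uint16_t *base, size_t step_bytes, size_t y)
{
    assert(step_bytes % sizeof(uint16_t) == 0);
    return (uint16_t *)((unsigned char *)base + y * step_bytes);
}

/* Set every pixel of a width x height region, leaving row padding untouched. */
void fill_16u(uint16_t *base, size_t step_bytes,
              size_t width, size_t height, uint16_t value)
{
    assert(step_bytes >= width * sizeof(uint16_t));
    for (size_t y = 0; y < height; y++) {
        uint16_t *row = row_ptr_16u(base, step_bytes, y);
        for (size_t x = 0; x < width; x++)
            row[x] = value;
    }
}
```

With width 7 and a step of 16 bytes (the cvCreateImage case above), each row holds 7 pixels plus 2 bytes of padding; with a tightly packed malloc() buffer the step would be 14 bytes and there is no padding.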
We have noticed that calling ippiResize or ippiResizeCenter repeatedly in sequence, with a small delay (< 100 ms) between calls, uses less CPU than ippiResizeSqrPixel. The comparison is based on resizing a 1280x1024 image to 877x602. We have tried both the single-threaded and the parallel algorithm described in the documentation, but CPU usage is still high with both. We cannot figure out what is wrong with the ippiResizeSqrPixel function.
Can anyone please shed light on this issue?
Hello,
One quick thought: the delay may be related to multi-threading (thread overhead). Could you please attach a small test program (especially the call order and the time measurement) so we can reproduce the problem?
Thanks
Ying
Gladly! ;-) Please find the 2 attached C++ source files. I am using OpenCV 1.0 to load the image into a buffer and run the various resize functions in the IPPLibTester class. I am using the IPP 6.1.1.035 library to run this example.
Regards,
I2R D&T Team
Did you manage to resolve the issue I raised?
Regards,
I2R D&T Team
Hello I2R D&T Team,
Sorry, I did not notice your reply until now. Could you please give me more information, such as:
1. The IPP version and the libraries you are linking against, your OS, and your CPU type.
2. How you made your observation: Task Manager CPU usage, or a time-measurement function?
Generally, though, high performance is expected to go along with high CPU usage. What is your concern about ippiResizeSqrPixel: does it take more time than ippiResize, or is the CPU usage high?
Additionally, about the parallel code: if you are using IPP 6.1 with the dynamic libraries, ippiResizeSqrPixel is threaded internally with OpenMP, so in most cases you don't need to write OpenMP parallel code to wrap it again. IPP will disable its internal threading if it detects an external parallel region. Is there a reason you create OpenMP threads yourself?
Best Regards,
Ying
Hi Ying Hu,
Thank you for your reply. I thought my concern would end with no answer here.
The program I am working on uses IPP 6.1.1.035 and also links to OpenCV 1.0 for image retrieval. The OS is Windows XP 32-bit, running on an Intel Xeon E5430 at 2.66 GHz with 3 GB of memory. I observe the high CPU usage in Task Manager for the process running the example. The time measurements for the resize functions are calculated in the program.
My concern is the unusually high CPU usage of the ippiResizeSqrPixel function, which the ippiResize function does not produce. I tested both an OpenMP and a non-OpenMP version to check whether ippiResizeSqrPixel exhibits high CPU usage, and indeed it does.
I understand that ippiResizeSqrPixel replaces ippiResize because it offers a wider variety of resize techniques in one function. However, if ippiResizeSqrPixel exhibits high CPU usage, it is neither attractive nor practical to use. Image resizing is handy in many applications we are developing, but it is just one part of the whole set of operations running in parallel. High CPU usage for a single operation significantly slows down the other parallel operations in our design. In this case, we have to switch back to the deprecated ippiResize function.
As long as ippiResize is still supported in future IPP releases, this issue is just a concern. Do you understand my point?
Regards,
I2R D&T Team
As Ying mentioned before, high CPU usage usually corresponds to better performance. If you count the number of CPU clocks spent in ippiResize and ippiResizeSqrPixel, you may notice that the second function takes fewer clocks to do the same work. That leaves more time for other processing you may do in parallel.
Note: if you do threading on top of IPP (i.e. the application has several threads calling IPP functions in parallel), it may make sense to disable IPP internal threading with an ippSetNumThreads(1) call to avoid thread oversubscription.
Regards,
Vladimir
Hi Vladimir,
Thank you for re-emphasizing that high performance usually corresponds to high CPU usage.
If I may ask, which software did you use to get the number of CPU clocks executed by a specific function?
Regards,
I2R D&T Team
Hi I2R D&T Team,
We usually use the IPP function ippGetCpuClocks() to get CPU clocks.
For example,
Ipp64u start = ippGetCpuClocks();
for (int k = 0; k < 1000; k++)
{
    //ipplib->map_deprecatedCenter((byte*)img->imageData, img->widthStep, szSrc, destImg, nDestStride, szDest);
    ipplib->map_serial((byte*)img->imageData, img->widthStep, szSrc, destImg, nDestStride, szDest);
    //Sleep(99);
}
Ipp64u end = ippGetCpuClocks();
// clocks per output element (3 channels, 1000 iterations)
double ippCPE = double(end - start) / (szDest.width * szDest.height * 3 * 1000.);
double ippClock = double(end - start) / (double(pMhz) * 1000. * 100.0);
printf(" IPP cpe = %g\n", ippCPE);
printf(" IPP time = %gms\n", ippClock);
I ran a test on my laptop with your test program.
Core 2 Duo T7300, 2.0 GHz, 2 cores, 1.96 GB RAM, linking the 32-bit dynamic libraries (ippi.lib, ippcore.lib)
ippiv8-6.1.dll 6.1 build 137.20 6.1.137.809
map_deprecatedCenter:
IPP cpe = 42.6039
IPP time = 338.239ms
calculating time = 338.413ms
CPU usage = 50%, takes one full core
map_serial:
IPP cpe = 32.0714
IPP time = 254.62ms
calculating time = 254.752ms
CPU usage = 100%, takes both cores fully
As we mentioned, ippiResizeSqrPixel is threaded internally with OpenMP; it will automatically start 2 threads on a 2-core machine. You don't need to write threading code to call the function. If you'd like to reduce CPU usage and use only one thread, you can call ippSetNumThreads(1) before calling ippiResizeSqrPixel().
There are some errors in map_deprecatedCenter; I made a small modification and attach the modified code here for your reference.
(For the conditions to be the same, the center coordinates must be specified as the middle of the dst ROI;
otherwise only part of the image is processed. I changed them to
x = (int)(szDest.width / 2.0);
y = (int)(szDest.height / 2.0);)
Best Regards,
Ying
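The clocks-per-element figure used in the measurements above can be sketched without any IPP dependency, given a clock count obtained elsewhere (for instance from ippGetCpuClocks()). The helper names below are hypothetical. The millisecond conversion is the generic one (1 MHz = 1000 clocks per millisecond) applied to the total clock count; it is not claimed to reproduce the scaling used in the posted snippet:

```c
#include <assert.h>
#include <stdint.h>

/* Average CPU clocks spent per output element (pixel x channel),
   given the total clock count for `iterations` repeated calls. */
double clocks_per_element(uint64_t clocks, int width, int height,
                          int channels, int iterations)
{
    assert(width > 0 && height > 0 && channels > 0 && iterations > 0);
    double elements = (double)width * height * channels * iterations;
    return (double)clocks / elements;
}

/* Total elapsed time in milliseconds for `clocks` CPU clocks on a CPU
   running at cpu_mhz megahertz (1 MHz = 1000 clocks per millisecond). */
double clocks_to_ms(uint64_t clocks, double cpu_mhz)
{
    assert(cpu_mhz > 0.0);
    return (double)clocks / (cpu_mhz * 1000.0);
}
```

Comparing clocks per element between two functions on the same image size gives a measure of work done that is independent of how many cores Task Manager shows as busy.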
We would be using the non-threaded version because of higher-level threading in our software. What would the performance numbers be for ippiResizeSqrPixel ('map_serial') when using just one thread (maxing out one core)?
In my view, and for our usage pattern, it is still essential that new functions replacing older ones perform at least comparably to (preferably better than) the functions they replace in single-threaded use. Multi-threading should be an option (as it is) that enables higher performance in the right use cases.
Thanks.
- Jay
Thank you for sharing your views on the issue. Now I have a better idea of how to compare similar IPP functions.
I share Jay's view. This means manually setting the number of CPU cores to use in order to get better overall performance.
Regards,
I2R D&T Team
Right, you can set the number of IPP threads manually based on how your application runs. For example, on a 4-core machine, call ippSetNumThreads(2) and leave 2 cores for your other work.
In serial mode (ippSetNumThreads(1)), the ippiResizeSqrPixel function is comparable in performance to the deprecated functions.
For example,
ippiv8-6.1.dll 6.1 build 137.20 6.1.137.809
ippiResizeCenter:
IPP cpe = 42.1132
IPP time = 334.343ms
ippiResize:
IPP cpe = 62.7621
IPP time = 498.279ms
ippiResizeSqrPixel:
IPP cpe = 54.6182
IPP time = 433.622ms
ippiResizeSqrPixel should correspond to ippiResize (shift). If the processed region is the same, then compared with ippiResizeCenter the calculations are almost the same.
Regards,
Ying
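The "leave some cores for other work" advice above amounts to a small calculation. A minimal sketch follows; internal_thread_budget is a hypothetical helper name, not part of IPP, and the only IPP call it is meant to feed is ippSetNumThreads(), which is mentioned in this thread:

```c
#include <assert.h>

/* Hypothetical helper: number of threads to give to a library's internal
   threading when `reserved_cores` cores are kept for the application's
   other work. Never returns less than 1 (serial mode). */
int internal_thread_budget(int total_cores, int reserved_cores)
{
    assert(total_cores >= 1 && reserved_cores >= 0);
    int budget = total_cores - reserved_cores;
    return budget < 1 ? 1 : budget;
}
```

On a 4-core machine with 2 cores reserved, this returns 2, and the result would be passed on as ippSetNumThreads(internal_thread_budget(4, 2)). If the application already runs one worker thread per core, reserving all cores yields 1, which corresponds to the ippSetNumThreads(1) setting suggested earlier.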
Hi Mr. Ying,
As far as I understand, a single stepBytes parameter, as the function has now, cannot work for images like YUV (YCbCr), where luma and chroma have different widths.
I suspect this is the reason I get a crash when I use ippiResizeSqrPixel().
Thanks
Gilad
Hi Gilad,
Right, the current ippiResizeSqrPixel function supports only a single stepBytes. It assumes the input image is a general image, such as RGB or YUV 4:4:4, in which all channels have the same width. As I understand it, you are handling an image such as YUV (YCbCr, i.e. 4:2:0); in that case you need to call the function plane by plane, since stepBytes differs from plane to plane.
For example, when resizing a video frame in YUV 4:2:0 with 3 planes, the resize function is called three times.
Please try this, check stepBytes1, stepBytes2, and stepBytes3, and let us know if there is any problem.
Regards,
Ying
PS. One more resize function, ippiResizeYUV422_8u_C2R(), can handle image formats like YUV422 or YCrCb422.
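As a rough illustration of why each plane needs its own step, the per-plane sizes of a planar 4:2:0 frame can be computed as below. This is a sketch under stated assumptions: 8-bit samples, tightly packed rows with no padding, and chroma dimensions rounded up for odd frame sizes. PlaneLayout and yuv420_plane are hypothetical names, not IPP types or functions:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t width;      /* plane width in pixels   */
    size_t height;     /* plane height in pixels  */
    size_t step_bytes; /* bytes between two rows  */
} PlaneLayout;

/* Layout of plane 0 (Y), 1 (Cb) or 2 (Cr) of an 8-bit planar 4:2:0 frame.
   Assumes tightly packed rows; chroma is subsampled by 2 in both
   directions, rounding up for odd frame dimensions. */
PlaneLayout yuv420_plane(size_t frame_width, size_t frame_height, int plane)
{
    assert(plane >= 0 && plane <= 2);
    PlaneLayout p;
    if (plane == 0) {
        p.width = frame_width;
        p.height = frame_height;
    } else {
        p.width = (frame_width + 1) / 2;
        p.height = (frame_height + 1) / 2;
    }
    p.step_bytes = p.width; /* 1 byte per sample, no row padding */
    return p;
}
```

Each of the three layouts would then be handed to a separate resize call with its own source and destination step, as described in the post above. If the buffers come from an allocator that pads rows, the real step should be taken from the allocator instead of this packed value.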
For anyone who sees high CPU usage when multi-threading is on, the article below may describe one possible cause:
High CPU usage and Intel IPP threaded function
Regards,
Ying