Imagine I have a pipeline:
Image in Row Major - Process block 1 - Convert image to col major (transpose) - Process block 2
Now, I can use IPP functions in either stage (1 or 2) by specifying the appropriate step size (1 or width of image) for source and destination images.
What is the performance impact if any? I tried an experiment with ippiResizeAntialiasing_8u_C1R, and I did not notice any significant performance difference (ipp 9.0, glnxa64, avx). Doing it in process block 1 (with step size ==1) had a very minor win over doing a similar operation in process block 2 (with step size == width of image).
Does this hold for all ipp functions? Or are there functions where the step size (and associated row/col majorness) will have a significant impact on a library function's performance?
I am not sure the two processes (block 1 and block 2) would produce the same output. Could you please provide a test case? I will test it and try to reproduce. Thank you.
My question is a general one, not really related to any specific IPP function.
In general, for any IPP function that needs a source and destination step size (e.g. ippiFilter_64f_C1R), does the value of the step size have any impact on the performance? If source step = 1, I assume achieving cache locality is easier than when it's the image width.
The impact of the step depends on the size of your image; the step is the distance between the starting points of consecutive lines of the image. If you would like to zoom in or out of an image with equal proportions, the step should be equal to the width of the source/destination image. If you set the step to 1, could you make sure the output is correct? I am afraid that Block 1 and Block 2 would not perform the same operation. Could you please provide a test case, even one where the input is a small matrix, so we can see the output?
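To make the meaning of the step concrete, here is a minimal sketch in plain Python (it does not call IPP). It assumes the usual IPP convention that the step is the distance in bytes between the starts of consecutive lines, and it uses 8-bit single-channel pixels so that one pixel is one byte. The helper `pixel_at` and the sample buffer are hypothetical, for illustration only.

```python
def pixel_at(buf, step, x, y):
    """Return the 8-bit pixel at column x, row y of a flat buffer.

    step is the distance in bytes between the starts of consecutive rows.
    """
    return buf[y * step + x]


# A 3-pixel-wide, 2-row image stored with one padding byte per row,
# so the step (4) is larger than the width (3).
width, height, step = 3, 2, 4
buf = bytes([10, 11, 12, 0,    # row 0 + padding byte
             20, 21, 22, 0])   # row 1 + padding byte

rows = [[pixel_at(buf, step, x, y) for x in range(width)]
        for y in range(height)]
print(rows)  # [[10, 11, 12], [20, 21, 22]]

# With step == 1, consecutive "rows" start one byte apart and overlap,
# so the buffer is no longer read as the intended image.
overlapping = [[pixel_at(buf, 1, x, y) for x in range(width)]
               for y in range(height)]
print(overlapping)  # [[10, 11, 12], [11, 12, 0]]
```

Under these assumptions, a step of 1 does not mean "row major"; it describes a different (overlapping) image, which is why the output needs to be checked.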
I simulated your process to create a test case, and the outputs of process 1 and process 2 are different. You can check the attached code; is that what you mean? If it is, how can the performance be compared when the two processes do different work? I am a bit confused, so please provide your test case, or modify the attached file directly. Thank you.
Fiona, thanks for taking the time to write a test case. From a quick look, I think you will also need to flip the destination size specification in the second call. I'll edit it to explain better what I mean.
In the meantime, I want to reiterate that this is a conceptual question about the impact of source and destination step value on the execution performance.
1. 8x4 image in row major -> IPP function call (step size == 1) -> transpose -> output image
2. 8x4 image in row major -> transpose -> IPP function (step size == 4) -> (potentially another transpose) -> output image
If step size == 1, then accessing successive pixels of a single raster line is very cache friendly; if it's large, like the image width, then I don't expect the data access to be as cache friendly. I am wondering whether IPP does anything special in its implementation to differentiate between step size == 1 and everything else.
Never mind - I think I completely misunderstood the implication of source step between row and column.
(i.e., it doesn't matter; it's just the length of the first dimension, whichever that happens to be, row or column).
Apologies, and thanks for your patience Fiona!
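The conclusion above, that the step is just the length of the first (contiguous) dimension, can be sketched in plain Python. This is not IPP code: `pixel_at` is a hypothetical helper, and 8-bit single-channel pixels with no row padding are assumed, so one pixel is one byte and the step equals the line length. The sketch stores an image 8 pixels wide and 4 high in column-major order and reads the same bytes as a row-major image 4 wide and 8 high with step 4.

```python
def pixel_at(buf, step, x, y):
    """Pixel at column x, row y; step = bytes between consecutive line starts."""
    return buf[y * step + x]


width, height = 8, 4
image = [[10 * y + x for x in range(width)] for y in range(height)]

# Row-major storage: each row is contiguous, so the step is the width (8).
row_major = bytes(image[y][x] for y in range(height) for x in range(width))

# Column-major storage: each column is contiguous.
col_major = bytes(image[y][x] for x in range(width) for y in range(height))

# Reading the row-major buffer with step 8 recovers the original image.
original = [[pixel_at(row_major, width, x, y) for x in range(width)]
            for y in range(height)]

# The column-major buffer is simply a row-major image of the transpose:
# 4 pixels wide, 8 lines high, with step 4.
transposed = [[pixel_at(col_major, height, x, y) for x in range(height)]
              for y in range(width)]

print(original == image)                                # True
print(transposed == [list(col) for col in zip(*image)]) # True
```

In both layouts the inner loop over `x` walks adjacent bytes; only the line length (8 versus 4) changes, so under these assumptions the step value alone does not make one layout less cache friendly than the other.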