08-18-2011 11:43 AM
I'm seeing performance 10 to 15 times slower when I copy from system RAM to video RAM with ippiCopy after calling ippStaticInitCpu with a value greater than or equal to ippCpuP4. Initialising with values of ippCpuPIII or less works fine. There are severe performance penalties in Windows DirectShow for reading from video RAM, but I expected that copying to video RAM wouldn't be a problem.
I'm aware that this behaviour is likely to be highly system and BIOS dependent, and that there are workarounds such as doing my own manual dispatching for this operation to prevent the use of P4-or-above features.
What worries me is what my general approach should be for avoiding problems like this on an arbitrary customer's system.
Is it generally safe to use IPP to output to video RAM at all if reading from video RAM is very expensive? Do some IPP functions that appear only to write to destination memory actually read from it as an optimization technique? Should I be using ippiCopyManaged for this operation to force a particular caching strategy for safety, even though it's not available in the particular mode I'm using?
My system config is:
x86 IPP 7.0 update 4 statically linked with dispatching in an x86 COM object
Visual Studio 2008 C++ x86
Dell Precision M5400
Windows7 x64 Professional
Intel Core i7 Q820 1.73GHz
Intel 5 series/3400 chipset
08-18-2011 08:56 PM
If the video data will not be used again soon, it is better to copy without caching the destination image. You can use the ippiCopyManaged function to control this behavior.
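For illustration, here is a minimal sketch of what "copying without caching the destination" can look like on x86, using SSE2 streaming stores. It is not how ippiCopyManaged is implemented; the helper name copyRowNonTemporal is hypothetical, and the code assumes an SSE2-capable compiler target (it falls back to memcpy otherwise).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2_STREAM 1
#else
#define HAVE_SSE2_STREAM 0
#endif

// Hypothetical helper (not an IPP function): copies one row of bytes and,
// where SSE2 is available, writes the bulk of it with non-temporal stores
// so the destination lines are not pulled into the cache.
void copyRowNonTemporal(const std::uint8_t* src, std::uint8_t* dst, std::size_t bytes)
{
    assert(src != nullptr && dst != nullptr);
    std::size_t i = 0;
#if HAVE_SSE2_STREAM
    // Streaming stores need a 16-byte aligned destination, so copy a
    // short unaligned head with ordinary byte stores first.
    while (i < bytes && (reinterpret_cast<std::uintptr_t>(dst + i) & 15u) != 0) {
        dst[i] = src[i];
        ++i;
    }
    for (; i + 16 <= bytes; i += 16) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        _mm_stream_si128(reinterpret_cast<__m128i*>(dst + i), v);  // write only, no destination read
    }
    _mm_sfence();  // order the streamed writes before any later stores
#endif
    // Remaining tail (or the whole row when SSE2 is unavailable).
    if (i < bytes) {
        std::memcpy(dst + i, src + i, bytes - i);
    }
}
```

An image copy would call this once per row, advancing the source and destination pointers by their respective steps. Whether it helps for a given video surface depends on how that memory is mapped, so it is worth measuring on the target system.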
08-19-2011 06:07 AM
Thanks for the suggestion. It looks like the problem is related to alpha channel processing. I should have said that I'm using ippiCopy_8u_AC4R. If I substitute ippiCopy_8u_C1R with a 4x wider ROI then the performance is great. For my purposes these are equivalent, as the alpha channel is not used. The performance of ippiCopyManaged is pretty similar. Presumably 8u_AC4R preserves any alpha information in the destination image by reading the destination image and then bitwise combining it with the source image channels.
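To make the suspected difference concrete, here is a small sketch in plain C++ with no IPP calls. The function names copyKeepDstAlpha and copyAllBytes are hypothetical, and the read-modify-write shown is only an assumption about how an alpha-preserving copy might be vectorised, not a statement about IPP's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Builds a 4-byte mask that selects the three colour bytes of a pixel
// as they are laid out in memory, independent of endianness.
static std::uint32_t colourMask()
{
    const std::uint8_t bytes[4] = {0xFF, 0xFF, 0xFF, 0x00};
    std::uint32_t m;
    std::memcpy(&m, bytes, sizeof m);
    return m;
}

// Hypothetical stand-in for an AC4-style copy: colour bytes come from the
// source, the alpha byte of the destination is kept. Writing whole 32-bit
// pixels forces a read of the destination to merge in the old alpha.
void copyKeepDstAlpha(const std::uint8_t* src, int srcStep,
                      std::uint8_t* dst, int dstStep,
                      int width, int height)
{
    assert(width >= 0 && height >= 0);
    const std::uint32_t mask = colourMask();
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* s = src + static_cast<std::ptrdiff_t>(y) * srcStep;
        std::uint8_t* d = dst + static_cast<std::ptrdiff_t>(y) * dstStep;
        for (int x = 0; x < width; ++x) {
            std::uint32_t sp, dp;
            std::memcpy(&sp, s + 4 * x, 4);
            std::memcpy(&dp, d + 4 * x, 4);  // destination read: the expensive part on video RAM
            const std::uint32_t merged = (sp & mask) | (dp & ~mask);
            std::memcpy(d + 4 * x, &merged, 4);
        }
    }
}

// Flat copy of 4 * width bytes per row, analogous to treating the image as
// single-channel with a 4x wider ROI. It never reads the destination, but
// it also overwrites the destination alpha bytes.
void copyAllBytes(const std::uint8_t* src, int srcStep,
                  std::uint8_t* dst, int dstStep,
                  int width, int height)
{
    assert(width >= 0 && height >= 0);
    for (int y = 0; y < height; ++y) {
        std::memcpy(dst + static_cast<std::ptrdiff_t>(y) * dstStep,
                    src + static_cast<std::ptrdiff_t>(y) * srcStep,
                    static_cast<std::size_t>(width) * 4);
    }
}
```

The two functions produce the same colour data, so the flat copy is a safe substitute only when nothing downstream depends on the alpha bytes already in the destination.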
Are there any other subtle cases to watch for where IPP functions read from the destination?