We are using the IPPI multithreaded static libraries with dispatching. We are using the latest version of IPP available as of 1-Jan-2010. We are seeing tremendous improvements over the single-threaded libraries, often ~2x. We do no threaded image processing ourselves; we rely entirely on the libraries. It really is an excellent product.
Now our dilemma: on XP, only one of the cores appears to "dance" in the graph on Microsoft's Performance Monitor while our app is running. On our Win 7 machines, all cores "dance".
What's got me stumped is that *regardless of the OS (Win7 or XP)*, the speed of our algorithms is best predicted by the number of cores on the system. So two identical machines, one with XP and the other with Win7, would show pretty much the same timings when running our IPPI application, despite what the graphs in the Microsoft Performance Monitor would suggest.
If this is the case, why do I care? Because I recently installed our application at a customer site using a suite of Xeon 8-core machines, and the first thing the customer did was bring up the MS Performance Monitor. He asked, "Why is only one core used?" I gave him a dumb look and said I'd look into it.
I must understand this issue and would very much appreciate any insights you could provide!
I think what you see in Performance Monitor is the processor load averaged over some time slice. If the parallel regions inside an IPP function run quickly, you may not see the corresponding peaks in processor load in Performance Monitor. Demonstrating the benefits of IPP threading visually in Performance Monitor takes special effort: such an application has to be written differently from a typical application that is meant to solve real problems. For example, you could try calling an IPP function on a really large amount of data, so that the work in the parallel regions becomes detectable.
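To make the averaging argument concrete, here is a rough sketch of the arithmetic. It is only a toy model, not anything taken from IPP or from Performance Monitor: the function name `averaged_core_load` and its parameters are made up for illustration, and it assumes that all serial work stays on core 0 while the parallel regions keep every core busy for the same length of time. A real scheduler may move threads between cores, so treat the numbers as indicative only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model (not an IPP or Windows API): estimate the load a
// sampling monitor would display per core for one sampling interval.
//   cores        number of cores on the machine
//   interval_ms  length of the monitor's sampling interval
//   serial_ms    time spent in serial code (assumed to run on core 0 only)
//   parallel_ms  time spent in parallel regions (assumed to occupy all cores)
// Returns one value per core in the range 0.0 .. 1.0.
std::vector<double> averaged_core_load(int cores, double interval_ms,
                                       double serial_ms, double parallel_ms) {
    assert(cores > 0 && interval_ms > 0.0);
    std::vector<double> load(cores);
    for (int c = 0; c < cores; ++c) {
        // Core 0 carries the serial part plus its share of the parallel part;
        // every other core is busy only during the parallel regions.
        double busy_ms = parallel_ms + (c == 0 ? serial_ms : 0.0);
        load[c] = std::min(1.0, busy_ms / interval_ms);
    }
    return load;
}
```

For instance, `averaged_core_load(8, 1000.0, 900.0, 50.0)` gives about 0.95 for core 0 and about 0.05 for the other seven cores, so the graph would look as if only one core is working even though the parallel regions did use all of them. Making the parallel part dominate the interval (a much larger image, for example) pushes every core towards 1.0 in this model.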