Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

CPU Problem

Lamp
Beginner
Hi,

We are using the IPP library to process images. The original CPU is an Intel Q9650: 4 cores, 2.66 GHz, with 12 MB L2 cache.

We have now bought a new computer whose CPU is an Intel i7-870: 4 cores, 2.93 GHz, with 1 MB L2 cache and 8 MB L3 cache.

The new computer is supposed to be faster than the original one, and the actual result matches that expectation; but when we run two threads in parallel, the new computer with the i7-870 shows a long delay.

Our image processing only invokes basic IPP functions. We are wondering whether the problem is related to the CPU cache, since the i7-870 has only 1 MB of L2 cache, compared to 12 MB of L2 cache for the Q9650.

Has anyone met the same problem? Is there any solution?


Regards,
1 Solution
PaulF_IntelCorp
Employee
Hello Xie,

I suspect your problem is related to hyper-threading. See this page for a comparison of the two processors:

http://ark.intel.com/Compare.aspx?ids=35428,41315

Along with the size and type of cache and memory bus, HT is a key difference between these two machines. Try adding an ippSetNumThreads(4) to the initialization part of your application to see if that makes a difference.

Paul

p.s. Please see this KB article for more information:
http://software.intel.com/en-us/articles/performance-of-crypto-sample-for-openssl-slowing-down-on-hyper-threading-systems/
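A minimal sketch of the suggestion above, assuming the application has a single initialization routine that runs before any image processing. The helper name `init_ipp_threads` and the `USE_IPP` build flag are hypothetical and only there so the snippet also compiles on a machine without IPP; `ippSetNumThreads`, `ippGetStatusString`, and `ippStsNoErr` are the IPP core calls from that era, and newer IPP releases may mark the threading-control functions as deprecated.

```c
#include <stdio.h>

#ifdef USE_IPP              /* hypothetical flag: define it when building against IPP */
#include <ipp.h>
#endif

/* Ask IPP to use at most `n` internal threads.
 * Call once during initialization, before any IPP processing.
 * Returns the thread count that was requested. */
int init_ipp_threads(int n)
{
    if (n < 1)
        n = 1;              /* a thread count below 1 makes no sense */

#ifdef USE_IPP
    IppStatus st = ippSetNumThreads(n);
    if (st != ippStsNoErr)
        fprintf(stderr, "ippSetNumThreads(%d): %s\n", n, ippGetStatusString(st));
#else
    /* Built without IPP: only report what would have been requested. */
    fprintf(stderr, "built without IPP; would call ippSetNumThreads(%d)\n", n);
#endif
    return n;
}
```

On the i7-870 discussed here, the call would be `init_ipp_threads(4)`, which matches the four physical cores rather than the eight logical processors that hyper-threading exposes.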

6 Replies
Vladimir_Dudnik
Employee
Hello,

if you only use basic IPP functions, you may try the Deferred Mode Image Processing (DMIP) layer, which is built on top of IPP and dedicated to building parallel processing pipelines. DMIP is available in the IPP sample package; check the folder image-processing\dmip

Regards,
Vladimir
Lamp
Beginner
Hi,

I understand that DMIP will improve the performance of parallel processing, but we are still concerned about why the Q9650 works smoothly and the i7 does not; technically the i7 is stronger than the Q9650.

We don't want to spend time updating our application to move to DMIP. Please tell us if the i7 is really not good for processing images in parallel; in that case we will change the CPU back to the Q9650.

Regards,
Xie
Vladimir_Dudnik
Employee
Hi Xie,

Performance depends on a number of factors. It is difficult to tell you what might be tuned for better performance in your application without examining the whole source code. That's why I suggested that you try DMIP, which should almost automatically tune your processing pipeline for best performance.

Regards,
Vladimir
Thomas_B_3
Beginner
Hello,

I agree with Vladimir that with such a short description of your problem it is difficult to give a well-educated guess.

Can you give a more detailed description of what is meant by "but if we want to run two thread parallel"? Do you have two instances of your program running? If you have hyper-threading activated on your Core i7, then this may cause performance issues; see the documentation for MKL. Hyper-threading really helps if the two threads running on the same physical core perform different tasks. Highly parallelized algorithms running the same code in each thread do not benefit significantly if two threads run on the shared resources of one core. There are several threads in this forum on this topic. In the end it depends on what you mean by "but if we want to run two thread parallel".

Best regards,
TJ
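The hyper-threading point above comes down to simple arithmetic: the number of threads that can each have a core's execution resources to themselves is the logical processor count divided by the hardware threads per core. The sketch below illustrates this with the two CPUs from this thread. The struct and function names are hypothetical, and the topology values are entered by hand (8 logical processors at 2 per core for the i7-870, 4 at 1 per core for the Q9650) rather than queried from the operating system.

```c
#include <stdio.h>

/* Hand-entered description of a CPU; nothing here is queried from the OS. */
struct cpu_topology {
    const char *name;
    int logical_cpus;       /* processors visible to the operating system */
    int threads_per_core;   /* 2 with hyper-threading enabled, otherwise 1 */
};

/* Number of physical cores, i.e. how many compute-bound threads running
 * the same code can execute without sharing one core's resources. */
int physical_cores(const struct cpu_topology *t)
{
    if (t->logical_cpus < 1 || t->threads_per_core < 1)
        return 1;           /* fall back to one thread on bad input */
    int n = t->logical_cpus / t->threads_per_core;
    return n > 0 ? n : 1;
}

/* Print the thread count that avoids doubling up on a physical core. */
void print_recommendation(const struct cpu_topology *t)
{
    printf("%s: %d logical processors, use up to %d compute threads\n",
           t->name, t->logical_cpus, physical_cores(t));
}
```

Both CPUs end up with four physical cores; the difference is that the i7-870 presents eight logical processors, so a library that sizes its thread pool from the logical count will place two identical worker threads on each core.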
Lamp
Beginner
Thanks Paul,

Yes, you are right: ippSetNumThreads(4) is the solution.

Thanks to all of you for the kind help.

Regards,
Xie