Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Multi-core use by IPP v6.0 ippmInvert_ma_64f

pete_klostermanflash
325 Views
We have been using the IPP v5.1 small matrices libraries for matrix math. I recently downloaded the IPP 6.0.1.070 libraries and am investigating whether the threaded functions in v6.0 will give improved throughput on a multi-core computer.

In a trial app that calls ippmInvert_ma_64f on moderately large matrices (e.g., one with 63 rows and columns, which takes a couple of seconds to invert), I have not seen both cores being used: taskmgr.exe CPU usage is no greater than 50%, and the Intel Thread Profiler confirms that processing is all serial. I have a Core Duo T2500 processor and am compiling with Visual Studio 2008 against what I believe are the correct static libraries: the VCLinkerTool AdditionalDependencies are ippmemerged.lib, ippmmerged_t.lib, libiomp5mt.lib, ippcore_t.lib, and libircmt.lib. ippGetNumThreads says that I have two threads.

Maybe I'm misunderstanding what is meant by a threaded function - I was hoping that calling ippmInvert_ma_64f would generate two threads and do a large part of the processing in parallel.

Thanks in advance for your help,

Pete
0 Kudos
4 Replies
Vladimir_Dudnik
Employee
325 Views
Hi Pete,


Not every IPP function is designed to launch threads internally, only those that may get a performance benefit from threading. We do provide a list of the functions that are threaded; please take a look in the IPP documentation folder of your installation.

Regards,
Vladimir
0 Kudos
pete_klostermanflash
325 Views
Hi Vladimir,

I believe that the list of functions you're referring to is in the file ThreadedFunctionsList.txt, which includes ippmInvert_ma_64f. Why would Intel include ippmInvert_ma_64f in the list of threaded functions if there is no performance benefit from threading?

Pete
0 Kudos
Vladimir_Dudnik
Employee
325 Views
Oops, if this function is included in the list of threaded functions, it means it uses internal threading under some conditions.

I've just checked the actual condition that turns on internal threading in this particular function. It is

2*count*widthHeight > 2500

The condition is platform dependent and was chosen as a result of performance analyses. This particular function is threaded only in the V8 and U8 IPP libraries.
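
For illustration only, the condition above can be written as a small C check. This is a sketch, not part of IPP: the helper names invert_would_thread and min_count_for_threading are hypothetical, and the 2500 threshold is simply the value quoted in this post, which is platform dependent and applies only to the V8 and U8 libraries.

```c
/* Hypothetical helpers (NOT IPP functions): they only restate the
 * threshold quoted in this thread, 2*count*widthHeight > 2500. */

/* Returns 1 if the quoted condition holds for this call, else 0. */
static int invert_would_thread(int count, int widthHeight)
{
    /* long long avoids overflow for very large counts */
    return 2LL * count * widthHeight > 2500LL;
}

/* Smallest count for which the quoted condition holds at a given
 * matrix size (widthHeight must be positive). */
static int min_count_for_threading(int widthHeight)
{
    /* strict inequality: floor(2500 / (2*widthHeight)) + 1 */
    return 2500 / (2 * widthHeight) + 1;
}
```

Under these assumptions, invert_would_thread(2, 63) returns 0, while an array of 3x3 matrices would need min_count_for_threading(3) = 417 matrices before the condition is met.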

Regards,
Vladimir
0 Kudos
pete_klostermanflash
325 Views
An interesting answer. On my Core Duo T2500 processor, I did not observe any parallel processing (monitored via taskmgr.exe) with a count of 2 and widthHeight values up to 3000.

The matrices we're talking about here are pretty large: count 2 and widthHeight 63 takes 5 seconds to process. If you assume that the matrix inversion execution time scales as widthHeight**2, inverting with a count of 2 and a widthHeight of 1250 would take 1968.4 seconds, or 32.8 minutes, on my box.
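
The extrapolation can be checked with a few lines of C. This is a rough sketch: the helper name estimate_quadratic_seconds is hypothetical, and it takes the 63x63 matrix from the first post with the 5-second timing as the baseline, which reproduces the 1968.4-second figure under the assumed widthHeight**2 scaling.

```c
/* Hypothetical helper: extrapolates a measured run time to another
 * matrix size, ASSUMING time scales with the square of widthHeight
 * (the assumption stated in the post, not a measured law). */
static double estimate_quadratic_seconds(double base_seconds,
                                         double base_width_height,
                                         double target_width_height)
{
    double ratio = target_width_height / base_width_height;
    return base_seconds * ratio * ratio;
}
```

With these inputs, estimate_quadratic_seconds(5.0, 63.0, 1250.0) comes to roughly 1968.4 seconds, or about 32.8 minutes.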

It looks like IPP v6.0 won't help for our application - we're inverting 3x3 matrices thousands of times per run, which takes less than a second using IPP 5.1; setting up a big array, with, say, 1000 matrices, would be inconvenient and would have its own problems - about 1% of the time our 3x3 matrices turn out to be non-invertible, so we have to re-condition them.

I wonder if Intel has considered making this kind of information (no multi-threading unless 2*count*widthHeight > 2500) available to the general public, to save time getting counsel from the Software Network Forums.

Pete
0 Kudos