Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

ippiFilter_32f not multithreaded

Thomas_Jensen1
Beginner

I'm using IPP 6.1.4 to filter 16u grayscale images, using the function ippiFilter_32f_C1R().

I noticed it was "slow" on my quad core processor, so I was thinking that maybe it did not use all 4 cores.

It only consumed 25% of the processor, indicating a single thread. My app is single-threaded, but IPP should be able to use more cores through OpenMP (OMP). I did enable OMP, and I did link against the threaded libraries (_t.lib).

In doc\ThreadedFunctionsList.txt, ippiFilter_32f is listed.

I'm sure OMP is enabled, because the UIC Jpeg2000 code is significantly faster when I do SetNumThreads(4).

Are there separate SetNumThreads calls for the IPP core and for the UIC samples?

I tested with the v8t and t7t libraries (on appropriate CPU systems).

I do call InitStatic(), and it does indicate v8 or t7.

1 Solution
Vladimir_Dudnik
Employee

Hi Thomas,

Threading in the UIC samples is implemented on top of IPP, in contrast to threading for IPP functions, which is implemented inside the IPP functions themselves. So you need to call ippSetNumThreads to control IPP threading. Also, threading depends on image size (the boundary is specific to each IPP function): it only starts to work above some boundary size, to prevent the slowdown that threading would cause when processing small images.
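To make the size boundary concrete, here is a minimal sketch in plain C of the kind of decision described above. It is not IPP code: `choose_thread_count` and `MIN_PIXELS_FOR_THREADING` are hypothetical names, and the cutoff value is an assumption chosen only for illustration. The real boundaries are internal to IPP and differ per function.

```c
#include <stddef.h>

/* Hypothetical cutoff: images with fewer pixels than this stay on one thread.
   This value is an assumption for illustration, not an IPP constant. */
#define MIN_PIXELS_FOR_THREADING ((size_t)256 * 256)

/* Return how many threads to use for a width x height image,
   given the thread count the caller asked for. */
static int choose_thread_count(int width, int height, int requested_threads)
{
    size_t pixels;

    if (width <= 0 || height <= 0 || requested_threads < 1)
        return 1;

    pixels = (size_t)width * (size_t)height;
    if (pixels < MIN_PIXELS_FOR_THREADING)
        return 1; /* too small: threading overhead would dominate */

    /* Never use more threads than there are rows to split between them. */
    return requested_threads < height ? requested_threads : height;
}
```

With this made-up cutoff, a 64x64 image would stay single-threaded, while a large image with 4 requested threads would use all 4.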

Regards,
Vladimir

Thomas_Jensen1
Beginner

My image is 3000x1500 at 16u_C1, so that size would most probably enable threading.

I'll take a look at ippSetNumThreads.

Thomas_Jensen1
Beginner

After I manually called ippSetNumThreads(), the speed went up.

Somehow, IPP did not initialize the number of threads to use.

Gennady_F_Intel
Moderator

This looks like an error, because by default the number of threads IPP runs on is equal to the number of processors in the system. You mentioned v8t and t7t, so this is the ia32 architecture. Which OS is it? We need to check this behaviour on our side. Or could you check this example when dynamic linkage is used?

--Gennady

Thomas_Jensen1
Beginner

It is all 32-bit, and I tested on an AMD quad-core (Windows 7) and a Xeon quad-core (Windows 2003).

My code uses custom DLL linking: I have created a single DLL containing 4 CPU libraries and a subset of the functions, compiled with Intel C++ 10.x with OMP support.

In another topic (today), I wrote that I find ippInit() confusing. As a programmer, I fully understand the need to call an init function in a library. However, since I'm using ippmerged.c, and since it has its own IppInitStatic() to initialize the dispatcher (addressbook) in my DLL, and since I see that that init code does not call ippInit(), I'm confused.

Now, I have added code in my DLL:

- Call InitStatic() in ippmerged.c (to set the addressbook in the DLL).

- Call ippInit() in ippcore.h (to initialize the IPP library, including the default number of threads to use).

However, when the documentation states that ippInit() will initialize the dispatcher, I say, "what dispatcher?", since I already have code that dispatches function calls to the proper CPU variant (using ippmerged.c).

Can you clarify?
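
For readers unfamiliar with the term, the kind of "addressbook" dispatching described in this post can be pictured roughly as a table of function pointers, one per CPU variant, selected once at init time. The sketch below is plain C and only illustrates the idea: it is not the contents of ippmerged.c, and every name in it (`cpu_kind`, `halve_generic`, `halve_v8`, `dispatch_init`, `halve`) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical CPU classes, standing in for library codes such as v8 or t7. */
enum cpu_kind { CPU_GENERIC = 0, CPU_V8 = 1, CPU_KIND_COUNT };

typedef void (*halve_fn)(const float *src, float *dst, size_t n);

/* Stand-in for the generic build of a primitive. */
static void halve_generic(const float *src, float *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i)
        dst[i] = src[i] * 0.5f;
}

/* Stand-in for a CPU-specific build: same result, loop unrolled by two. */
static void halve_v8(const float *src, float *dst, size_t n)
{
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        dst[i] = src[i] * 0.5f;
        dst[i + 1] = src[i + 1] * 0.5f;
    }
    if (i < n)
        dst[i] = src[i] * 0.5f; /* leftover element when n is odd */
}

/* The "addressbook": one entry per CPU class. */
static const halve_fn halve_table[CPU_KIND_COUNT] = { halve_generic, halve_v8 };

/* Entry currently in use; defaults to the generic variant. */
static halve_fn halve_active = halve_generic;

/* One-time init: select the variant for the detected CPU class. */
static void dispatch_init(enum cpu_kind cpu)
{
    if ((int)cpu < 0 || cpu >= CPU_KIND_COUNT)
        cpu = CPU_GENERIC;
    halve_active = halve_table[cpu];
}

/* Public entry point: forwards the call through the selected entry. */
static void halve(const float *src, float *dst, size_t n)
{
    halve_active(src, dst, n);
}
```

Note that nothing in a table like this says anything about thread counts, which fits the observation above that selecting the CPU variant and setting the number of threads turned out to be separate steps.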
