Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Hang first time entering an IPP function

rich-kulakowski
Beginner
465 Views

Greetings,

I am converting from IPP 4.1 to 5.3.4 and finally have it compiling and linking. When I run it, the first (?) IPP function that gets called, probably ippmTranspose_ma_64f_P, hangs. Looking at the call stack in WinDbg, I see the following:

ChildEBP RetAddr
2df7e69c 7c90d85c ntdll!KiFastSystemCallRet
2df7e6a0 7c8023ed ntdll!NtDelayExecution+0xc
2df7e6f8 7c802451 kernel32!SleepEx+0x61
2df7e708 0278a4c3 kernel32!Sleep+0xf
WARNING: Stack unwind information not available. Following frames may be wrong.
2df7e714 02775cea libguide40!_kmp_reap_monitor+0x227
2df7e748 02775975 libguide40!_kmp_wait_sleep+0x162
2df7e79c 02774dc7 libguide40!_kmp_fork_call+0x2085
2df7e7b4 02774e83 libguide40!_kmp_fork_call+0x14d7
2df7e7dc 0277beec libguide40!_kmp_fork_call+0x1593
2df7e814 1de808bf libguide40!_kmpc_fork_call+0x34
2df7e8b8 1fc3f9c9 ippmv8_5_3!ippmTranspose_ma_64f_P+0xc580b
2df7e960 1fc3d0f8 GcCoordXformInterpFilter!BiCubicInterpolator::Interpolate+0x149

I set OMP_NUM_THREADS=1, but it still appears to be trying to do multithreading. Did I miss an initialization step somewhere? I couldn't find one in the docs, but I have been going back and forth between 4.1 and 5.3 so much that it wouldn't surprise me if it's in there.

It's a Core 2 Duo processor running WinXP Pro.

Thanks.

Rich.

4 Replies
rich-kulakowski
Beginner
OK, I added an ippSetNumThreads(1) call in my mainline, and that seems to allow it to come up. Is there an environment variable that can be used for this in the IPP library? OMP_NUM_THREADS doesn't seem to work in this case. Also, will this affect using regsvr32 to register a DLL that isn't the main one?
Thanks.

Rich.

Vladimir_Dudnik
Employee
Sorry, I did not really understand your issue. What do you mean when you say the first function that gets called hangs? How do you use this function? Do you have a simple test case that reproduces the issue?
Regards,
Vladimir

rich-kulakowski
Beginner

Vladimir,

I'm sorry for not explaining myself more completely. Sometimes I forget everyone is not familiar with how I/we work here. Anyway...

No, we don't have a small sample that shows the issue. Our application is rather large. We have DLLs that include libs that have IPP calls in them. We do not use OpenMP for anything. Our application creates 40 or so threads of its own. We are running on Core 2 Duo processors (2 cores) with 4 GB of RAM. The system is a mix: most of it is compiled with MSVC, and some libs/DLLs are compiled with the Intel compiler. Those are older, calculation-intensive libs that benefited from the extra optimizations we got from ICL, a 15% or greater speedup in most cases. Good job!!

We have reached a hold point in the project and are updating from ICL 9.1 and IPP 4.1 to ICL 10.1 and IPP 5.3. ICL 10.1 is causing us problems right now because regsvr32 calls seem to fail on DLLs that have IPP libs linked in. I am working with Jennifer to resolve that. If I compile with 9.1, the regsvr32 calls work, but running the application fails with a hang. The offending lib is the one indicated in my first post above. The ippmTranspose call appears to hang while making libguide calls that try to initialize OpenMP. I found that I can avoid this hang by adding a call to ippSetNumThreads near the start of our application. My questions are the following.

1) Why does IPP assume it can create as many threads as there are processors? Shouldn't it default to the previous behavior and only create more threads if allowed/requested?

2) It appears the Intel compiler uses an environment variable to control how many OpenMP threads are allowed. Why doesn't IPP look at the same one, or at one of its own?

3) Does going to IPP 6.0 (just released), with its different OpenMP library, change any of this?

Thanks. Rich.

Vladimir_Dudnik
Employee
Hi Rich,
It seems this issue is related to delayed DLL loading in Windows. For IPP threading to work correctly, the OpenMP run-time must be initialized properly. That initialization is triggered by whichever of two events happens first: the IPP core DLL is loaded into the process, or a threaded IPP function is called for the first time.

An important note: OpenMP initialization should be done in serial code. It might not work correctly when it happens on one of your application's threads after many threads have already been created through another threading API. When you call ippSetNumThreads (which is defined in the IPP core DLL) at the beginning of your application, it causes the IPP core DLL to be loaded into your process and the OpenMP run-time to be initialized. If you do not do this, and instead call ippmTranspose from one of the many threads created by your application, that call attempts to initialize the OpenMP run-time, and this might not work correctly in that situation.

So our recommendation is to call any function from IPP core (for example, ippGetLibVersion) at the beginning of your application, so that OpenMP initialization happens before any other threads are created. After that, IPP threading should work correctly even after you create additional threads. If you want to disable IPP threading, you can call ippSetNumThreads(1). This seems more useful in practice than defining an environment variable, because it lets you control IPP threading at run time.
Regards,
Vladimir
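
For readers landing on this thread, the recommendation above might look roughly like the following C sketch. It is only an illustration under some assumptions: the helper name init_ipp_serial and the HAVE_IPP macro are hypothetical, and the stub definitions in the #else branch are stand-ins (not the real library) that exist only so the sketch compiles without IPP installed. The two IPP calls, ippGetLibVersion and ippSetNumThreads, are the ones named in the reply; check their availability in your IPP version.

```c
#include <stdio.h>

#ifdef HAVE_IPP            /* hypothetical macro: define it when building against IPP */
#include <ipp.h>
#else
/* Stand-ins, NOT the real library, so the sketch compiles on its own. */
typedef int IppStatus;
enum { ippStsNoErr = 0 };
typedef struct { const char *Name; const char *Version; } IppLibraryVersion;
static const IppLibraryVersion *ippGetLibVersion(void)
{
    static const IppLibraryVersion v = { "stub", "0.0" };
    return &v;
}
static IppStatus ippSetNumThreads(int numThr)
{
    return numThr > 0 ? ippStsNoErr : -1;
}
#endif

/* Hypothetical helper: call once from the main thread, before the
 * application creates any worker threads. Touching an IPP core function
 * here is meant to trigger the core DLL load and OpenMP initialization
 * in serial code. Pass num_threads = 1 to disable IPP threading, or 0
 * to leave the default thread count alone.
 * Returns 0 on success, -1 on failure. */
static int init_ipp_serial(int num_threads)
{
    const IppLibraryVersion *ver = ippGetLibVersion();
    if (ver == NULL)
        return -1;
    printf("IPP core: %s %s\n", ver->Name, ver->Version);

    if (num_threads > 0 && ippSetNumThreads(num_threads) != ippStsNoErr)
        return -1;
    return 0;
}
```

In an application like the one described in this thread, init_ipp_serial(1) would be called as the first statement of main (or WinMain), ahead of the code that creates the application's own threads.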
