Stefan_B_3
Beginner
243 Views

IPP 2017 running on Linux Ubuntu 8.04 (32-bit)

Hi,

I am aware this Linux distribution is not officially supported, but here: https://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-2017-sys... it says 'Note: Intel® IPP is expected to work on many more Linux distributions as well. Let us know if you have trouble with the distribution you use.' So I thought it might be OK to bring up this question here.

What happens on the platform named in the thread title is that whenever I call at least one IPP function (ippInit in this case), the process later does not terminate. Attaching a debugger always shows one thread with the following stack trace:

#0  0xb7691904 in ipp_is_GenuineIntel ()
#1  0xb7691236 in ippInit ()

The same software built on Kubuntu 8.04 (64-bit) runs just fine. We are linking statically.

Regards,

Stefan

 

 

18 Replies
Ying_H_Intel
Employee

Hi Stefan,

Thanks for the report. We will check it. Could you please tell us the exact IPP version and how you link IPP into your executable? Did you install IPP 2017 on the Ubuntu 8.04 (32-bit) machine, or did you just copy the executable file to it?

Best Regards,

Ying

 

Stefan_B_3
Beginner

Dear Ying,

thanks for getting back to me. The IPP version is 2017.0.0, as taken from ippversion.h. We do not link the IPP into executables but into *.so files, which an executable loads at runtime using 'dlopen'.

Linking is happening more or less like this:

/usr/bin/g++ -shared -Wl,-soname,libXYZ.so -o ../../lib/x86/libXYZ.so.2.17.4 XYZ.o  -lpthread -L../../lib/x86 ../../Toolkits/ipp/lib/linux/ia32/libippcc.a ../../Toolkits/ipp/lib/linux/ia32/libippi.a ../../Toolkits/ipp/lib/linux/ia32/libipps.a ../../Toolkits/ipp/lib/linux/ia32/libippcore.a -lrt -lc

The libs you see here are all we use from the IPP. The IPP is NOT installed on the target system. The system we build on is also the system we later run the code on.

Basically running happens like this:

An executable dynamically loads multiple shared objects, all linked as described above, so I would assume the IPP-related code resides in process memory multiple times (once per *.so). Since we are running on Linux, this should result in only the code from the first *.so file actually being executed, but that should not make a difference, should it? Of course one could argue that in such a case linking dynamically to the IPP might save disk/memory space, but for several reasons this is not an option in our scenario.

Hope this makes sense!

Regards,

Stefan

Ying_H_Intel
Employee

Hi Stefan, 

It should be OK to build a dynamic library libXYZ.so on top of the IPP static libraries such as libippi.a, libipps.a and libippcore.a.

Do you call the function ippInit() in your code? If so, could you try removing the call?

ippInit() has not needed to be called explicitly since IPP 9.0: the library now initializes itself automatically during the first call of any IPP function (other than those from the ippCore domain).

Best Regards,

Ying 

P.S. What is your CPU type? For your reference, we happened to discuss another ippInit() issue in https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/702147.

Jing_X_Intel
Employee

Hi Stefan,

Could you provide a small test case to us for investigation, please?

-------------------------

btw., would "/usr/bin/g++ -fPIC -shared -Wl,-soname,libXYZ.so.2 -o ../../lib/x86/libXYZ.so.2.17.4 XYZ.o  -lpthread -L..." be helpful?

Regards,

Jing

Stefan_B_3
Beginner

Hi there, and sorry for the long silence! Since migrating from IPP 2018.3 to 2019.0 this problem has reappeared. In general, everything stated here is still true. In addition, we do compile using -fPIC (I did not answer that question last time).

Igor_A_Intel
Employee

Hi Stefan,

Could you provide a disassembly of the loop in frame #0? It is a very simple function that cannot get into an infinite loop:

it just executes cpuid and compares the result with the "GenuineIntel" string. 13 lines of code.
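
For readers following along, the check described here can be approximated in C. This is only a sketch and not Intel's source: the function names are hypothetical, and the CPUID part assumes GCC or Clang on x86, whose <cpuid.h> header provides __get_cpuid(). CPUID leaf 0 returns the vendor string in EBX, EDX and ECX, in that order.

```c
#include <stdint.h>
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h> /* GCC/Clang header that provides __get_cpuid() */
#endif

/* Rebuild the 12-character vendor string from the three registers that
 * CPUID leaf 0 fills in: EBX holds characters 0-3, EDX 4-7, ECX 8-11,
 * lowest byte first. */
void vendor_from_regs(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; ++r)
        for (int b = 0; b < 4; ++b)
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xffu);
    out[12] = '\0';
}

/* Returns 1 if the registers spell "GenuineIntel", otherwise 0. */
int regs_say_genuine_intel(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    char vendor[13];
    vendor_from_regs(ebx, edx, ecx, vendor);
    return strcmp(vendor, "GenuineIntel") == 0;
}

/* Hypothetical stand-in for ipp_is_GenuineIntel(): 1 = Intel, 0 = other
 * vendor, -1 = CPUID not available on this build target. */
int is_genuine_intel_sketch(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return -1;
    return regs_say_genuine_intel(ebx, edx, ecx);
#else
    return -1;
#endif
}
```

The three constants that show up in the disassembly posted further down in this thread (0x756e6547, 0x49656e69, 0x6c65746e) are "Genu", "ineI" and "ntel" stored lowest byte first. There is no loop anywhere in such a function, which is why a hang inside it points at something outside the function itself.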

regards, Igor

 

 

Stefan_B_3
Beginner

Dear Igor,

not quite sure what I was expected to provide, but I got the following stack trace:

#0  0xb771aa82 in ipp_is_GenuineIntel () from ...
#1  0xb771a396 in ippInit () from ...
#2  0xb5997c98 in ?? () ...
#3  0x0fc08500 in ?? ()

Attached you find a file containing the memory of that process from

0xb771a000 to 0xb771b000

dumped using the following gdb command:

dump memory /tmp/dump.hex 0xb771a000 0xb771b000

Does that help you? If not, what exactly do you want me to do? And by the way: what is the difference between the static libs located in 'ia32_and' and 'ia32_lin'? Is 'and' Android by any chance? Is there documentation somewhere explaining the different versions of the libs? All I find is a few sentences about threaded and single-threaded versions, but that didn't really help a lot.

Regards,

Stefan

Igor_A_Intel
Employee

hi Stefan,

I disassembled your dump:

XDIS a75: PUSH      BASE       55                       push ebp
XDIS a76: DATAXFER  BASE       89E5                     mov ebp, esp
XDIS a78: PUSH      BASE       51                       push ecx
XDIS a79: PUSH      BASE       52                       push edx
XDIS a7a: PUSH      BASE       53                       push ebx
XDIS a7b: DATAXFER  BASE       B800000000               mov eax, 0x0
XDIS a80: MISC      BASE       0FA2                     cpuid 
XDIS a82: LOGICAL   BASE       31C0                     xor eax, eax
XDIS a84: BINARY    BASE       81F96E74656C             cmp ecx, 0x6c65746e
XDIS a8a: COND_BR   BASE       7515                     jnz 0xaa1
XDIS a8c: BINARY    BASE       81FA696E6549             cmp edx, 0x49656e69
XDIS a92: COND_BR   BASE       750D                     jnz 0xaa1
XDIS a94: BINARY    BASE       81FB47656E75             cmp ebx, 0x756e6547
XDIS a9a: COND_BR   BASE       7505                     jnz 0xaa1
XDIS a9c: DATAXFER  BASE       B801000000               mov eax, 0x1
XDIS aa1: POP       BASE       5B                       pop ebx
XDIS aa2: POP       BASE       5A                       pop edx
XDIS aa3: POP       BASE       59                       pop ecx
XDIS aa4: POP       BASE       5D                       pop ebp
XDIS aa5: RET       BASE       C3                       ret 

Your bt shows 0xb771aa82 in ipp_is_GenuineIntel (), which corresponds to "a82" above. Does this mean that your thread hangs after the cpuid instruction?

Since you can run bt in gdb, could you also run the "disass" command? What does "disassemble" show on the screen? Could you run the "si" command? Which address/function name do you see? You can enter the command "display /i $pc" and then use "si" to find out which loop you are in.

regards, Igor

 

Stefan_B_3
Beginner

Dear Igor,

there you go:

#28 0x0833d70d in main ()
(gdb) t 11
[Switching to thread 11 (Thread 0xb3366b90 (LWP 17036))]#0  0xb76b62a2 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
(gdb) bt
#0  0xb76b62a2 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
#1  0xb76b5bb6 in ippInit () from /tmp/myProject/lib/libMyLib1.so.2
#2  0xb454588c in ?? () from /tmp/myProject/lib/libMyLib2.so
#3  0x0fc08500 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) disass
Dump of assembler code for function ipp_is_GenuineIntel:
0xb76b6295 <ipp_is_GenuineIntel+0>:     push   %ebp
0xb76b6296 <ipp_is_GenuineIntel+1>:     mov    %esp,%ebp
0xb76b6298 <ipp_is_GenuineIntel+3>:     push   %ecx
0xb76b6299 <ipp_is_GenuineIntel+4>:     push   %edx
0xb76b629a <ipp_is_GenuineIntel+5>:     push   %ebx
0xb76b629b <ipp_is_GenuineIntel+6>:     mov    $0x0,%eax
0xb76b62a0 <ipp_is_GenuineIntel+11>:    cpuid  
0xb76b62a2 <ipp_is_GenuineIntel+13>:    xor    %eax,%eax
0xb76b62a4 <ipp_is_GenuineIntel+15>:    cmp    $0x6c65746e,%ecx
0xb76b62aa <ipp_is_GenuineIntel+21>:    jne    0xb76b62c1 <ipp_is_GenuineIntel+44>
0xb76b62ac <ipp_is_GenuineIntel+23>:    cmp    $0x49656e69,%edx
0xb76b62b2 <ipp_is_GenuineIntel+29>:    jne    0xb76b62c1 <ipp_is_GenuineIntel+44>
0xb76b62b4 <ipp_is_GenuineIntel+31>:    cmp    $0x756e6547,%ebx
0xb76b62ba <ipp_is_GenuineIntel+37>:    jne    0xb76b62c1 <ipp_is_GenuineIntel+44>
0xb76b62bc <ipp_is_GenuineIntel+39>:    mov    $0x1,%eax
0xb76b62c1 <ipp_is_GenuineIntel+44>:    pop    %ebx
0xb76b62c2 <ipp_is_GenuineIntel+45>:    pop    %edx
0xb76b62c3 <ipp_is_GenuineIntel+46>:    pop    %ecx
0xb76b62c4 <ipp_is_GenuineIntel+47>:    pop    %ebp
0xb76b62c5 <ipp_is_GenuineIntel+48>:    ret    
End of assembler dump.
(gdb) si
0xb76b62a4 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62a4 <ipp_is_GenuineIntel+15>:    cmp    $0x6c65746e,%ecx
(gdb) si
0xb76b62aa in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62aa <ipp_is_GenuineIntel+21>:    jne    0xb76b62c1 <ipp_is_GenuineIntel+44>
(gdb) si
0xb76b62ac in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62ac <ipp_is_GenuineIntel+23>:    cmp    $0x49656e69,%edx
(gdb) si
0xb76b62b2 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62b2 <ipp_is_GenuineIntel+29>:    jne    0xb76b62c1 <ipp_is_GenuineIntel+44>
(gdb) si
0xb76b62b4 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62b4 <ipp_is_GenuineIntel+31>:    cmp    $0x756e6547,%ebx
(gdb) si
0xb76b62ba in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62ba <ipp_is_GenuineIntel+37>:    jne    0xb76b62c1 <ipp_is_GenuineIntel+44>
(gdb) si
0xb76b62bc in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62bc <ipp_is_GenuineIntel+39>:    mov    $0x1,%eax
(gdb) si
0xb76b62c1 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62c1 <ipp_is_GenuineIntel+44>:    pop    %ebx
(gdb) si
0xb76b62c2 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62c2 <ipp_is_GenuineIntel+45>:    pop    %edx
(gdb) si
0xb76b62c3 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62c3 <ipp_is_GenuineIntel+46>:    pop    %ecx
(gdb) si
0xb76b62c4 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62c4 <ipp_is_GenuineIntel+47>:    pop    %ebp
(gdb) si
Cannot access memory at address 0x35
(gdb) c
Continuing.

Program received signal SIGINT, Interrupt.
[Switching to Thread 0xb72e86e0 (LWP 17022)]
0xb76fc410 in __kernel_vsyscall ()
1: x/i $pc
0xb76fc410 <__kernel_vsyscall+16>:      pop    %ebp
(gdb) t 11
[Switching to thread 11 (Thread 0xb3366b90 (LWP 17036))]#0  0xb76b62a2 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
(gdb) si    
0xb76b62a4 in ipp_is_GenuineIntel () from /tmp/myProject/lib/libMyLib1.so.2
1: x/i $pc
0xb76b62a4 <ipp_is_GenuineIntel+15>:    cmp    $0x6c65746e,%ecx
(gdb)

From what this tells me, it is some kind of infinite loop. After I saw this I also checked with 'top' that the process indeed uses 100% CPU time. Whenever I break, I end up at 0xb76b62a4 <ipp_is_GenuineIntel+15> in this thread. I can then step through until the error (address 0x35 at the ret instruction) that you can see. I tried this multiple times...

Is there a chance that I linked the 'wrong' libs? Which static libs (from which rpm/folder) should I use? There are 4 versions: threaded, nonpic, ia32_and (what are these for?) and ia32_lin (the ones I use right now).

Any other idea what could cause this?

Regards,

Stefan

 

Igor_A_Intel
Employee

hi Stefan,

It looks like the stack frame is corrupted; 0x35 is a bad address. The "and" libs are, as you assumed, intended for Android (they are built with the Android NDK). Could you try to debug your app from the very beginning, setting a breakpoint at ippInit() before your *.so files are loaded, and then step through it up to the ret? Did you try linking the prebuilt IPP shared libraries? Also, with each IPP release we provide a "custom so" sample. Did you try to build your "so" with the IPP tool?

regards, Igor

 

Stefan_B_3
Beginner

Dear Igor,

does this make any sense to you:

Breakpoint 2, 0xb773d7a0 in ippInit () from /tmp/mvIMPACT_Acquire_Build_x86/lib/libmvDeviceManager.so.2
1: x/i $pc
0xb773d7a0 <ippInit>:   push   %ebx
(gdb) si
0xb773d7a1 in ippInit () from /tmp/mvIMPACT_Acquire_Build_x86/lib/libmvDeviceManager.so.2
1: x/i $pc
0xb773d7a1 <ippInit+1>: push   %ebp
(gdb) bt
#0  0xb773d7a1 in ippInit () from /tmp/myProject/lib/libmylib.so.2
#1  0xb3dd26e0 in ippiConvert_8u16u_C1R () from /tmp/myProject/lib/libmyOtherlib.so
#2  0xb3a68033 in mv::CFltFormatConvert::Mono8ToMono16 (pSrcLayout2D=0x85a50bc, pDstLayout2D=0x85aeb0c, roiWidth=1024, roiHeight=1024, lshift=2)
    at /home/mvimpact/buildAgent/SSD/work/c37cf376f6077594/DriverBase/FilterFormatConvert.cpp:2934
// more stack data here (still correct)
#27 0xb731f4fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#28 0xb7409f5e in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) si
0xb773d7a2 in ippInit () from /tmp/myProject/lib/libmylib.so.2
1: x/i $pc
0xb773d7a2 <ippInit+2>: sub    $0x1c,%esp
(gdb) bt
#0  0xb773d7a2 in ippInit () from /tmp/myProject/lib/libmylib.so.2
#1  0xb3dd26e0 in ippiConvert_8u16u_C1R () from /tmp/myProject/lib/libmyOtherlib.so
#2  0xb3a68033 in mv::CFltFormatConvert::Mono8ToMono16 (pSrcLayout2D=0x8d004a9e, pDstLayout2D=0x918b100c, roiWidth=52516, roiHeight=1083286783, lshift=-402652998)
    at /home/mvimpact/buildAgent/SSD/work/c37cf376f6077594/DriverBase/FilterFormatConvert.cpp:2934
#3  0xdfba5800 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

Regards,

Stefan

Igor_A_Intel
Employee

Hi Stefan,

As I see from your trace, you don't call ippInit() explicitly; the first IPP function call is ippiConvert_8u16u_C1R(), which triggers the implicit initialization. It would of course be interesting to know where %esp points at

1: x/i $pc
0xb773d7a2 <ippInit+2>: sub    $0x1c,%esp
because in the previous trace we saw %esp=0x35 (according to gdb), which is impossible: it is a wrong address with wrong alignment (any push/pop must be aligned on at least a 4-byte boundary), and the previous

0xb76b62c3 <ipp_is_GenuineIntel+46>:    pop    %ecx
was successful.

I have 2 proposals:

1) Try to call ippInit() explicitly at the very beginning of your application, before any threading. Maybe this will help.

2) Set a breakpoint in ippInit() at the address where ipp_is_GenuineIntel() is called and skip this call (set pc to the instruction after the call). It would be interesting to understand where the root of this issue is.

regards, Igor

Stefan_B_3
Beginner

Hi Igor,

I did a couple of tests again:

- I can call ippInit in any of my threads and in the main thread as many times as I want. These calls work and don't change the behaviour.
- I deliberately removed all direct ippInit calls some years ago, as recommended in one of your release notes.
- I stepped through the code once more, and from what I see ippInit is called every time another IPP function gets called.
- It doesn't seem to matter which function is called. I tried ippiLShiftC_16u_C1IR and ippiConvert_8u16u_C1R.

Now what I found out:

As I said, I link statically. I have multiple *.so files which are linked with the IPP *.a files. One of these *.so files uses dlopen to load the others when the process starts. The problem occurs when a library that has been loaded by this first one calls an IPP function that is NOT ippInit. As you can see from the stack traces, even if 'libmyOtherlib.so' calls ippInit, the symbol that actually gets called is located in 'libmylib.so', which is the very first library that gets loaded. Once I call the first 'real' IPP function, this symbol gets resolved within the library calling the function, but internally ippInit then gets called in the other, initial library. Once this happens the stack gets corrupted.

I then removed all IPP-related code from the initial library and rebuilt the project. The problem is gone then!

I attached some gdb output again:

Breakpoint 1, 0xb775b740 in ippInit () from /tmp/myProject/lib/libmyLib.so.2
2: x/i $pc
0xb775b740 <ippInit>:   push   %ebx
1: x/i $pc
0xb775b740 <ippInit>:   push   %ebx
(gdb) bt
#0  0xb775b740 in ippInit () from /tmp/myProject/lib/libmyLib.so.2
#1  0xb3a83f78 in mv::CFltFormatConvert::Mono8ToMono16 (pSrcLayout2D=0x85a3f9c, pDstLayout2D=0x85ad9ec, roiWidth=1024, roiHeight=1024, lshift=2)
    at /home/mvimpact/buildAgent/SSD/work/c37cf376f6077594/DriverBase/FilterFormatConvert.cpp:2931
#2  0xb3a8489a in mv::CFltFormatConvert::Mono8ToMono16 (this=0x85ad9e8, pSrcLayout2D=0x85a3f9c, pDstLayout2D=0x85ad9ec, lshift=2)
    at /home/mvimpact/buildAgent/SSD/work/c37cf376f6077594/DriverBase/FilterFormatConvert.cpp:2981
// stack frames 1 - 2 are referring to code that resides in libmyOtherlib.so
// more stack data here (still correct)
#26 0xb733d4fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#27 0xb7427f5e in clone () from /lib/tls/i686/cmov/libc.so.6
(gdb) i r
eax            0x14     20
ecx            0x0      0
edx            0xb3012914       -1291769580
ebx            0xb42985a4       -1272347228
esp            0xb30129ac       0xb30129ac
ebp            0xb3012b08       0xb3012b08
esi            0x400    1024
edi            0x400    1024
eip            0xb775b740       0xb775b740 <ippInit>
eflags         0x200246 [ PF ZF IF ID ]
cs             0x73     115
ss             0x7b     123
ds             0x7b     123
es             0x7b     123
fs             0x0      0
gs             0x33     51
(gdb) si
0xb775b741 in ippInit () from /tmp/myProject/lib/libmyLib.so.2
2: x/i $pc
0xb775b741 <ippInit+1>: push   %ebp
1: x/i $pc
0xb775b741 <ippInit+1>: push   %ebp
(gdb) disass
Dump of assembler code for function ippInit:
0xb775b740 <ippInit+0>: push   %ebx
0xb775b741 <ippInit+1>: push   %ebp
0xb775b742 <ippInit+2>: sub    $0x1c,%esp
0xb775b745 <ippInit+5>: call   0xb775b74a <ippInit+10>
0xb775b74a <ippInit+10>:        pop    %ebx
0xb775b74b <ippInit+11>:        lea    0x446f2(%ebx),%ebx
0xb775b751 <ippInit+17>:        lea    0x8(%esp),%eax
0xb775b755 <ippInit+21>:        push   $0x0
0xb775b757 <ippInit+23>:        push   %eax
0xb775b758 <ippInit+24>:        call   0xb76ff89c <ippGetCpuFeatures@plt>
0xb775b75d <ippInit+29>:        test   %eax,%eax
0xb775b75f <ippInit+31>:        je     0xb775b77f <ippInit+63>
0xb775b761 <ippInit+33>:        mov    -0xe4(%ebx),%eax
0xb775b767 <ippInit+39>:        movq   -0x1c6bc(%ebx),%xmm0
0xb775b76f <ippInit+47>:        movq   %xmm0,(%esp)
0xb775b774 <ippInit+52>:        movl   $0x0,(%eax)
0xb775b77a <ippInit+58>:        call   0xb76ff07c <ippSetCpuFeaturesMask@plt>
0xb775b77f <ippInit+63>:        movq   0x10(%esp),%xmm0
0xb775b785 <ippInit+69>:        movq   %xmm0,(%esp)
0xb775b78a <ippInit+74>:        call   0xb770246c <ippSetCpuFeatures@plt>
0xb775b78f <ippInit+79>:        mov    %eax,%ebp
0xb775b791 <ippInit+81>:        call   0xb7702eec <ipp_is_GenuineIntel@plt>
0xb775b796 <ippInit+86>:        mov    $0x14,%edx
0xb775b79b <ippInit+91>:        test   %eax,%eax
0xb775b79d <ippInit+93>:        cmove  %edx,%ebp
0xb775b7a0 <ippInit+96>:        mov    %ebp,%eax
0xb775b7a2 <ippInit+98>:        add    $0x24,%esp
0xb775b7a5 <ippInit+101>:       pop    %ebp
0xb775b7a6 <ippInit+102>:       pop    %ebx
0xb775b7a7 <ippInit+103>:       ret    
0xb775b7a8 <ippInit+104>:       nopl   0x0(%eax,%eax,1)
0xb775b7b0 <ippInit+112>:       nopl   0x0(%eax)
0xb775b7b7 <ippInit+119>:       nopw   0x0(%eax,%eax,1)
End of assembler dump.
(gdb) si
0xb775b742 in ippInit () from /tmp/myProject/lib/libmyLib.so.2
2: x/i $pc
0xb775b742 <ippInit+2>: sub    $0x1c,%esp
1: x/i $pc
0xb775b742 <ippInit+2>: sub    $0x1c,%esp
(gdb) bt
#0  0xb775b742 in ippInit () from /tmp/myProject/lib/libmyLib.so.2
#1  0xb3a83f78 in mv::CFltFormatConvert::Mono8ToMono16 (pSrcLayout2D=Cannot access memory at address 0x1c
) at /home/mvimpact/buildAgent/SSD/work/c37cf376f6077594/DriverBase/FilterFormatConvert.cpp:2931
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)

Regards,

Stefan

 

Igor_A_Intel
Employee

Hi Stefan,

I don't understand the last lines of the gdb output, "Cannot access memory at address 0x1c" (which instruction tries to access address 0x1c?). When you use the IPP merged libraries, you use the IPP static dispatcher. It selects the most suitable code path for your CPU. The algorithm is rather simple: we have one global variable

extern int ippJumpIndexForMergedLibs;

Initially it is set to (-1). For each interface function we have a table that holds the addresses of the variants of this function optimized for different architectures:

idx     address
-1:     opt_init()
0:      opt_w7()
1:      opt_s8()
etc...

Therefore, if the library is not initialized, ippJumpIndexForMergedLibs = -1 and the dispatcher calls the function with index (-1), opt_init(). This function calls ippInit(), which sets ippJumpIndexForMergedLibs to the value that corresponds to the CPU features. Then the same function is called from the same table with this new index. Every subsequent IPP function call leads directly to the most suitable optimization of the called function, because ippJumpIndexForMergedLibs has a value >= 0 (remember: only (-1) leads to a call of ippInit()).

Something is wrong in your linkage if ippInit is called more than once: it means that you have several copies of ippCore (I don't know where; in different *.so files?). You can perform an experiment: declare extern int ippJumpIndexForMergedLibs; and set it to 0 (w7 code, which corresponds to an SSE2 CPU, the lowest we support) before the first call to any IPP function. If ippInit() is still called in your application within a call to an IPP function, then something is wrong in your linkage step.
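
The dispatch scheme described above can be sketched in a few lines of C. This is an illustration only, not IPP code: every name in it is a hypothetical stand-in (jump_index for ippJumpIndexForMergedLibs, init_sketch for ippInit(), the double_* functions for the per-architecture variants), and it assumes that the init slot re-dispatches through the table once the index has been set.

```c
/* Hypothetical stand-in for ippJumpIndexForMergedLibs; -1 = not initialised. */
int jump_index = -1;
int init_calls = 0; /* counts how often the init path runs */

typedef int (*impl_fn)(int);

int dispatch_double(int x); /* the "interface function" callers see */

/* Two pretend CPU-specific variants of the same operation. */
int double_w7(int x) { return x * 2; }
int double_s8(int x) { return x + x; }

/* Stand-in for ippInit(): a real implementation would pick the index
 * from the CPU features; the sketch simply hard-codes index 1. */
void init_sketch(void)
{
    ++init_calls;
    jump_index = 1;
}

/* Table entry for index -1: initialise, then call again through the table. */
int double_init(int x)
{
    init_sketch();
    return dispatch_double(x);
}

/* Array slot 0 holds the entry for index -1, hence the "+ 1" below. */
const impl_fn double_table[] = { double_init, double_w7, double_s8 };

int dispatch_double(int x)
{
    return double_table[jump_index + 1](x);
}
```

With a single copy of this state, the first call to dispatch_double() goes through double_init() and every later call jumps straight to the selected variant, so init_calls stays at 1. Seeing the init path run on every call therefore suggests that the code doing the dispatch and the code doing the initialisation are not looking at the same index variable.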

regards, Igor

 

Stefan_B_3
Beginner

Hi Igor,

I did the experiment you suggested but it didn't change anything. I verified that the variable was -1 before I modified it. I set it to 0 as you suggested but afterwards the very same thing happens.

So far I do not see what could be wrong with the way I link my code. You are absolutely correct: there are several *.so files which I create, and a couple of them link the static versions of the IPP. As all the *.so files use different portions of the IPP functions, I cannot see how this could be done differently (apart from linking dynamically, but then the size of what I need to ship explodes, as the IPP is really large). Apart from that, another third-party library could be doing exactly the same thing, and when used in the same process this could then never work. Please also note that this only happens with some versions of the IPP and only on 32-bit platforms. However, this could of course be a silly coincidence. I currently have problems with IPP 2019.0, but e.g. 2018.3 works just fine!

And yes: the ippInit function is called within the first loaded library that contains IPP-related code, and the IPP function that eventually triggers the endless loop is located in a different *.so. It then internally jumps to the ippInit in the initial *.so. But is there anything I can do to change this?

Regards,

Stefan

Igor_A_Intel
Employee

Hi Stefan,

If you have at least 2 *.so libs with linked-in IPP, even with different domains/functions, they all depend on the ippCore library. This means that you have at least 2 instances of Core in different *.so files. I don't know the linker logic in this case, i.e. which instance of the Core lib is accessed in which call. I have one more proposal for you: put all the IPP functionality you use into one single *.so library. We provide a special tool for this. Just create an "export def" file with all the functions you are going to use and apply the "Intel® Integrated Performance Primitives Custom Library Tool". You'll get a small dynamic library with only the functions you use, and it will contain only one instance of the ippCore library.

Regarding the previous experiment: it means that you changed ippJumpIndexForMergedLibs (the global dispatcher index) for one instance of the ippCore library, but this has no effect on the other one: it has its own global index with the same name, and as that one is not initialized (== -1), ippInit() is called. ippInit() cannot be called implicitly if the global index is >= 0.
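
To make the "two instances" point concrete, here is a small C model. It is a sketch under stated assumptions, not IPP code: struct core_copy, lib_a, lib_b and call_through are hypothetical names, and each struct merely stands for the private dispatcher state that every *.so with statically linked IPP would carry.

```c
/* One "copy" of the core state, as each *.so with statically linked
 * IPP would carry it. All names here are hypothetical. */
struct core_copy {
    int jump_index; /* -1 = not initialised */
    int init_calls; /* how often this copy ran its init path */
};

struct core_copy lib_a = { -1, 0 };
struct core_copy lib_b = { -1, 0 };

/* Stand-in for a dispatched IPP call that is bound to one particular copy. */
int call_through(struct core_copy *core, int x)
{
    if (core->jump_index < 0) { /* this copy still believes it is uninitialised */
        ++core->init_calls;
        core->jump_index = 0;
    }
    return x * 2;
}
```

Setting lib_a.jump_index = 0 by hand, as in the experiment, only affects lib_a: a later call_through(&lib_b, ...) still finds -1 in its own copy and runs the init path. A single custom library avoids this because there is then only one such state in the process.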

regards, Igor

Igor_A_Intel
Employee

Please take a look at the attached diagram. It is for the Intel64 architecture, but the logic is exactly the same as for ia32.

regards, Igor.

Stefan_B_3
Beginner

Dear Igor,

to resolve this topic: I followed your instructions and switched to building a custom library. This did indeed fix the issue!

Thanks,

Stefan
