Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Illegal instruction from custom 64 bit DLL

daven-hughes
Beginner
2,083 Views

Hi,

I built a custom 64 bit dll with an export.def for the function exports.

The DLL code comes directly from the Intel code samples for building custom IPP DLLs. I use ippStaticInit(), not ippStaticInitCPU(id), so there should not be a problem there.

My system is i5 2500k, Windows 7, "x64 based PC"

The crash is on the vxorps instruction, on the first call to ippsZero_32f:

e9_ippsZero_32f:
[...]
000007FEE52284F6  jg          e9_ippsZero_32f+1Fh (7FEE52284FFh) 
000007FEE52284F8  call        e9_ownsZero_8u_E9 (7FEE52565C0h)

e9_ownsZero_8u_E9:
000007FEE52565C0 push rsi
000007FEE52565C1 push rdi
000007FEE52565C2 mov rdi,rcx
000007FEE52565C5 mov rsi,rdx
000007FEE52565C8 mov rax,rdi
000007FEE52565CB movsxd rsi,esi
000007FEE52565CE vxorps ymm0,ymm0,ymm0 ; illegal instruction 
000007FEE52565D2 xor rdx,rdx
000007FEE52565D5 cmp rsi,100h

Seems like this is something to do with AVX, but why would that be illegal and what should I do?

29 Replies
Igor_A_Intel
Employee
728 Views

Could you attach your dll + reproducer so that we can understand what is wrong and how you managed to bypass the OS-support check for AVX? The IPP dispatcher checks both the AVX bit from CPUID and that AVX is supported by the OS, and it dispatches AVX code only if both conditions are true.

regards, Igor

daven-hughes
Beginner
728 Views

Attached the dll+reproducer. The dll code itself is doing nothing fancy:

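#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <ipp.h>

BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
    /* These parameters are unused; reference them to silence warnings. */
    (void)hinstDLL;
    (void)lpvReserved;

    switch (fdwReason)
    {
    case DLL_PROCESS_ATTACH:
        /* Select the CPU-specific IPP code once, when the DLL is loaded. */
        if (ippInit() != ippStsNoErr) return FALSE;
        break; /* the original fell through to default here */

    default:
        break;
    }

    return TRUE;
}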

SergeyKostrov
Valued Contributor II
728 Views
>>...Attached the dll+reproducer...

This is simply to let you know that the test application crashed on my Ivy Bridge system (Intel Core i7-3840QM, 2.80 GHz, Ivy Bridge, 4 cores, 8 logical CPUs; ark.intel.com/compare/70846).
SergeyKostrov
Valued Contributor II
728 Views
>>...
>>IppStatus st = ::ippsZero_32f( &fData[0], 256 );
>>...

Are there more IPP functions with similar problems, that is, with AVX related crashes?
Igor_A_Intel
Employee
728 Views

Hi, I tried this code on a machine with SP1:

TID0: INS 0x000007fee7801348             BASE     or rdi, rax                          | rdi = 0x1ff, rflags = 0x206
TID0: Read 0x306c1 = *(UINT32*)00000000002FEE30
TID0: INS 0x000007fee780134b             BASE     mov r13d, dword ptr [rsp+0x20]       | r13 = 0x306c1
TID0: INS 0x000007fee7801350             BASE     cmp edx, 0x18000000                  | rflags = 0x246
TID0: INS 0x000007fee7801356             BASE     jnz 0x7fee7801363
TID0: INS 0x000007fee7801358             BASE     call 0x7fee7b30126                   | rsp = 0x2fee08
TID0: Write *(UINT64*)00000000002FEE08 = 0x7fee780135d
TID0: INS 0x000007fee7b30126             BASE     push rbx                             | rsp = 0x2fee00
TID0: Write *(UINT64*)00000000002FEE00 = 0x1
TID0: INS 0x000007fee7b30127             BASE     mov eax, 0x1                         | rax = 0x1
TID0: INS 0x000007fee7b3012c             BASE     cpuid                                | rax = 0x306c1, rbx = 0x1100800, rcx = 0x7ffaf3ff, rdx = 0xbfebfbff
TID0: INS 0x000007fee7b3012e             BASE     xor eax, eax                         | rax = 0, rflags = 0x246
TID0: INS 0x000007fee7b30130             BASE     and ecx, 0x18000000                  | rcx = 0x18000000, rflags = 0x206
TID0: INS 0x000007fee7b30136             BASE     cmp ecx, 0x18000000                  | rflags = 0x246
TID0: INS 0x000007fee7b3013c             BASE     jnz 0x7fee7b30154
TID0: INS 0x000007fee7b3013e             BASE     xor ecx, ecx                         | rcx = 0, rflags = 0x246
TID0: INS 0x000007fee7b30140             XSAVE    xgetbv                               | rdx = 0, rax = 0x7
TID0: INS 0x000007fee7b30143             BASE     mov ecx, eax                         | rcx = 0x7
TID0: INS 0x000007fee7b30145             BASE     xor eax, eax                         | rax = 0, rflags = 0x246
TID0: INS 0x000007fee7b30147             BASE     and ecx, 0x6                         | rcx = 0x6, rflags = 0x206
TID0: INS 0x000007fee7b3014a             BASE     cmp ecx, 0x6                         | rflags = 0x246
TID0: INS 0x000007fee7b3014d             BASE     jnz 0x7fee7b30154
TID0: INS 0x000007fee7b3014f             BASE     mov eax, 0x1                         | rax = 0x1
TID0: Read 0x1 = *(UINT64*)00000000002FEE00
TID0: INS 0x000007fee7b30154             BASE     pop rbx                              | rbx = 0x1, rsp = 0x2fee08
TID0: Read 0x7fee780135d = *(UINT64*)00000000002FEE08
TID0: INS 0x000007fee7b30155             BASE     ret                                  | rsp = 0x2fee10
TID0: INS 0x000007fee780135d             BASE     shl eax, 0x9                         | rax = 0x200, rflags = 0x206
TID0: INS 0x000007fee7801360             BASE     or rdi, rax                          | rdi = 0x3ff, rflags = 0x206

The xgetbv instruction in the trace shows that the OS is checked for AVX support. I'm currently waiting for an AVX machine with Windows 7 and without SP1, so I'll post an update on my findings after that.
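
For readers following the trace, the two comparisons it performs can be sketched as a small pure function. This is only an illustration of the masks visible in the trace (0x18000000 on the ECX result of CPUID leaf 1, and 0x6 on the low half of XCR0 returned by xgetbv); the function name and parameters are hypothetical, and it is not the IPP dispatcher's actual source.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX: bit 27 = OSXSAVE, bit 28 = AVX (mask seen in the trace). */
#define CPUID1_ECX_OSXSAVE_AVX 0x18000000u
/* XCR0: bit 1 = SSE state, bit 2 = AVX state (mask seen in the trace). */
#define XCR0_SSE_AVX_STATE 0x6u

/* Hypothetical helper: returns 1 if AVX code may be dispatched, else 0.
 * cpuid1_ecx is the ECX value from CPUID leaf 1; xcr0_low is the EAX
 * value from xgetbv with ECX = 0. */
int avx_usable(uint32_t cpuid1_ecx, uint32_t xcr0_low)
{
    /* The CPU must report both OSXSAVE and AVX. */
    if ((cpuid1_ecx & CPUID1_ECX_OSXSAVE_AVX) != CPUID1_ECX_OSXSAVE_AVX)
        return 0;
    /* The OS must have enabled saving of both XMM and YMM state. */
    return (xcr0_low & XCR0_SSE_AVX_STATE) == XCR0_SSE_AVX_STATE;
}
```

With the values from the trace (ECX = 0x7ffaf3ff, XCR0 low = 0x7) both conditions hold, which matches the trace setting eax to 1 before returning.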

regards, Igor

SergeyKostrov
Valued Contributor II
728 Views
>>>>...
>>>>IppStatus st = ::ippsZero_32f( &fData[0], 256 );
>>>>...
>>
>>Are there more IPP functions with similar problems, that is, with AVX related crashes?

If the ippsZero_32f function is used in production software, then I would suggest a workaround based on a call to the CRT function memset.
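
A minimal sketch of such a memset-based workaround might look like the following. The helper name and its return convention are hypothetical (it does not return an IppStatus), and it assumes IEEE 754 floats, where all-zero bytes represent 0.0f.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ippsZero_32f: zeroes len floats at dst.
 * Returns 0 on success, -1 on a null pointer or non-positive length. */
int zero_32f_fallback(float *dst, int len)
{
    if (dst == NULL || len <= 0)
        return -1;
    /* All-zero bytes are 0.0f for IEEE 754 single precision. */
    memset(dst, 0, (size_t)len * sizeof(float));
    return 0;
}
```

A call such as zero_32f_fallback(&fData[0], 256) would then take the place of the ippsZero_32f call quoted above.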
Igor_A_Intel
Employee
728 Views

Hi daven-hughes,

The bug is on your side. I've investigated the exe and dll you provided: you took the ipps library from one IPP 7.0 update and the ippcore library from another. This is the main issue, because they are incompatible from the dispatching point of view. The initial version of 7.0 didn't have the w7/m7 code/libraries, so it supported only 5 CPU-specific libraries, while in the later 7.0.x updates the w7/m7 code was restored, which means 6 CPU-specific libraries. In your case the ippcore function ippInit detects the correct set of supported features and that your OS doesn't support AVX, so it dispatches the CPU one level below AVX (index=4), but for the "old" ipps library this index corresponds to the last CPU, that is, to the AVX code. You can easily check this by calling ippGetLibVersion (for ippcore) and ippsGetLibVersion (for ippSP): these versions MUST be the same.
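
The index mismatch described here can be illustrated with a toy model of two dispatch tables. The labels and the position of the w7/m7 entry are assumptions made only for illustration; the real table layout inside IPP is not shown in this thread. Only the table sizes (5 and 6) and the meaning of index 4 come from the explanation above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 5-entry table of the "old" ipps library. */
static const char *const OLD_TABLE[] = {
    "generic", "sse-a", "sse-b", "pre-AVX", "AVX"
};
/* Hypothetical 6-entry table assumed by the newer ippcore. */
static const char *const NEW_TABLE[] = {
    "generic", "w7/m7", "sse-a", "sse-b", "pre-AVX", "AVX"
};

/* Returns the code path that an index selects in the old table, or NULL. */
const char *old_table_entry(size_t index)
{
    size_t n = sizeof(OLD_TABLE) / sizeof(OLD_TABLE[0]);
    return index < n ? OLD_TABLE[index] : NULL;
}

/* Returns the code path that an index selects in the new table, or NULL. */
const char *new_table_entry(size_t index)
{
    size_t n = sizeof(NEW_TABLE) / sizeof(NEW_TABLE[0]);
    return index < n ? NEW_TABLE[index] : NULL;
}
```

In this model, index 4 means the level below AVX to the component that computed it, but selects the AVX entry when applied to the shorter table.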

regards, Igor

daven-hughes
Beginner
728 Views

My mistake then, thanks Igor and everyone else for your help. 

daven-hughes
Beginner
728 Views

Well, that said, I checked programmatically using ippsGetLibVersion() / ippGetLibVersion() that both ippcore_l.lib and ipps_l.lib were from build 7.0.205.40, which makes sense because I have only installed the 64-bit libs once.
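
A comparison like this can be sketched as below. The LibVer struct is a hypothetical stand-in for the numeric fields of the structures returned by ippGetLibVersion() and ippsGetLibVersion(), and splitting 7.0.205.40 into four integers this way is an assumption. Such a check only compares the two libraries it queries, so it says nothing about any other library that also ends up in the link.

```c
/* Hypothetical mirror of the numeric version fields of an IPP library. */
typedef struct {
    int maj;         /* e.g. 7 */
    int min;         /* e.g. 0 */
    int major_build; /* e.g. 205 */
    int build;       /* e.g. 40 */
} LibVer;

/* Returns 1 if both versions are identical in every field, else 0. */
int same_version(const LibVer *a, const LibVer *b)
{
    return a->maj == b->maj
        && a->min == b->min
        && a->major_build == b->major_build
        && a->build == b->build;
}
```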

Ahh! I just got it: somehow I had IPP 6's ippsemergedem64t.lib linked as well. God knows why; maybe because I thought a couple of functions had been removed from v7.0. Changing the link order so that ipps_l.lib came first fixed it.
