Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Illegal instruction from custom 64 bit DLL

daven-hughes
Beginner
4,244 Views

Hi,

I built a custom 64 bit dll with an export.def for the function exports.

The dll code is directly from Intel code samples for building custom IPP dlls. I use ippStaticInit(), not ippStaticInitCPU(id) .. so there should not be a problem there. 

My system is i5 2500k, Windows 7, "x64 based PC"

The crash is on the vxorps instruction on the first call to ippsZero_32f

e9_ippsZero_32f:
[...]
000007FEE52284F6  jg          e9_ippsZero_32f+1Fh (7FEE52284FFh) 
000007FEE52284F8  call        e9_ownsZero_8u_E9 (7FEE52565C0h)

e9_ownsZero_8u_E9:
000007FEE52565C0 push rsi
000007FEE52565C1 push rdi
000007FEE52565C2 mov rdi,rcx
000007FEE52565C5 mov rsi,rdx
000007FEE52565C8 mov rax,rdi
000007FEE52565CB movsxd rsi,esi
000007FEE52565CE vxorps ymm0,ymm0,ymm0 ; illegal instruction 
000007FEE52565D2 xor rdx,rdx
000007FEE52565D5 cmp rsi,100h

Seems like this is something to do with AVX, but why would that be illegal and what should I do?

0 Kudos
29 Replies
Igor_A_Intel
Employee
3,054 Views

Hi,

Could you run ippCpuInfo (available in the IPP samples, which include pre-built executables) and post its output here? 2nd generation Core supports AVX, so maybe something is wrong with OS support.

regards, Igor

0 Kudos
daven-hughes
Beginner
3,054 Views

****************
The decoded data
****************

==================
Signature
Stepping ID 7
Model 10
Model + Ext. 42
Family 6
Family + Ext. 6
Type 0

BrandName
=================================================
Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz
=================================================

==============================================================
IPP would recommend using cpu_p8(y8) code for this processor

================
Feature Flags
================
Cores 4 - Number of cores per physical package
CMP / HTT 1 - Multi-Cores and/or Multi-Threading

MOVBE 0 - MOVBE instruction. For the first time in Atom(TM)
MMX 1 - Intel(R) Architecture MMX(TM) technology is supported
SSE 1 - Streaming SIMD Extensions is supported
SSE2 1 - Streaming SIMD Extensions 2 is supported
SSE3 1 - Streaming SIMD Extensions 3 is supported
SSSE3 1 - Supplemental Streaming SIMD Extensions 3 is supported
SSE41 1 - Streaming SIMD Extensions 4 (SSE4.1) is supported
SSE42 1 - Streaming SIMD Extensions 4 (SSE4.2) is supported
STTNI 0 - STTNI Instructions
EM64T 1 - Intel(R) Extended Memory 64 Technology is supported
AVX 1 - CPU supports Intel(R) Advanced Vector Extensions instruction set
AVX_OS 0 - OS supports Intel(R) AVX
AES 1 - AES instruction is supported
CLMUL 1 - PCLMULQDQ instruction is supported

So Windows 7 doesn't support it... :(

This will be a very common customer issue, so is there an easy way to prevent AVX instructions? Maybe I should init the dll with an older cpu id if I see AVX_OS = 0?
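
For reference, the AVX_OS line above reflects a check that can be sketched in a few lines of C. This is a minimal illustration, not IPP's own code: the helper name avx_usable is hypothetical, and the caller is assumed to have already read CPUID leaf 1 ECX and XCR0 (for example via the compiler's cpuid and xgetbv intrinsics). The bit positions are the architectural ones: ECX bit 28 = AVX, ECX bit 27 = OSXSAVE, XCR0 bits 1 and 2 = XMM and YMM state enabled by the OS.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits (architectural values). */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE / XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX            */

/* XCR0 bits 1 and 2: the OS preserves XMM and YMM state. */
#define XCR0_XMM_YMM_STATE 0x6u

/* Hypothetical helper: returns 1 only when AVX is safe to execute,
 * i.e. the CPU has it AND the OS saves the YMM registers.
 * XCR0 may only be read (XGETBV) when OSXSAVE is set, so a caller
 * should pass 0 for xcr0 when that bit is clear. */
static int avx_usable(uint32_t cpuid1_ecx, uint64_t xcr0)
{
    if ((cpuid1_ecx & CPUID1_ECX_AVX) == 0)
        return 0;   /* no AVX in hardware */
    if ((cpuid1_ecx & CPUID1_ECX_OSXSAVE) == 0)
        return 0;   /* OS never enabled XSAVE, so no YMM state handling */
    return (xcr0 & XCR0_XMM_YMM_STATE) == XCR0_XMM_YMM_STATE;
}
```

With the output above (AVX 1, AVX_OS 0), a check of this kind returns 0, which is the case where falling back to a non-AVX code path avoids the illegal instruction.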

Thanks 

0 Kudos
daven-hughes
Beginner
3,054 Views

What's interesting is why it selects cpu_e9 code by default when ippCpuInfo says "IPP would recommend using cpu_p8(y8) code for this processor".

0 Kudos
Sergey_K_Intel
Employee
3,054 Views

You probably need to install SP1 for your Windows 7?

Regards,
Sergey 

0 Kudos
daven-hughes
Beginner
3,054 Views

I understand, that would work for me the developer, but what can I do to support an unpatched Windows 7?

I'm not saying I need AVX for unpatched Win 7, just an option that doesn't cause illegal instructions.

Thanks

0 Kudos
Thomas_Jensen1
Beginner
3,054 Views

While waiting for a new IPP that selects y8 instead of e9, you'd have to call ippGetCpuFeatures() and then ippInitCpu() with your selected CPU, where the selected CPU is the highest one below AVX if the OS does not support AVX.

Here is some of my code that selects an IPP cpu depending on features (32-bit case):

    lib_enum lib;
    Ipp64u pFeaturesMask;
    Ipp32u pCpuidInfoRegs[4];
    IppStatus status;

    status= ippInit();                    // init local ippCore
    if( status == ippStsNoErr )
        status= ippGetCpuFeatures( &pFeaturesMask, pCpuidInfoRegs );
    if( status != ippStsNoErr )            // error getting features
        lib= LIB_W7;                    // lowest supported is W7 = SSE2
    else if( (pFeaturesMask & (Ipp64u)(ippCPUID_AVX2)) &&  (pFeaturesMask & (Ipp64u)(ippAVX_ENABLEDBYOS)) )
        lib= LIB_H9;                    // AVX2
    else if( (pFeaturesMask & (Ipp64u)(ippCPUID_AVX)) &&  (pFeaturesMask & (Ipp64u)(ippAVX_ENABLEDBYOS)) )
        lib= LIB_G9;                    // AVX
    else if( pFeaturesMask & (Ipp64u)(ippCPUID_SSE42) )
        lib= LIB_P8;                    // SSE42
    else if( pFeaturesMask & (Ipp64u)(ippCPUID_SSSE3) ) {
        if( pFeaturesMask & (Ipp64u)(ippCpuBonnell) )
          lib= LIB_S8;                    // SSSE3 Atom optimized
        else
          lib= LIB_V8;                    // SSSE3
    } else
        lib= LIB_W7;
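
The same decision for the 64-bit case discussed in this thread could be sketched as below. This is only an illustration under stated assumptions: the FEAT_* bits are stand-ins defined locally (they are not the ippCPUID_* values from ippdefs.h), pick_lib64 is a hypothetical helper, and only the 64-bit prefixes named in this thread (e9 = AVX, y8 = SSE4.1, m7 = SSE3) are covered.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in feature bits for illustration only.
 * These are NOT the ippCPUID_* values from ippdefs.h. */
#define FEAT_SSE3    (1u << 0)
#define FEAT_SSE41   (1u << 1)
#define FEAT_AVX     (1u << 2)  /* CPU implements AVX       */
#define FEAT_AVX_OS  (1u << 3)  /* OS preserves YMM state   */

/* Hypothetical helper: returns the 64-bit code prefix to prefer,
 * or NULL when none of the tiers in this sketch applies. */
static const char *pick_lib64(uint32_t feats)
{
    /* AVX code is chosen only when BOTH the CPU and the OS support it. */
    if ((feats & FEAT_AVX) && (feats & FEAT_AVX_OS))
        return "e9";
    if (feats & FEAT_SSE41)
        return "y8";
    if (feats & FEAT_SSE3)
        return "m7";
    return NULL;
}
```

For an AVX-capable CPU on an OS without AVX support (the situation in this thread), the mask has FEAT_AVX set but FEAT_AVX_OS clear, so the sketch falls through to "y8". Whether the library then honors that choice depends on the IPP version in use.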

0 Kudos
daven-hughes
Beginner
3,054 Views

Thanks for that Thomas.

0 Kudos
Chao_Y_Intel
Moderator
3,054 Views

Hello, 

Which version of IPP are you using now? The ippInit() function is supposed to check both the OS support and the supported CPU features.

Regards
Chao 

0 Kudos
daven-hughes
Beginner
3,054 Views

7.0.205

0 Kudos
SergeyKostrov
Valued Contributor II
3,054 Views

>>...7.0.205

I have that version of IPP library and I could verify ippsZero_32f function on Ivy Bridge ( i7 ). Let me know if that test case looks right as a reproducer:

    #include "ipps.h"

    int main( void )
    {
        Ipp32f fData[ 256 ];

        IppStatus st = ::ippsZero_32f( &fData[0], 256 );

        return ( int )1;
    }

0 Kudos
SergeyKostrov
Valued Contributor II
3,054 Views

>>>>...7.0.205
>>
>>I have that version of IPP library and I could verify ippsZero_32f function on Ivy Bridge ( i7 ).

Daven,

There are two pieces of news:

A good one: I didn't have any issues or problems on an Ivy Bridge system with IPP version 7.1.
A not so good one: Unfortunately, I don't have a set of 64-bit IPP DLLs for version 7.0.205.

Here are all the results of my verification:

    // Verification for DSP domain DLL ( AVX / e9 ) is needed
    /*
      List of IPP DLLs used:

      24/09/2012 11:25 PM   144,864 ippcore-7.1.dll
      24/09/2012 11:25 PM   240,608 ipps-7.1.dll
      25/09/2012 01:21 AM 5,499,360 ippse9-7.1.dll
    */
    #include "stdio.h"
    #include "ipps.h"

    int main( void )
    {
        Ipp32f fData[ 256 ];

        printf( "Test Started\n" );
        IppStatus st = ::ippsZero_32f( &fData[0], 256 );
        printf( "Test Completed\n" );

        return ( int )1;
    }

[ Output ]

    Test Started
    Test Completed

Let me know if you have any questions.

0 Kudos
SergeyKostrov
Valued Contributor II
3,054 Views
Here are some additional technical details: Dell Precision Mobile M4700 Intel Core i7-3840QM ( Ivy Bridge / 4 cores / 8 logical CPUs / ark.intel.com/compare/70846 ) and the test case is attached.
0 Kudos
daven-hughes
Beginner
3,054 Views

Sergey Kostrov wrote:
>>...7.0.205

I have that version of IPP library and I could verify ippsZero_32f function on Ivy Bridge ( i7 ). Let me know if that test case looks right as a reproducer:

#include "ipps.h"

int main( void )
{
Ipp32f fData[ 256 ];

IppStatus st = ::ippsZero_32f( &fData[0], 256 );

return ( int )1;
}

Yes, that reproduced the illegal instruction error. 

0 Kudos
SergeyKostrov
Valued Contributor II
3,054 Views

>>...Yes, that reproduced the illegal instruction error...

Use MsInfo32.exe and post the complete information about your OS.

0 Kudos
SergeyKostrov
Valued Contributor II
3,054 Views

This is a short follow-up: I'd like to note that the ippsZero_xxx functions are not in the list of IPP functions optimized to benefit from Haswell's new instructions. Take a look at: http://software.intel.com/en-us/articles/haswell-support-in-intel-ipp

0 Kudos
Igor_A_Intel
Employee
3,054 Views

Sergey,

this list is not fully precise: it contains only functions that have hand-developed optimizations. It doesn't take into account functions that make nested calls to hand-optimized functions (for example, convolution uses ippsZero, etc.). One more thing: the whole library is built with icc/icl with the corresponding optimization switch, so new instructions can be inserted by the compiler in ANY function.

regards, Igor

0 Kudos
SergeyKostrov
Valued Contributor II
3,054 Views

>>...this list is not fully precise - this list contains only functions that have got hand-developed optimization. It doesn't take
>>into account functions that have nested calls to hand-optimized functions (for example convolution uses ippzero, etc.)...

Thanks for the information. It would be nice to have a comment in the article about this. Please consider it a feature request ( of some kind ).

0 Kudos
daven-hughes
Beginner
3,054 Views

Either way, surely the disassembly shows that ymm* registers are being used, and to my knowledge they are AVX registers. 

I did more tests:

ippInit(), ippInitCpu(ippCpuSSE42),  and ippInitCpu(ippCpuSSE41) choose the e9_ippsZero_32f code, and crash with the illegal instruction error
ippInitCpu(ippCpuSSE3) chooses the m7_ippsZero_32f code and doesn't crash

I should repeat this is only for 64 bit; 32 bit seems to choose the right code with just ippInit().

So, according to the ippCpuInfo app, I should be selecting cpu_y8 code for my condition (AVX cpu but no AVX os), though this isn't an option from the above tests. 

I guess the only thing to do is update the IPP license...

0 Kudos
SergeyKostrov
Valued Contributor II
2,921 Views

>>...ippInitCpu( ippCpuSSE3 ) chooses the m7_ippsZero_32f code and doesn't crash

...
m7 - Optimized for processors with Intel SSE3
...
y8 - Optimized for 64-bit applications on processors with Intel SSE4.1
...

>>...So, according to the ippCpuInfo app, I should be selecting cpu_y8 code for my condition (AVX cpu but no AVX os)...

Using the highest possible Intel instruction set is the right decision ( as a workaround ) in your situation.

0 Kudos