IPP causes invalid opcode exception at h9_ippsFFTGetSize_C_32fc

BFalk · ‎06-25-2013

We are using IPP version 7.1.1.119 on a 4th generation (Haswell) Core i7 processor under the INtime (5) operating system.

We are using static linkage (#include <ipp_h9.h> before #include <ipp.h>).

A call to ippsFFTInitAlloc_C_32fc causes an invalid opcode exception. This occurs inside the h9_ippsFFTGetSize_C_32fc function when trying to execute the les esp,edx instruction.

Note: When configuring IPP for AVX rather than AVX2 (using ipp_g9.h instead of ipp_h9.h), everything works correctly. It so happens that g9_ippsFFTGetSize_C_32fc does not contain that les instruction.

We verified that our processor supports AVX2 (ran the piece of code suggested by Intel for checking this).
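For readers following along, a check of this kind usually has two parts: the CPUID feature bits (what the processor offers) and XCR0 (what the operating system has enabled). The sketch below shows only the bit tests. The function names are hypothetical and it is not the code from Intel's document; it takes the raw register values as parameters and assumes the caller obtains them with a CPUID intrinsic and XGETBV.

```c
#include <assert.h>
#include <stdint.h>

/* Return bit n of v (n must be below 64). */
static int bit(uint64_t v, unsigned n)
{
    assert(n < 64);
    return (int)((v >> n) & 1u);
}

/*
 * Inputs are raw register values supplied by the caller (assumption):
 *   leaf1_ecx : ECX returned by CPUID leaf 1
 *   leaf7_ebx : EBX returned by CPUID leaf 7, sub-leaf 0
 *   xcr0      : value of XCR0 (XGETBV with ECX = 0); only meaningful
 *               when the OSXSAVE bit in leaf1_ecx is set
 */
int avx_usable(uint32_t leaf1_ecx, uint64_t xcr0)
{
    int osxsave   = bit(leaf1_ecx, 27);        /* OS has enabled XSAVE/XGETBV */
    int avx       = bit(leaf1_ecx, 28);        /* processor reports AVX */
    int ymm_state = (xcr0 & 0x6u) == 0x6u;     /* OS saves XMM and YMM state */
    return osxsave && avx && ymm_state;
}

int avx2_usable(uint32_t leaf1_ecx, uint32_t leaf7_ebx, uint64_t xcr0)
{
    /* AVX2 is CPUID.(EAX=7,ECX=0):EBX bit 5 and builds on usable AVX. */
    return avx_usable(leaf1_ecx, xcr0) && bit(leaf7_ebx, 5);
}
```

If the XCR0 part fails while the CPUID bits are set, the processor has the feature but the operating system has not enabled the YMM register state.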

Please advise.

Thanks,

Beni Falk

BFalk · ‎06-25-2013

http://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions says:

"The new instructions are encoded using what Intel calls a VEX prefix, which is a two- or three-byte prefix designed to clean up the complexity of current and future x86/x64 instruction encoding. The two new VEX prefixes are formed from two obsolete 32-bit instructions-Load Pointer Using DS (LDS-0xC4, 3-byte form) and Load Pointer Using ES (LES-0xC5, two-byte form)-which load the DS and ES segment registers in 32-bit mode. In 64-bit mode, opcodes LDS and LES generate an invalid-opcode exception, but under Intel® AVX, these opcodes are repurposed for encoding new instruction prefixes. As a result, the VEX instructions can only be used when running in 64-bit mode. The prefixes allow encoding more registers than previous x86 instructions and are required for accessing the new 256-bit SIMD registers or using the three- and four-operand syntax. As a user, you do not need to worry about this (unless you're writing assemblers or disassemblers)."

I have the following questions:

1. I am currently compiling and running my code in 32-bit mode. Does it mean that I cannot profitably use IPP on AVX and AVX2?

2. As I wrote, when I configured IPP to use AVX (rather than AVX2) the problem did not occur and everything seemed to work correctly. Given the above statement on Intel's site, how could it work? Or does IPP somehow switch the processor to 64-bit mode before performing the operation and switch it back afterwards? Please excuse me in advance if this is a dumb question.

Thanks,

Beni Falk

bronxzv · ‎06-25-2013

Beni F. wrote:
As a result, the VEX instructions can only be used when running in 64-bit mode.

this paper is wrong about that (*), you can use AVX and AVX2 in both 32-bit and 64-bit modes, it's working in front of me as I type this text

* I signaled it here http://software.intel.com/en-us/forums/topic/279901 18 months ago, but for some reason it's still not fixed

Bernard · ‎06-25-2013

It is very strange that this error was not corrected.

Bernard · ‎06-25-2013

As bronxzv said, you can use both the AVX and AVX2 instruction sets in protected mode and in long mode. Bear in mind that in 64-bit mode you have 8 additional YMM registers and 8 more 64-bit general-purpose registers at your disposal.

BFalk · ‎06-25-2013

My problem is that I am using AVX2 via IPP (rather than via manually crafted assembly code) and IPP crashes (at least while I am working in 32-bit mode).

Is there a way to work around this issue? Does Intel plan to issue a fix for IPP to address it, or does IPP in AVX2 mode mandate 64-bit mode (now and forever)?

Note: our problem occurred when trying to use FFT functions in IPP. I presume that some AVX2 instructions are available in 32-bit mode and some (the ones using VEX prefix) aren't. It is also logical to suppose that there are performance benefits to using some VEX instructions in conjunction with FFT (or else IPP wouldn't use them). Is it such a significant performance boost that Intel would not support using IPP FFT functions in 32-bit mode?

In my opinion the best approach would be for Intel to support both kinds of usage. Just my two cents.

Bernard · ‎06-26-2013

Can you somehow identify that instruction? Maybe with the help of a debugger.

bronxzv · ‎06-26-2013

Beni F. wrote:
I presume that some AVX2 instructions are available in 32-bit mode and some (the ones using VEX prefix) aren't.

this is an erroneous assumption, as already explained the paper at your link is plain wrong about that and unfortunately, as you prove it here, very confusing for newcomers to AVX

btw what you describe looks much like a potential bug in IPP, I'd suggest reporting it on the dedicated forum

BFalk · ‎06-26-2013

@iliyapolak: as I wrote in my original post, the debugger shows the offending instruction as: les esp,edx

@bronxzv:

1. I also approached TenAsys (the vendor of the INtime operating system) with my problem and they wrote to me that AVX2 instructions that use the VEX prefix cannot execute in 32-bit mode. Seems that I am not the only one who got confused.

2. If the VEX instructions can in fact execute in 32-bit mode, why do I get an invalid opcode exception when hitting such an instruction in h9_ippsFFTGetSize_C_32fc?

3. If, as you say, this is a bug in IPP, where can I report it?

Thanks,

Beni Falk

bronxzv · ‎06-26-2013

Beni F. wrote:
3. If, as you say, this is a bug in IPP, where can I report it?

I said that it looks like a potential bug, you can report it here: http://software.intel.com/en-us/forums/intel-integrated-performance-primitives

BFalk · ‎06-26-2013

OK, thanks.

Bernard · ‎06-26-2013

>>>@iliyapolak: as I wrote in my original post, the debugger shows the offending instruction as: les esp,edx>>>

Sorry, I had not seen that.

Bernard · ‎06-26-2013

>>>les esp,edx>>>

AFAIK the les instruction was used to load far pointers.

BFalk · ‎06-26-2013

iliyapolak - please see my post second from the top of this thread. I quoted there from an Intel site where they explain the VEX prefix instructions.

bronxzv · ‎06-26-2013

Beni F. wrote:
@iliyapolak: as I wrote in my original post, the debugger shows the offending instruction as: les esp,edx

are you sure that your debugger has proper support for AVX2 instructions? it may be a legitimate crash due to an AVX2 instruction (for example an instruction that your CPU doesn't support, case in point TSX instructions on K series CPUs) but the debugger is wrongly reporting it as a legacy LES?

EDIT: can you tell us the value of the byte right after the leading 0xc5 of this "LES" ?
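For context on why that next byte matters: in 32-bit mode the bytes 0xC4 and 0xC5 are shared between the legacy LES/LDS instructions and the VEX prefixes. A legacy LES/LDS needs a memory operand, so the top two bits of the following byte (the ModRM mod field) can never both be 1; when both are 1, the processor treats the sequence as a VEX prefix. A disassembly such as `les esp,edx`, with a register as the source, is therefore a hint that the debugger is misreading a VEX-encoded instruction. The sketch below illustrates the rule; the type and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { LEGACY_LES_LDS, VEX_PREFIX } C4C5Kind;

/*
 * Classify a 0xC4/0xC5 byte the way a 32-bit-mode decoder does.
 *   first : the byte itself (must be 0xC4 or 0xC5)
 *   next  : the byte that follows it
 * In 64-bit mode LES/LDS do not exist, so no such test is needed there.
 */
C4C5Kind classify_c4_c5_32bit(uint8_t first, uint8_t next)
{
    assert(first == 0xC4 || first == 0xC5);
    (void)first;
    /* Top two bits both set: not a valid legacy ModRM, so it is VEX. */
    return ((next & 0xC0) == 0xC0) ? VEX_PREFIX : LEGACY_LES_LDS;
}
```

For example, `classify_c4_c5_32bit(0xC4, 0x06)` is a legacy LES with a memory operand, while any second byte of 0xC0 or above selects the VEX interpretation.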

Bernard · ‎06-26-2013

Yes, I have read your post. I cannot tell whether the invalid opcode exception was raised by the processor when it decoded a genuine les esp,edx sequence, or while it decoded some AVX instruction whose VEX prefix shares the hex value of les. In the first case the compiler could be responsible for the fault.

Bernard · ‎06-26-2013

>>>are you sure that your debugger supports AVX2 instructions?>>>

Interesting question. IIRC the invalid opcode vector is 0x6, and the CPU should prepare a trap frame in which it saves the address of the faulting instruction.

Itzhak_B_ · ‎06-26-2013

bronxzv wrote:

EDIT: can you tell us the value of the byte right after the leading 0xc5 of this "LES" ?

iliyapolak wrote:

Yes, I have read your post. I cannot tell whether the invalid opcode exception was raised by the processor when it decoded a genuine les esp,edx sequence, or while it decoded some AVX instruction whose VEX prefix shares the hex value of les. In the first case the compiler could be responsible for the fault.

This is a screen capture taken when the exception occurred.

Bernard · ‎06-27-2013

Itzhak B. wrote:

This is a screen capture taken when the exception occurred.

Sorry, I cannot see any attached screenshot.

BFalk · ‎06-27-2013

We are using Microsoft Visual Studio 2008 SP1. It is very likely not aware of the VEX instructions.

Instruction stream bytes where the invalid opcode exception occurred are the following: c4 e2 51 f7 d0 8d 0c d5 47 00 00 00 ... (I don't know where this instruction ends).

About the possibility of our CPU not supporting AVX2 instructions: we ran the code given in section 2.2.3 of https://software.intel.com/sites/default/files/319433-014.pdf and it ran successfully.

Thanks,

Beni

bronxzv · ‎06-27-2013

Beni F. wrote:
Instruction stream bytes where the invalid opcode exception occurred are the following: c4 e2 51 f7 d0 8d 0c d5 47 00 00 00 ... (I don't know where this instruction ends).

thanks, this looks like a 3-byte prefix VEX encoded instruction, I'll try to understand which one it is

btw I see that LES opcode is 0xc4, not 0xc5 as mentioned in the paper, one more error in this damned paper
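To help with identifying the instruction, here is a minimal sketch that splits a 3-byte VEX prefix into its fields, following the layout documented in the Intel SDM (the R, X, B and vvvv fields are stored inverted). The struct and function names are hypothetical, and the sketch only extracts fields; it does not look up the mnemonic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned r, x, b;   /* register-extension bits, already un-inverted */
    unsigned map;       /* opcode map: 1 = 0F, 2 = 0F 38, 3 = 0F 3A */
    unsigned w;         /* VEX.W */
    unsigned vvvv;      /* extra source register number, already un-inverted */
    unsigned l;         /* vector length bit */
    unsigned pp;        /* implied prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2 */
    unsigned opcode;    /* opcode byte after the prefix */
    unsigned modrm;     /* ModRM byte after the opcode */
} Vex3;

/* p must point at five or more bytes starting with 0xC4. */
Vex3 decode_vex3(const uint8_t *p)
{
    Vex3 v;
    assert(p != NULL && p[0] == 0xC4);
    v.r      = ((p[1] >> 7) & 1u) ^ 1u;
    v.x      = ((p[1] >> 6) & 1u) ^ 1u;
    v.b      = ((p[1] >> 5) & 1u) ^ 1u;
    v.map    = p[1] & 0x1Fu;
    v.w      = (p[2] >> 7) & 1u;
    v.vvvv   = (~((unsigned)p[2] >> 3)) & 0xFu;
    v.l      = (p[2] >> 2) & 1u;
    v.pp     = p[2] & 3u;
    v.opcode = p[3];
    v.modrm  = p[4];
    return v;
}
```

Applied to the posted bytes c4 e2 51 f7 d0, this gives map 0F 38, implied prefix 66, W = 0, L = 0, vvvv = 5, opcode F7 and ModRM D0 (mod = 11, reg = 2, rm = 0). By the opcode tables that combination appears to correspond to SHLX, a BMI2 instruction operating on general-purpose registers (roughly shlx edx, eax, ebp), with the following 8d 0c d5 47 00 00 00 being an ordinary LEA. This reading should be verified against the Intel SDM before drawing conclusions from it.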