Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

/QaxAVX,CORE-AVX2 generates incorrect code for AVX2

meluzin__vojtech
Beginner

I'm currently fighting hard with the newest version of ICL. First, the profile-generated code crashes inside Intel code (https://software.intel.com/en-us/forums/intel-c-compiler/topic/760787#comment-1920417), and now this:

- If I compile with /fp:fast /QxAVX2, everything is fast, but the executable is huge and I cannot make it smaller using a profile-based build (see the other post). It also runs only on AVX2 CPUs.

- If I compile with /fp:fast /QxSSE2 /QaxAVX, everything is fast, less but still

- If I compile with /fp:fast /QxSSE2 /QaxAVX,CORE-AVX2, it's superfast, actually faster than /QxAVX2 :), which is weird in itself. And it doesn't work: some calculations just produce nonsense. The code base is so huge that I cannot really post a "minimum example" or anything.

- If I compile with /fp:precise /QxSSE2 /QaxAVX,CORE-AVX2, it gets superslow and huge, but it works :).

It's pretty obvious that some optimization breaks things. And since having an alternative code path for AVX2 is faster than compiling the whole thing directly for AVX2 (albeit not working correctly), something is not working as it should. For the record, it's audio processing and contains lots of vectorizable loops for cross-multiplication of buffers etc.
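
As general background rather than a diagnosis: /fp:fast lets the compiler reorder floating-point arithmetic, and a vectorized loop typically accumulates in a different order than the scalar source. The sketch below is purely illustrative (the function names are made up and nothing here comes from the actual code base). It shows that reordering alone changes a float result; it would not by itself account for outright nonsense values.

```cpp
// Hypothetical illustration: the same three floats summed in two orders.
// Built without fast-math options, the two functions return different values.

// Sum left to right: (a + b) + c.
float sum_left(float a, float b, float c) {
    // volatile forces the intermediate to be stored as a float, so the
    // example does not depend on extended-precision registers.
    volatile float t = a + b;
    return t + c;
}

// Sum right to left: a + (b + c).
float sum_right(float a, float b, float c) {
    volatile float t = b + c;
    return a + t;
}
```

With a = 1.0e8f, b = -1.0e8f and c = 1.0f, the first order keeps the 1.0f, while the second loses it because -1.0e8f + 1.0f rounds back to -1.0e8f in single precision.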

Viet_H_Intel
Moderator

Hi,

It's difficult for us to investigate the problem without a reproducer. Would you be able to provide us with one?
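
A reproducer does not need to be the full application. As a rough sketch only (the kernel name `multiply_accumulate`, the test signal, and the tolerance are all hypothetical and not taken from your code), a single loop can be pulled into a standalone file and checked against a double-precision reference:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one of the application's loops:
// multiply two buffers element-wise and accumulate the products.
float multiply_accumulate(const float* a, const float* b, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// Same computation carried out in double precision, used as the reference.
double multiply_accumulate_ref(const float* a, const float* b, std::size_t n) {
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += static_cast<double>(a[i]) * static_cast<double>(b[i]);
    }
    return acc;
}

// Returns true when the kernel agrees with the reference within rel_tol,
// measured relative to max(1, |reference|).
bool kernel_matches_reference(std::size_t n, double rel_tol) {
    std::vector<float> a(n), b(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Deterministic test signal (an assumption, not real audio data).
        a[i] = std::sin(0.01f * static_cast<float>(i));
        b[i] = std::cos(0.02f * static_cast<float>(i));
    }
    const double got = multiply_accumulate(a.data(), b.data(), n);
    const double ref = multiply_accumulate_ref(a.data(), b.data(), n);
    const double scale = std::fmax(1.0, std::fabs(ref));
    return std::fabs(got - ref) <= rel_tol * scale;
}
```

Calling something like `kernel_matches_reference(4096, 1e-3)` from `main` and building the file with each of your option sets may show which combination first diverges. A kernel that fails this kind of check is usually small enough to post.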

Regards,

Viet
