meluzin__vojtech
Beginner

/QaxAVX,CORE-AVX2 generates incorrect code for AVX2

I'm currently fighting hard with the newest version of ICL. First, the profile-generated code crashes in Intel stuff (https://software.intel.com/en-us/forums/intel-c-compiler/topic/760787#comment-1920417), and now this:

- If I compile with /fp:fast /QxAVX2, everything is fast, but the executable is huge and I cannot make it smaller using a profile-based build (see the other post). And it runs only on AVX2 CPUs.

- If I compile with /fp:fast /QxSSE2 /QaxAVX, everything is fast, less but still

- If I compile with /fp:fast /QxSSE2 /QaxAVX,CORE-AVX2, it's superfast, actually faster than /QxAVX2 :), which is weird in itself. And, well, it doesn't work: some calculations just produce nonsense. The code base is so huge that I cannot really post a "minimum example" or anything.

- If I compile with /fp:precise /QxSSE2 /QaxAVX,CORE-AVX2, it gets superslow and huge, but works :).
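For context on why /fp:fast and /fp:precise can legitimately give different numbers: /fp:fast allows the compiler to reorder floating-point arithmetic, and floating-point addition is not associative. The sketch below is plain C and not taken from the project in question; the function names are made up, and `volatile` is only there to pin the evaluation order. Differences of this kind are normally small, so they would not by themselves explain outright nonsense results.

```c
#include <stdio.h>

/* Hypothetical helper: evaluates (a + b) + c in exactly that order.
   The volatile temporary stops the compiler from reassociating. */
double sum_left(double a, double b, double c) {
    volatile double t = a + b;
    return t + c;
}

/* Hypothetical helper: same inputs, evaluated as a + (b + c). */
double sum_right(double a, double b, double c) {
    volatile double t = b + c;
    return a + t;
}

/* Prints both orderings so the difference is visible. */
void show_reassociation(void) {
    double a = 1e30, b = -1e30, c = 1.0;
    printf("(a + b) + c = %g\n", sum_left(a, b, c));
    printf("a + (b + c) = %g\n", sum_right(a, b, c));
}
```

With these inputs the first ordering keeps the 1.0 and the second loses it, because 1.0 is far below the rounding granularity of 1e30.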

It's pretty obvious that some optimization breaks things. And since having an alternative code path for AVX2 is faster than compiling the whole thing directly for AVX2 (albeit not working correctly), something is not working as it should. For the record, it's audio processing and contains lots of vectorizable loops for cross-multiplication of buffers etc.
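One way to narrow a problem like this down without posting the whole code base is to pull a single suspect loop into its own file and compare its output against a known-good reference. This is a rough sketch only: `multiply_buffers` is a stand-in for a buffer-multiplication loop, not the actual code, and `first_mismatch` and the tolerance are assumptions. The reference values could come from the same loop built with /fp:precise, and the kernel under test from the failing /Qax build.

```c
#include <stddef.h>

/* Hypothetical stand-in for one vectorizable loop:
   element-wise product of two audio buffers. */
void multiply_buffers(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] * b[i];
}

/* Absolute value without pulling in the math library. */
static float abs_f(float x) {
    return x < 0.0f ? -x : x;
}

/* Returns the index of the first element where got differs from
   expected by more than tol (relative to the larger of 1 and
   |expected|), or -1 if all elements agree. A NaN in got is reported
   as a mismatch because the comparison below is false for NaN. */
long first_mismatch(const float *got, const float *expected,
                    size_t n, float tol) {
    for (size_t i = 0; i < n; ++i) {
        float diff = abs_f(got[i] - expected[i]);
        float mag = abs_f(expected[i]);
        float scale = mag > 1.0f ? mag : 1.0f;
        if (!(diff <= tol * scale))
            return (long)i;
    }
    return -1;
}
```

If a loop checked this way disagrees with its reference only under the failing flags, that file plus a small driver is already close to a reproducer.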

1 Reply
Viet_H_Intel
Moderator

Hi,

It's difficult for us to investigate the problem without a reproducer. Will you be able to provide us with one?

Regards,

Viet