Intel® oneAPI DPC++/C++ Compiler

I'm getting an error with an avx2 instruction

Alittle_
Beginner

ENV:

OS: Ubuntu 20.04

CPU: 12th Gen Intel(R) Core(TM) i9-12900K

GCC: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

 

I wrote the following code:

#include <immintrin.h>
#include <iostream>

int main() {
    float test[8] = {18.0, 17.0, 16.0, 15.0, 14.0, 13.0, 12.0, 1.0};

    // __m256 a = _mm256_load_ps(test);
    __m256 a = _mm256_loadu_ps(test);
    std::cout << "load finish" << std::endl;
    __m256 result = _mm256_max_ps(a, a);

    float temp[8] = {0, 0, 0, 0, 0, 0, 0, 0};
    std::cout << "==" << std::endl;
    _mm256_store_ps(temp, result);
    for (int t = 0; t < 8; ++t) {
        std::cout << temp[t] << std::endl;
    }
    std::cout << "======" << std::endl;
    return 0;
}

 

 

With this code, the program reports an error at _mm256_store_ps. Why is that? If I replace _mm256_store_ps with _mm256_storeu_ps, it works. It also works if I keep _mm256_store_ps and add alignas(32) in front of float temp[8]. However, when I print the address of temp, it is the same as the address in the unaligned case.


 

 

I debugged with gdb and found that the exception is thrown when executing vmovaps %ymm0, (%rax). The contents of ymm0 have already been copied to the rbp register, so why should rax be copied to ymm0 at this point, and why does it crash?
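
A likely explanation, offered as a sketch rather than a confirmed diagnosis: _mm256_store_ps requires its destination to be 32-byte aligned, and vmovaps is the aligned-store instruction it compiles to. In AT&T syntax, vmovaps %ymm0, (%rax) writes the 32 bytes of ymm0 to the memory address held in rax, so it faults when that address is not a multiple of 32. A plain float temp[8] is only guaranteed the alignment of float, so whether it lands on a 32-byte boundary depends on the stack layout the compiler happens to choose. The helper names below (is_aligned, demo_alignment) are hypothetical and only illustrate how to check an address:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of any library: true when the address in p
// is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Illustrative check (the name is an assumption): shows what alignas(32)
// guarantees and what a shifted pointer loses.
inline void demo_alignment() {
    alignas(32) float temp[16] = {};

    // alignas(32) guarantees a 32-byte boundary, which is what the aligned
    // store _mm256_store_ps needs.
    assert(is_aligned(temp, 32));

    // One float (4 bytes) further, the address is no longer a multiple of 32,
    // so only the unaligned store _mm256_storeu_ps would be safe here.
    assert(!is_aligned(temp + 1, 32));

    // Eight floats (32 bytes) further, the address is on a boundary again.
    assert(is_aligned(temp + 8, 32));
}
```

Printing reinterpret_cast<std::uintptr_t>(temp) % 32 in the original program, once with and once without alignas(32), would show whether the two addresses really are the same.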

Alex_Y_Intel
Moderator

Why are you reporting an issue with the GCC compiler in the Intel compiler forum?

P.S. If you use the Intel compiler, there is no such issue.

Alittle_
Beginner

Okay, I got it
Maybe I'm asking the wrong question
