Intel® oneAPI DPC++/C++ Compiler
Talk to fellow users of Intel® oneAPI DPC++/C++ Compiler and companion tools like Intel® oneAPI DPC++ Library, Intel® DPC++ Compatibility Tool, and Intel® Distribution for GDB*

Compiler error when optimising loop

ChristopherBell

I'm using the 2023 oneAPI C++ compiler to rebuild some existing software that we previously built with the 2019 Intel compiler. I am seeing an error in a loop which "expands" an array of 4-byte integers into 8-byte integers in the same storage space.

Qualitatively it works like this:

  • Declare an array long enough to hold N 8-byte (long long) integers
  • Read N 4-byte (int) integers into the bottom half of this
  • Working backwards from the end of the array, copy each 4-byte integer into the appropriate 8-byte slot by array index.

The coding that matters could be written:

long long *buff_8;                   // Dynamically allocated space to hold N long longs
int       *buff_4 = (int *)buff_8;   // Interpret this space as 4-byte integers

... populate buff_4[] with an array of N 4-byte integer values

for (i = N-1; i >= 0; i--) buff_8[i] = buff_4[i];     // Copy the numbers to long long format

I have attached two very simple files which show the problem:

  • main.c sets up, populates and tests the storage
  • sub.c performs the I4 => I8 copy

To reproduce this, create a console C project from these two files (nothing fancy required), use a Release build to turn on optimisation (I used /O2), and turn off interprocedural optimisation.


Working on Windows (in VS2022), if I perform the copy in a separate function in a separate file and set interprocedural optimisation to multi-file (/Qipo), this works fine. If I turn it off, it fails and I get zeros. Interestingly, with the Intel 2019 compilers the opposite is true: it fails with /Qipo but works without.

I also tried on Linux, where I got different behaviour: the Intel 2019 compiler failed when I wrote this as a single function, but worked if I put a #pragma novector before the copy loop. The oneAPI compiler has, so far, worked OK.

 

We see this problem in the 2024 compiler as well; we haven't tried 2025.

 

We can work around the problem by isolating that function and using #pragma optimize ("", off) to stop it being optimised, so it is not a show-stopper. But I think it is quite a serious compiler bug.

2 Replies
yzh_intel
Moderator

Hi, icpx defaults to the O2 optimization level and assumes that code does not violate the strict aliasing rule. In your code, buff_8 and buff_4 point to the same address, so the same memory is accessed as both long long and int, which violates this rule. The issue can be avoided with the option "-fno-strict-aliasing".
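
If changing the source is an option, the same in-place expansion can also be written without accessing the buffer through two incompatible pointer types. The following is only a minimal sketch, not the code from the attached sub.c: the function name expand_i4_to_i8 and its parameters are hypothetical, and it assumes the N int values sit at the start of a buffer large enough for N long longs. It goes through memcpy, which copies bytes and is permitted to alias any object type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): widen n 4-byte ints
   stored at the start of buf into n 8-byte long longs in the same storage.
   All accesses go through memcpy, so no int* / long long* aliasing occurs. */
void expand_i4_to_i8(void *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    unsigned char *bytes = buf;

    /* Walk backwards so a widened value never overwrites an int
       that has not been read yet. */
    for (size_t i = n; i-- > 0; ) {
        int narrow;
        memcpy(&narrow, bytes + i * sizeof narrow, sizeof narrow);

        long long wide = narrow;  /* value-preserving, sign-extending conversion */
        memcpy(bytes + i * sizeof wide, &wide, sizeof wide);
    }
}
```

After the call, the buffer can be read as an array of long long in the usual way. Optimising compilers commonly turn small fixed-size memcpy calls like these into plain loads and stores, though that is not guaranteed.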

ChristopherBell
Thanks a bunch for that. I had never heard of strict aliasing before; I just assumed that memory contents were mine to interpret as I pleased.

I have loads of mixed float and int arrays as well, which I see contravene that rule. I'm going to need that compiler flag!
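
For the mixed float and int case, one common alternative to the flag is to copy the bytes rather than cast the pointer. This is a rough sketch under assumptions: the helper names float_bits and bits_to_float are hypothetical, and it assumes float is 32 bits wide, as it is on the usual x86 targets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bit pattern of a float
   without reading it through an incompatible pointer type. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);  /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: the inverse, rebuilding a float from its bit pattern. */
float bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);  /* assumption: 32-bit float */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Code written this way does not depend on -fno-strict-aliasing, so it should behave the same whether or not the flag is set.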