I'm using the 2023 OneAPI C++ compiler to rebuild some existing software that we have previously built with the 2019 Intel compiler. I am seeing an error in a loop which "expands" an array of 4 byte integers to become 8 byte integers in the same storage space.
Qualitatively it works like this:
- Declare an array long enough to hold N 8 byte (long long) integers
- Read N 4 byte (int) integers into the bottom half of this
- Working backwards from the end of the array, copy each 4 byte integer into the corresponding 8 byte slot by array index.
The coding that matters could be written:
long long *buff_8; // Dynamically allocated space to hold N long longs
int *buff_4 = (int *)buff_8; // Interpret this space as 4 byte integers
// ... populate buff_4[] with an array of N 4 byte integer values ...
for(i=N-1; i>=0; i--) buff_8[i] = buff_4[i]; // Copy the numbers to long long format
I have attached two very simple files which show the problem:
- main.c sets up, populates and tests the storage
- sub.c performs the I4 => I8 copy
To reproduce this, just create a console C project from these two files (nothing fancy required), use a Release build to turn on optimisation (I used /O2), and turn off interprocedural optimisation.
Working on Windows (in VS2022), if I perform the copy in a separate function in a separate file and set interprocedural optimisation to multi-file (/Qipo), this works fine; if I turn it off, it fails and I get zeros. Interestingly, with the Intel 2019 compilers the opposite is true: it fails with /Qipo but works without.
I also tried on Linux, where I got different behaviour: the Intel 2019 compiler failed when I wrote this as a single function, but worked if I put a #pragma novector before the copy loop. The OneAPI compiler has, so far, worked OK.
We see this problem in the 2024 compiler as well; we haven't tried 2025.
We can solve the problem by isolating that function and using #pragma optimize ("", off) to stop it being optimised, so it is not a show-stopper. But I think it is quite a serious compiler bug.
Hi, icpx defaults to the O2 optimization level, and it assumes that code doesn't violate the strict aliasing rule. In your code, buff_8 and buff_4 point to the same address, so the same memory is accessed as both long long and int, which violates this rule. This issue can be avoided with the option "-fno-strict-aliasing".
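If changing the source is an option, one way to keep the in-place expansion without relying on the flag is to move the values with memcpy instead of reading and writing through two differently typed pointers. The following is only a sketch: widen_i4_to_i8 is a hypothetical name, not the function from the attached sub.c, and it assumes the storage really is at least n * sizeof(long long) bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the code from the attached sub.c).
   Widens n ints stored at the start of `storage` into n long longs
   occupying the same storage. Assumes `storage` holds at least
   n * sizeof(long long) bytes. */
void widen_i4_to_i8(void *storage, size_t n)
{
    unsigned char *bytes = storage;
    assert(storage != NULL || n == 0);

    /* Work backwards so that a widened value never overwrites
       an int that has not been read yet. */
    for (size_t i = n; i-- > 0; ) {
        int narrow;
        long long wide;

        /* Read the i-th int as raw bytes into a properly typed local. */
        memcpy(&narrow, bytes + i * sizeof narrow, sizeof narrow);
        wide = narrow; /* ordinary sign-extending conversion */

        /* Write the i-th long long back as raw bytes. */
        memcpy(bytes + i * sizeof wide, &wide, sizeof wide);
    }
}
```

Because every access to the shared storage goes through memcpy on bytes, there is no int/long long pointer pair for the optimiser to treat as non-overlapping. Compilers generally turn these small fixed-size memcpy calls into plain loads and stores, though that is worth checking for your own build.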
I have loads of mixed float and int arrays as well, which I see contravene that rule. I'm going to need that compiler flag!
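For mixed float and int arrays, the same memcpy approach can be applied one element at a time. This is a sketch under assumptions: it supposes the mixed data lives in an array of 32-bit integer slots, some of which hold float bit patterns, and store_float / load_float are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for an int32_t array in which some slots
   actually hold float values. Assumes float is 4 bytes wide. */

/* Store a float's bit pattern into an integer slot. */
void store_float(int32_t *slot, float value)
{
    assert(sizeof(float) == sizeof(int32_t));
    memcpy(slot, &value, sizeof value);
}

/* Read a float back out of an integer slot. */
float load_float(const int32_t *slot)
{
    float value;
    assert(sizeof(float) == sizeof(int32_t));
    memcpy(&value, slot, sizeof value);
    return value;
}
```

This avoids casting an int pointer to a float pointer (or the reverse) and dereferencing it, which is the pattern the strict aliasing rule forbids. With a lot of existing code, the compiler flag is likely the quicker fix; helpers like these are only an alternative for code that can be touched.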
