oneAPI C++ compiler 2023.2.0 causes crash due to misaligned memory access (test case attached)

mborn · 09-20-2023

Hello,

This is a bug report for the 2023.2.0 C++ compiler on MS-Windows.

In the attached test case, code is generated which uses the MOVAPS instruction, but applies it to addresses which are not aligned to a 16-byte boundary:

This leads to a general-protection exception (#GP):

(The error message is not very helpful, as no attempt was made to access address 0xFFFFFFFFFFFFFFFF.)

Best Regards,
Mathias

NoorjahanSk_Intel · 09-25-2023

Hi,

Thanks for posting in Intel Communities.

Could you please try adding the line below to your struct Klass?

This constructor will initialize a and b members correctly.

>> Klass() : a(0), b(0) {}

We were also able to observe the exception, but we resolved it by adding the above line to the struct.

Our Environment Details:

Visual studio 2022 v17.6.2

oneAPI C++ compiler 2023.2

Windows 11.

Please refer to the below screenshot for more details:

If this does not resolve your issue, please let us know how you are checking that the addresses are not aligned to a 16-byte boundary.

Thanks & Regards,

Noorjahan.

mborn · 09-25-2023

Thanks for looking at this.

However, I disagree with your reasoning. C++ doesn't mandate that I add a constructor. In my test case, the problem is triggered by

matrix.fill({});

which doesn't require "matrix" to be initialized before.
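As a rough illustration of that point, the sketch below fills a default-initialized array with value-initialized elements. The `Klass` definition is a hypothetical stand-in (the real one is only in the attached test case), and the helper names `clear` and `all_zero` are invented for this example.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the type in the attached test case; the real
// definition is not shown in this thread. Members a and b are assumed to
// have type F.
template <typename F>
struct Klass {
    F a;
    F b;
};

// Overwrites every element with a value-initialized Klass (a == 0, b == 0).
// fill() only writes to the array; the previous contents are never read,
// so the array does not have to be initialized beforehand.
template <typename F>
void clear(std::array<Klass<F>, 16>& matrix) {
    matrix.fill({});
}

// Returns true if every member of every element compares equal to zero.
template <typename F>
bool all_zero(const std::array<Klass<F>, 16>& matrix) {
    for (const auto& k : matrix) {
        if (k.a != 0 || k.b != 0) {
            return false;
        }
    }
    return true;
}
```

Calling `clear` on a freshly declared `std::array<Klass<long double>, 16>` is well-defined C++ without any user-provided constructor, since `Klass` is an aggregate and `{}` zero-initializes its members.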

I can see that the addresses are not aligned properly by looking at their values in the debugger. You can see that in my first screenshot. The RSP register contained a value that was 16-byte aligned (not shown), thus RSP+0x28 is not.
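The arithmetic behind that observation can be sketched as follows. The stack pointer value is an arbitrary example (the actual RSP value is not shown in the thread), and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// True if `address` is a multiple of 16, which is what MOVAPS requires of
// a 128-bit memory operand.
constexpr bool is_aligned_16(std::uintptr_t address) {
    return (address & 0xF) == 0;
}

// Same check for a pointer obtained at run time.
inline bool pointer_is_aligned_16(const void* p) {
    return is_aligned_16(reinterpret_cast<std::uintptr_t>(p));
}

// Example value only: any 16-byte-aligned stack pointer behaves the same.
constexpr std::uintptr_t example_rsp = 0x1FF700;

static_assert(is_aligned_16(example_rsp), "base is 16-byte aligned");
// 0x28 == 40 == 2 * 16 + 8, so the sum is 8 bytes past a 16-byte boundary.
static_assert(!is_aligned_16(example_rsp + 0x28), "RSP+0x28 is misaligned");
static_assert((example_rsp + 0x28) % 16 == 8, "remainder is 8");
```

Any offset whose low four bits are non-zero (such as 0x28) breaks 16-byte alignment when added to an aligned base, so the operand address alone is enough to spot the problem in a debugger.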

If the compiler chooses to emit instructions that require 16-byte alignment then it is the compiler's responsibility to ensure this condition is met. This burden must not be put on the compiler user.

When I do add the constructor, the compiler falls back to x87 instructions, which is what makes it work again:

Thus I maintain this is a compiler bug.

Best Regards,
Mathias

NoorjahanSk_Intel · 10-04-2023

Hi,

Thanks for providing the details.

We are also able to reproduce the issue at our end. We have reported this issue to the concerned development team. They are looking into your issue.

Thanks & Regards,

Noorjahan.

NoorjahanSk_Intel · 10-24-2023

Hi,

Thanks for your patience.

The code provided by you is using 80 bit precision long double functionality which is not supported by ICX compiler.

Microsoft STL with recent language standards (i.e. C++20 and newer) does not support fp80.

Please refer to below links for more details:

https://www.intel.com/content/www/us/en/developer/articles/technical/known-limitations-with-qlong-double-on-windows.html

https://learn.microsoft.com/en-us/cpp/build/x64-software-conventions?view=msvc-170#scalar-types

https://www.intel.com/content/dam/develop/external/us/en/documents/oneapi_dpcpp_cpp_compiler.pdf

Intel intends to support the latest 128-bit float standards. You can use double instead of long double with /Qlong-double to make the code work.

Thanks & Regards,

Noorjahan.

mborn · 10-24-2023

Dear @NoorjahanSk_Intel,

I am not convinced. You say

"The code provided by you is using 80 bit precision long double functionality which is not supported by ICX compiler."

but then provide a link to the "Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference" which says on page 286:

"Windows OS:
/Qlong-double

...

Description
This option changes the default size of the long double data type to 80 bits.
However, the alignment requirement of the data type is 16 bytes, and its size must be a multiple of its
alignment, so the size of a long double on Windows* is also 16 bytes. Only the lower 10 bytes (80 bits) of
the 16 byte space will have valid data stored in it."
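One way to see which representation a given build actually uses is to query the type at compile time. This is a small sketch with invented helper names; it assumes nothing about the compiler beyond standard C++, and the values it reports depend on the platform and options in use.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Size, alignment and mantissa width of long double for the current build.
struct LongDoubleLayout {
    std::size_t size;
    std::size_t alignment;
    int mantissa_bits;
};

constexpr LongDoubleLayout long_double_layout() {
    return {sizeof(long double), alignof(long double),
            std::numeric_limits<long double>::digits};
}

// The x87 80-bit extended format has a 64-bit significand; a long double
// that is just an alias for double reports 53 instead.
constexpr bool long_double_is_x87_extended() {
    return std::numeric_limits<long double>::digits == 64;
}
```

With the option described in the quoted passage, one would expect a size and alignment of 16 and 64 mantissa bits; without it, MSVC-compatible builds treat long double as a 64-bit double.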

The 80bit extended precision support is the only reason I'm even using the Intel compiler. The moment this gets dropped, I'm out. Intel-CPUs can't do 128bit floats in hardware, so that will be terribly slow.

I'm aware of the known limitations, they don't affect me. So far, 80bit long double has worked perfectly well, up until compiler version 2023.2, which introduced the very bug which this conversation is all about. It isn't even related to floating point math, just to how data is moved.

Best Regards,
Mathias

NoorjahanSk_Intel · 11-08-2023

Hi,

Apologies for the confusion.

We are working on your issue. We will get back to you soon.

Meanwhile you can explicitly set alignment for std::array<Klass<F>, 16> matrix; as a workaround.

alignas(16) std::array<Klass<F>, 16> matrix;

alignas(16) std::array<Klass<F>, 4> rhs;
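A minimal sketch of this workaround in context is shown below. The `Klass` definition is a hypothetical stand-in for the one in the attached test case, and `workaround_is_aligned` is a helper invented here to check the resulting addresses.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the type in the attached test case.
template <typename F>
struct Klass {
    F a;
    F b;
};

// Declares both arrays with an explicit 16-byte alignment, fills them, and
// reports whether their storage really starts on a 16-byte boundary.
template <typename F>
bool workaround_is_aligned() {
    alignas(16) std::array<Klass<F>, 16> matrix;
    alignas(16) std::array<Klass<F>, 4> rhs;

    matrix.fill({});
    rhs.fill({});

    const auto m = reinterpret_cast<std::uintptr_t>(matrix.data());
    const auto r = reinterpret_cast<std::uintptr_t>(rhs.data());
    return m % 16 == 0 && r % 16 == 0;
}
```

Note that `alignas(16)` only guarantees the alignment of the array object itself; the elements inside stay 16-byte aligned only if `sizeof(Klass<F>)` is a multiple of 16.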

Thanks & Regards,

Noorjahan.

Alex_Y_Intel · 01-31-2024

Hi, @mborn, thank you for posting your question here.

Have you tried the most recent version 2024.0 compiler to see if the problem still exists?
Can you also please post the exact commands that you used to see the issue?

mborn · 01-31-2024

No, I have not tried it yet. 2024.1 is the earliest version I'll try, because this is supposed to fix yet another bug (different post).

You guys are amazing. I do all the work, investigate the bugs caused by your compiler, produce a minimal test case, then have to waste time on pointless discussions right here (see responses to my original post!), and now you want me to tell you whether you have managed to fix your own bugs?

You can find all files needed to run the test case attached to my original post.

Best Regards,
Mathias
