KNN
Beginner
95 Views

Strange Data Appears in YMM2 During Interrupt

Hello,

I've not seen this documented anywhere, but when an interrupt fires in an application I'm developing and the AVX registers are manually moved onto the stack with an instruction like vmovdqu in the interrupt service routine, mysterious data appears in YMM2, overwriting mine. It looks like an address that points somewhere into my loaded file** is stored in the upper packed quadword (this is running bare-metal on an i7-7700HQ), and a stray "6" sits in the lowest packed quadword. Does anyone know what this data is?

I'm specifically referring to the line pointed to by the red arrow in the attached image. This happens both under Microsoft's Hyper-V and when running directly on the CPU with no hypervisor. I know there isn't an issue with my code because YMM0-1 and YMM3-7 all contain the same data, and I would have expected the same of YMM2. Instead, whatever this is overwrites the value I put in YMM2.

(And yes, I do know the correct way to save AVX registers for interrupts is via XSAVE, and I am in the process of changing the code over to use XSAVE. I was just wondering if anyone knows what is going on here. It also does this with XMM2 when using SSE instead of AVX.)

**In the memory map at the top of the screenshot, the EfiLoaderData region contains my application file.

EDIT: This is the C code I'm using to trigger the interrupt--note that there is nothing in between the asm statements and the forced division error. The divide-by-zero interrupt handler just pushes the general-purpose registers on top of the interrupt frame, subtracts the size of the AVX register area from the stack pointer, and uses vmovdqu to move the AVX registers into that stack memory:

  __m256i_u whaty = _mm256_set1_epi32(0x17);
  asm volatile("vmovdqu %[what], %%ymm1" : : [what] "m" (whaty) :);
  asm volatile("vmovdqu %[what], %%ymm2" : : [what] "m" (whaty) :); // Odd behavior with YMM2. The rest are fine.
  asm volatile("vmovdqu %[what], %%ymm3" : : [what] "m" (whaty) :); 
  asm volatile("vmovdqu %[what], %%ymm4" : : [what] "m" (whaty) :); 
  asm volatile("vmovdqu %[what], %%ymm5" : : [what] "m" (whaty) :);
  asm volatile("vmovdqu %[what], %%ymm6" : : [what] "m" (whaty) :);
  asm volatile("vmovdqu %[what], %%ymm7" : : [what] "m" (whaty) :);

  volatile uint64_t c = cs / (cs >> 10); // cs is just a value that will guarantee a divide by zero error


So I haven't made any progress figuring out what exactly those values were, but I did discover that the quadword order in the printed output was reversed: the lower quadword in the YMM2.PNG image should actually be the uppermost quadword, and vice versa.

That said, I have verified that there is no such issue when using XSAVE (*phew*), as seen in the attached image. The YMM registers were filled like this:

  __m256i_u whaty = _mm256_set1_epi32(0x17181920);
  __m256i_u what2 = _mm256_set1_epi64x(0x1718192011223344);
  __m256i_u what3 = _mm256_set1_epi32(0x18);
  __m256i_u what9 = _mm256_set1_epi32(0x180019);

  asm volatile("vmovdqu %[what], %%ymm1" : : [what] "m" (whaty) :);
  asm volatile("vmovdqu %[what], %%ymm2" : : [what] "m" (what2) :); 
  asm volatile("vmovdqu %[what], %%ymm3" : : [what] "m" (what3) :); 
  asm volatile("vmovdqu %[what], %%ymm4" : : [what] "m" (whaty) :); 
  asm volatile("vmovdqu %[what], %%ymm5" : : [what] "m" (whaty) :);
  asm volatile("vmovdqu %[what], %%ymm6" : : [what] "m" (whaty) :);
  asm volatile("vmovdqu %[what], %%ymm7" : : [what] "m" (whaty) :);

  asm volatile("vmovdqu %[what], %%ymm15" : : [what] "m" (what9) :);

  volatile __m256i output = _mm256_bsrli_epi128(what2, 1); // To verify the quadword order is correct

  volatile uint64_t c = cs / (cs >> 10); // cs is just a value that will guarantee a divide by zero error

This should probably be filed as a bug against Clang/LLVM, since using __attribute__((interrupt)) currently causes Clang to emit an interrupt handler that spills the AVX registers onto the stack with vmovaps instead of using XSAVE...
