Calling convention (codegen) bug when returning packed struct

Hello

I recently came across what appears to be a code generation bug. The following code is a minimal reproducer:

intel_test.hpp:

#pragma once

#include <cstdint>

#pragma pack(push, 1)
struct packed_struct {
    std::uint32_t uint_val;
    std::uint8_t byte_val;
};
#pragma pack(pop)

packed_struct create_packed_struct(std::uint8_t byte_val);

intel_test.cpp:

#include "intel_test.hpp"

packed_struct create_packed_struct(std::uint8_t byte_val) {
    packed_struct result;
    result.uint_val = 0xbaadf00d;
    result.byte_val = byte_val;
    return result;
}

main.cpp:

#include <iostream>

#include "intel_test.hpp"

int main() {
    const auto packed_struct_val = create_packed_struct(0);
    std::cout << "packed struct: uint_val = " << std::hex << packed_struct_val.uint_val << ", byte_val = " << int{packed_struct_val.byte_val} << std::endl;
}

This code should result in the output:
"packed struct: uint_val = baadf00d, byte_val = 0"
which is what happens when compiling with Visual Studio 2015 or Intel Compiler 15 in release mode or Intel Compiler 17 in debug mode.

However, when compiling with Intel Compiler 17 in release mode, I get the following output:
"packed struct: uint_val = 3fb636a4, byte_val = 1".

When I viewed the disassembly I found that create_packed_struct() returns the whole struct packed into the RAX register:

mov         rax,0BAADF00Dh  
movzx       r8d,dl  
shl         r8,20h  
or          rax,r8  
ret  

while the calling code expects the result to be written to memory on the stack pointed to by the RCX register:

xor         edx,edx  
lea         rcx,[rbp+10h]  
call        create_packed_struct (013F4B1000h)  
mov         dl,byte ptr [rbp+14h]  
mov         eax,dword ptr [rbp+10h]  
mov         byte ptr [rbp+24h],dl  

The callee never writes to that memory, and the caller overwrites RAX right after create_packed_struct() returns, so the result is always whatever garbage was previously at RBP+10h.
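To make the mismatch concrete, here is a small sketch of what the release-mode callee leaves in RAX and what the caller would see if it decoded that register instead of reading the stack slot. The helper names value_left_in_rax and decode_from_rax are made up for illustration, and the decoding assumes a little-endian target such as x64.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct packed_struct {
    std::uint32_t uint_val;
    std::uint8_t byte_val;
};
#pragma pack(pop)

static_assert(sizeof(packed_struct) == 5, "packing should give a 5-byte struct");

// Hypothetical helper: mirrors the arithmetic in the callee's disassembly.
std::uint64_t value_left_in_rax(std::uint8_t byte_val) {
    std::uint64_t rax = 0xBAADF00Du;                    // mov rax,0BAADF00Dh
    rax |= static_cast<std::uint64_t>(byte_val) << 32;  // shl r8,20h ; or rax,r8
    return rax;
}

// Hypothetical helper: reinterprets the low 5 bytes of RAX as the struct.
// Assumes little-endian byte order, so bytes 0..3 hold uint_val and byte 4 holds byte_val.
packed_struct decode_from_rax(std::uint64_t rax) {
    assert((rax >> 40) == 0);  // the callee leaves the upper 24 bits clear
    packed_struct result;
    std::memcpy(&result, &rax, sizeof result);
    return result;
}
```

Calling decode_from_rax(value_left_in_rax(0)) recovers uint_val = 0xbaadf00d and byte_val = 0, which is the value the caller discards when it reads RBP+10h instead.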

Further testing showed that removing the "#pragma pack" directives fixes the problem (the caller correctly reads the result from RAX).
Adding the __regcall calling convention specifier to the declaration of create_packed_struct() also fixes the problem (the caller correctly reads the result from RAX).
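A possible explanation for why the pragma matters, sketched under my reading of the Microsoft x64 calling convention documentation: a user-defined type is returned by value in RAX only if its size is 1, 2, 4, or 8 bytes (plus some POD-like requirements), and otherwise the caller passes a hidden pointer to the return slot in RCX. The helper size_allows_rax_return and the type unpacked_struct below are hypothetical names, and the 8-byte size assumes the usual 4-byte alignment of std::uint32_t.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
struct packed_struct {
    std::uint32_t uint_val;
    std::uint8_t byte_val;
};
#pragma pack(pop)

// Same members without the pragma: padding rounds the size up.
struct unpacked_struct {
    std::uint32_t uint_val;
    std::uint8_t byte_val;
};

// Hypothetical helper: only the size part of the documented rule
// (an assumption based on the Microsoft x64 documentation, not on Intel's).
constexpr bool size_allows_rax_return(std::size_t size) {
    return size == 1 || size == 2 || size == 4 || size == 8;
}

static_assert(sizeof(packed_struct) == 5, "packed: 4 + 1 bytes, no padding");
static_assert(sizeof(unpacked_struct) == 8, "unpacked: padded to 8 bytes");
static_assert(!size_allows_rax_return(sizeof(packed_struct)), "5 bytes: hidden pointer expected");
static_assert(size_allows_rax_return(sizeof(unpacked_struct)), "8 bytes: RAX return expected");
```

If that reading is right, the caller's hidden-pointer code matches the convention for the 5-byte struct, and the release-mode callee is the side that deviates. It would also be consistent with the callee taking byte_val from DL rather than CL, since RCX would be reserved for the hidden pointer.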

For reference:

release compiler flags: /MP /GS /Zc:rvalueCast /W4 /QxCORE-AVX2 /Gy /Zc:wchar_t /Zi /O2 /Ob2 /GF /Zc:forScope /GR /arch:CORE-AVX2 /Oi /MD /EHsc /nologo /Gw /Zo /Qstd=c++14 /Qvc14
debug compiler flags: /MP /GS /Zc:rvalueCast /W4 /Gy /Zc:wchar_t /Zi /Od /Zc:forScope /RTC1 /GR /MDd /EHsc /nologo /Gw /Zo /Qstd=c++14 /Qvc14
linker flags: /MANIFEST /NXCOMPAT /DYNAMICBASE /DEBUG /MACHINE:X64 /OPT:REF /qnoipo /INCREMENTAL:NO /SUBSYSTEM:CONSOLE /OPT:ICF /NOLOGO /TLBID:1

 

3 Replies
Employee

Hi, Andrey

I have reproduced the issue you reported. I am still root-causing the issue and will let you know when I have an update.

Thanks.


Good diagnosis.

You must be a vegetarian (0xbaadf00d as opposed to 0xbaadbeef). The first time I've seen that.

Jim Dempsey

Employee

Hi, Andrey

Thank you for reporting the issue with a reproducer. I have escalated it and recorded it in our problem tracking system. I will let you know when I have an update on this problem.

In the future, you may report compiler bugs through our cloud service at:

www.intel.com/supporttickets

Thanks.
