Ize__Thiago
Beginner
233 Views

32-bit offsets used for array access cause crash

As shown in https://gcc.godbolt.org/z/_PjVDd, icpc seems to use 32-bit registers for the offset calculation when the array's element count fits in 32 bits. That is fine for an array of up to about 1 billion ints, but past that point the byte offset exceeds 4 GB and requires 64-bit registers. The result is that we can get crashes, as the repro in the above link shows.

I've confirmed this bug exists in icc 17 all the way through 19.0.245 (the most recent version I can find). I've only tested this on Windows.

In case the above link doesn't work, here's the code:

#include <cstdio>   // printf
#include <cstddef>  // size_t

void buggy(unsigned int count, int* a)
{
    // note that the index is 32-bit
    for (unsigned i = 0; i < count; ++i)
        *a++ = 123;
}

void good(unsigned int count, int* a)
{
    // note that the index is 64-bit
    for (unsigned long long i = 0; i < count; ++i)
        *a++ = 123;
}

int main(int argc, char** argv)
{
    const size_t array_size =
        argc == 123456 ? (size_t)argv : (1ULL << 31) + 1000; // prevent compiler optimizations
    printf("array_size == %zu\n", array_size); // double check we really did use the correct size
    int* out = new int[array_size];

    buggy(array_size, out); // size_t intentionally narrowed to unsigned int
    // good(array_size, out);

    printf("no crash!\n");

    return 0;
}

 

8 Replies
McCalpinJohn
Black Belt

This looks like an argument mismatch in your source code?

You are calling either "buggy()" or "good()" with a "size_t" variable as the first argument, but both routines declare that parameter as "unsigned int". According to https://docs.microsoft.com/en-us/windows/win32/winprog/windows-data-types#size-t, the "size_t" type is an unsigned 64-bit integer in the 64-bit Windows environment. I am not a language lawyer, but I think the behavior is undefined in this case.

If you are compiling in the Windows 32-bit environment, then the arguments match, and if there is an error, it is due to the compiler using signed 32-bit arithmetic instead of unsigned 32-bit arithmetic at some point in the process. The compiler will often use instructions operating on 64-bit registers interchangeably with instructions operating on 32-bit registers for portions of the code after any instruction that guarantees that the two are equivalent.

I have seen compiler bugs when dealing with integers between 2^31-1 and 2^32-1, but I recommend starting the process by using exactly the same type declarations in all cases.  
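A minimal sketch of what consistent declarations would look like (the name `fill` is illustrative, not from the original code): the count parameter uses the same size_t type as the allocation site, so no narrowing happens at the call.

```cpp
#include <cstddef>

// Hypothetical fixed version: the count parameter matches the size_t used
// at the allocation site, and the induction variable is also size_t, so
// the offset calculation runs in 64-bit registers on a 64-bit target.
void fill(std::size_t count, int* a)
{
    for (std::size_t i = 0; i < count; ++i)
        *a++ = 123;
}
```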

ize__thiago1
Beginner

Sorry, I should have clarified this is 64-bit code.

The argument mismatch doesn't matter, since (1ULL << 31) + 1000 fits in a 32-bit unsigned int, so there's no loss of information when it gets truncated to 32 bits. In any case, I've heard back from Intel that they can reproduce this bug (after adjusting some printfs), so now we just need to wait for it to get fixed.

jimdempseyatthecove
Black Belt

I seem to recall this as an old bug. It may have been fixed and then regressed.

Jim Dempsey

jimdempseyatthecove
Black Belt

>>I've heard back from Intel that they can reproduce this bug (after adjusting some printfs), so now we just need to wait for it to get fixed.

I strongly suggest you fix your code to use size_t, seeing that the array is allocated using a size_t with an allocation size of:

array_size * sizeof(int) == ((1ULL << 31) + 1000) * 4

Should the compiler optimization choose to substitute the indexed store

   a[i] = 123;

for

   *a++ = 123;

you will have an issue, because a is then indexed by an unsigned 32-bit number (register) scaled by *4.

Note, the compiler likely did this substitution along with vectorizing the code (while erroneously using a 32-bit index register in the scaled-index (SIB) addressing).

That is my guess (a disassembly of the code would confirm this).
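The wraparound described above can be sketched in plain arithmetic (these helper functions are illustrative, not actual compiler output): scaling a 32-bit element index by sizeof(int) in a 32-bit register wraps modulo 2^32.

```cpp
#include <cstdint>

// If the byte offset is computed in a 32-bit register, index * 4 wraps
// modulo 2^32 once the element index exceeds 2^30, so the store lands
// back near the start of the array instead of past the 4 GB mark.
std::uint32_t byte_offset32(std::uint32_t index) { return index * 4u; }

// With a 64-bit register the scaled offset is computed correctly.
std::uint64_t byte_offset64(std::uint32_t index)
{
    return static_cast<std::uint64_t>(index) * 4u;
}
```

For an element index of 2^30 + 250, the 32-bit offset wraps to 1000 bytes while the 64-bit offset is the correct 2^32 + 1000 bytes, which matches the crash pattern in the repro.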

Jim Dempsey

ize__thiago1
Beginner

You're absolutely right -- the "good" loop example I gave above uses a 64-bit counter, and that is indeed what I've been using in my code to work around this bug while we wait for the compiler to be fixed.

McCalpinJohn
Black Belt

I don't know whether it is related or not, but the compiler will often generate very strange code if you compute array indices using combinations of 32-bit signed and 32-bit unsigned values.   It looks like the compiler is trying to "properly" handle the cases for which the signed computation wraps from positive to negative, but it is doing it by computing index vectors and then doing gathers, rather than the more straightforward range checking.  (My brain refuses to believe that there is a "correct" way to handle this, and that a user would write a code that depended on that interpretation.)

The compiler does not worry about wraparound of 64-bit signed index values.

Assuming that you can pass a 64-bit value and receive it as a 32-bit value is a violation of the language standard, even if it often "works".   According to the C99 standard, for example, all declarations that refer to the same object must be the same, otherwise the behavior is undefined.
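As an illustration of the mixed-signedness indexing described above (this function is a made-up example, not from the thread): in a[i + offset] below, the signed i converts to unsigned, and the sum wraps modulo 2^32 by definition, which the compiler must honor when forming addresses.

```cpp
// Mixing a signed 32-bit induction variable with an unsigned 32-bit
// offset: i converts to unsigned in i + offset, so the index arithmetic
// is defined to wrap modulo 2^32 rather than being undefined overflow.
int sum_window(const int* a, int n, unsigned offset)
{
    int s = 0;
    for (int i = 0; i < n; ++i)      // signed 32-bit index
        s += a[i + offset];          // mixed with unsigned 32-bit offset
    return s;
}
```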

ize__thiago1
Beginner

You're correct that it would be a bug if the function parameter were a reference to an int, but in this case it's not; we get a copy by value, so these aren't the same object. Passing an unsigned 64-bit int to a function that takes an unsigned 32-bit int by value is well defined in both C and C++: an implicit conversion (integer truncation in this case) is performed by the compiler. Here's some info for both languages: https://en.cppreference.com/w/cpp/language/implicit_conversion and https://en.cppreference.com/w/c/language/conversion.
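A small sketch of that conversion rule (`take32` is a made-up name): for unsigned destination types, both languages define the result as the source value reduced modulo 2^32, so nothing is lost when the value already fits in 32 bits.

```cpp
// Pass-by-value narrowing from 64-bit to 32-bit unsigned is well defined:
// the parameter receives the caller's value reduced modulo 2^32, so a
// value that already fits in 32 bits arrives unchanged.
unsigned int take32(unsigned int x) { return x; }
```

With the repro's size of (1ULL << 31) + 1000, the value fits in an unsigned int and passes through unchanged; a value of 2^32 + 7 would arrive as 7.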

McCalpinJohn
Black Belt

Thanks -- it did not click in my head that this was passing by value.   I was probably distracted by the assembly code, which inlines the "buggy()" function.
