Intel® C++ Compiler

ICC 2017 Bug - Bad ordering of instructions

Pritchard__Ben

There seems to be a bug in the Intel 2017 compiler. The following code, compiled with "icc -std=c99 -O3 *.c" on an Ivy Bridge processor, produces incorrect output:

// File: accumulate.c
void accumulate(int * offsets,
                double const * const restrict input,
                double * const restrict output)
{
    output[0] += input[0]; // first offset is always zero
    for(int i = 1; i < 4; i++)
        output[offsets[i]] += input[i];
}

 

// File: main.c
#include <stdio.h>

void accumulate(int * offsets,
                double const * const restrict input,
                double * const restrict output);


int main(void)
{
    int offsets[4] = {0, 0, 1, 1};
    double input[4] = {1.0, 2.0, 3.0, 4.0};
    double output[4] = {0.0, 0.0, 0.0, 0.0};

    accumulate(offsets, input, output);
    printf("Results: %12.6e %12.6e %12.6e %12.6e\n",
           output[0], output[1], output[2], output[3]);

    return 0;
}

 

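The resulting output is "1.000000e+00 7.000000e+00 0.000000e+00 0.000000e+00". The first value is incorrect: it should be 3.000000e+00, because offsets[1] is 0, so both input[0] and input[1] are added to output[0]. The output is correct with Intel 2016, or if I remove the restrict keywords.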
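
For reference, here is a minimal sketch of the same accumulation without the restrict qualifiers, which shows where the expected values come from. The name accumulate_ref is hypothetical and not part of the original report.

```c
/* Reference version of accumulate() with no restrict qualifiers.
 * accumulate_ref is a hypothetical name used only for illustration. */
void accumulate_ref(const int *offsets,
                    const double *input,
                    double *output)
{
    output[0] += input[0];              /* first offset is always zero */
    for (int i = 1; i < 4; i++)
        output[offsets[i]] += input[i]; /* offsets[i] may be 0 again */
}
```

Called with the arrays from main.c (offsets {0, 0, 1, 1} and input {1.0, 2.0, 3.0, 4.0}), this leaves output[0] = 1.0 + 2.0 = 3.0 and output[1] = 3.0 + 4.0 = 7.0, with the remaining elements untouched.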

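Looking at the disassembled code for accumulate(), the compiler doesn't seem to realise that output[0] and output[offsets[...]] can be the same memory location. Both accesses go through the same restrict-qualified pointer, so restrict does not allow the compiler to assume they are distinct. It reorders the instructions so that the store to output[0] overwrites the result already stored through output[offsets[1]], as opposed to letting the output[0] update always complete first.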

movsxd rax,DWORD PTR [rdi+0x4]         <---- rax = 0 in this case
movsxd rcx,DWORD PTR [rdi+0x8]
movsxd r8,DWORD PTR [rdi+0xc]
movsd  xmm1,QWORD PTR [rdx+rax*8]      <----
movsd  xmm0,QWORD PTR [rdx]            <---- [rdx] and [rdx+rax*8] are the same!
addsd  xmm1,QWORD PTR [rsi+0x8]
addsd  xmm0,QWORD PTR [rsi]
movsd  QWORD PTR [rdx+rax*8],xmm1      <---- puts results in output[0]      
movsd  xmm2,QWORD PTR [rdx+rcx*8]
movsd  QWORD PTR [rdx],xmm0            <---- overwrites [rdx+rax*8]
addsd  xmm2,QWORD PTR [rsi+0x10]
movsd  QWORD PTR [rdx+rcx*8],xmm2
movsd  xmm3,QWORD PTR [rdx+r8*8]
addsd  xmm3,QWORD PTR [rsi+0x18]
movsd  QWORD PTR [rdx+r8*8],xmm3
ret    
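
To make the effect of that ordering easier to follow, the sketch below replays the loads and stores of the disassembly above in plain C, one local variable per register. It is a hand-written illustration, not compiler output, and the name accumulate_reordered is hypothetical.

```c
/* Replays the load/store order of the disassembly in plain C.
 * Local variables are named after the registers they stand for.
 * accumulate_reordered is a hypothetical name used only for illustration. */
void accumulate_reordered(const int *offsets,
                          const double *input,
                          double *output)
{
    int rax = offsets[1];
    int rcx = offsets[2];
    int r8  = offsets[3];

    double xmm1 = output[rax];   /* load output[offsets[1]] */
    double xmm0 = output[0];     /* load output[0]; same cell when rax == 0 */
    xmm1 += input[1];
    xmm0 += input[0];
    output[rax] = xmm1;          /* store the offsets[1] update */
    double xmm2 = output[rcx];
    output[0] = xmm0;            /* stale value clobbers the store above when rax == 0 */
    xmm2 += input[2];
    output[rcx] = xmm2;
    double xmm3 = output[r8];
    xmm3 += input[3];
    output[r8] = xmm3;
}
```

With the arrays from main.c, the contribution of input[1] is lost: output[0] ends up as 1.0 rather than 3.0, while output[1] is still 7.0, which matches the output reported above.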

 

Judith_W_Intel
Employee

Hi,

I was able to reproduce the problem with our current development compiler and have submitted a defect in our internal bug tracking database under DPD200414377.

Thanks for reporting it and including a nice small example.

Judy

 

Raymond_S_1

Hi,

I have encountered a similar bug, which shows up in ICPC 17.0.0 without any use of the "restrict" keyword. I don't know if the underlying cause is the same, so I'm providing the smallest example that I was able to produce:

// Foo.h
#include <stddef.h>

class Foo
{
public:
    Foo(size_t rxn, size_t ic0, size_t ic1)
        : m_rxn(rxn), m_ic0(ic0), m_ic1(ic1) {}

    void increment(const double* R, double* S) const;

private:
    size_t m_rxn;
    size_t m_ic0, m_ic1;
};

// Foo.cpp
#include "Foo.h"
#include <string>

void Foo::increment(const double* R, double* S) const {
    std::string bar; // removing this line eliminates the error
    S[m_ic0] += R[m_rxn];
    S[m_ic1] += R[m_rxn];
}

// main.cpp
#include "Foo.h"
#include <vector>
#include <iostream>

int main(int argc, char** argv)
{
    Foo bar(0, 0, 0);
    std::vector<double> R(1, 0.1234);
    std::vector<double> S(1, 0.0);
    bar.increment(R.data(), S.data());

    // Output should be "0.1234, 0.2468"
    // Actual output with "icpc (ICC) 17.0.0 20160721" is "0.1234, 0.1234"
    std::cout << R[0] << ", " << S[0] << std::endl;
    return 0;
}

I found that any one of the following was sufficient to eliminate the bug from this example:

  • Moving the definition of "increment" into the header file
  • Removing the declaration of the string "bar" from the "increment" function
  • Writing "increment" as a free function that takes the indices as additional arguments
  • Compiling with an older version of the compiler (e.g. 15.0)
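
As an illustration of the free-function workaround in the list above, here is a minimal sketch written in C. The name increment_free and the parameter order are assumptions; the original post does not show this version.

```c
#include <stddef.h>

/* Free-function form of Foo::increment that takes the indices as
 * arguments instead of reading them from class members.
 * increment_free and its parameter order are hypothetical. */
void increment_free(const double *R, double *S,
                    size_t rxn, size_t ic0, size_t ic1)
{
    S[ic0] += R[rxn];
    S[ic1] += R[rxn];   /* ic1 may equal ic0; both updates must land */
}
```

Calling increment_free(R, S, 0, 0, 0) with R[0] = 0.1234 and S[0] = 0.0 should leave S[0] equal to twice R[0], which is the result the member-function version was expected to produce.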