Intel® C++ Compiler
Support and discussions for creating C++ code that runs on platforms based on Intel® processors.

ICC 2017 Bug - Bad ordering of instructions

Pritchard__Ben

There seems to be a bug in ICC 2017. The following code, compiled with "icc -std=c99 -O3 *.c" on an Ivy Bridge processor, produces an incorrect result:

// File: accumulate.c
void accumulate(int * offsets,
                double const * const restrict input,
                double * const restrict output)
{
    output[0] += input[0]; // first offset is always zero
    for(int i = 1; i < 4; i++)
        output[offsets[i]] += input[i];
}

 

// File: main.c
#include <stdio.h>

void accumulate(int * offsets,
                double const * const restrict input,
                double * const restrict output);


int main(void)
{
    int offsets[4] = {0, 0, 1, 1};
    double input[4] = {1.0, 2.0, 3.0, 4.0};
    double output[4] = {0.0, 0.0, 0.0, 0.0};

    accumulate(offsets, input, output);
    printf("Results: %12.6e %12.6e %12.6e %12.6e\n",
           output[0], output[1], output[2], output[3]);

    return 0;
}

 

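The resulting output is "1.000000e+00 7.000000e+00 0.000000e+00 0.000000e+00". The first value is incorrect: it should be 3.000000e+00, because offsets[1] is 0 and so both input[0] and input[1] are accumulated into output[0]. The result is correct with ICC 2016, or if I remove the restrict keywords.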

Looking at the disassembled code for accumulate(), the compiler doesn't seem to realise that output[0] and output[offsets[i]] can be the same memory location. It loads both values before storing either, so the later store to output[0] overwrites the result of the update to output[offsets[1]] instead of the two updates being applied one after the other.

movsxd rax,DWORD PTR [rdi+0x4]         <---- rax = 0 in this case
movsxd rcx,DWORD PTR [rdi+0x8]
movsxd r8,DWORD PTR [rdi+0xc]
movsd  xmm1,QWORD PTR [rdx+rax*8]      <----
movsd  xmm0,QWORD PTR [rdx]            <---- [rdx] and [rdx+rax*8] are the same!
addsd  xmm1,QWORD PTR [rsi+0x8]
addsd  xmm0,QWORD PTR [rsi]
movsd  QWORD PTR [rdx+rax*8],xmm1      <---- puts results in output[0]      
movsd  xmm2,QWORD PTR [rdx+rcx*8]
movsd  QWORD PTR [rdx],xmm0            <---- overwrites [rdx+rax*8]
addsd  xmm2,QWORD PTR [rsi+0x10]
movsd  QWORD PTR [rdx+rcx*8],xmm2
movsd  xmm3,QWORD PTR [rdx+r8*8]
addsd  xmm3,QWORD PTR [rsi+0x18]
movsd  QWORD PTR [rdx+r8*8],xmm3
ret    
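
To make the effect of that schedule easier to follow, here is a rough C model of the two orderings. This is only a sketch: the function names accumulate_in_order and accumulate_as_scheduled are hypothetical, and the second function is a hand transcription of the load/add/store order in the disassembly above, not anything the compiler emits as source.

```c
/* Program order: the update of output[0] finishes before the loop starts. */
void accumulate_in_order(const int *offsets, const double *input, double *output)
{
    output[0] += input[0];
    for (int i = 1; i < 4; i++)
        output[offsets[i]] += input[i];
}

/* Hand-written model of the instruction order in the disassembly:
   both loads for the first two updates happen before either store. */
void accumulate_as_scheduled(const int *offsets, const double *input, double *output)
{
    double t1 = output[offsets[1]];  /* movsd xmm1,[rdx+rax*8] */
    double t0 = output[0];           /* movsd xmm0,[rdx]       */
    t1 += input[1];                  /* addsd xmm1,[rsi+0x8]   */
    t0 += input[0];                  /* addsd xmm0,[rsi]       */
    output[offsets[1]] = t1;         /* store for i = 1        */
    double t2 = output[offsets[2]];  /* movsd xmm2,[rdx+rcx*8] */
    output[0] = t0;                  /* stale value overwrites the i = 1 store
                                        when offsets[1] == 0 */
    t2 += input[2];
    output[offsets[2]] = t2;
    double t3 = output[offsets[3]];
    t3 += input[3];
    output[offsets[3]] = t3;
}
```

With the arrays from main.c (offsets {0, 0, 1, 1}, input {1.0, 2.0, 3.0, 4.0}, output all zero), the first function leaves 3.0 in output[0] and the second leaves 1.0, which matches the reported output.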

 

2 Replies
Judith_W_Intel
Employee

 

Hi,

I was able to reproduce the problem with our current development compiler and have submitted a defect in our internal bug tracking database under DPD200414377.

Thanks for reporting it and including a nice small example.

Judy

 

Raymond_S_1

Hi,

I have encountered a similar bug, which manifests in ICPC 17.0.0 without the use of the "restrict" keyword. I don't know if the underlying cause is the same, so I'm providing the smallest example that I was able to produce:

// Foo.h
#include <stddef.h>

class Foo
{
public:
    Foo(size_t rxn, size_t ic0, size_t ic1)
        : m_rxn(rxn), m_ic0(ic0), m_ic1(ic1) {}

    void increment(const double* R, double* S) const;

private:
    size_t m_rxn;
    size_t m_ic0, m_ic1;
};

// Foo.cpp
#include "Foo.h"
#include <string>

void Foo::increment(const double* R, double* S) const {
    std::string bar; // removing this line eliminates the error
    S[m_ic0] += R[m_rxn];
    S[m_ic1] += R[m_rxn];
}

// main.cpp
#include "Foo.h"
#include <vector>
#include <iostream>

int main(int argc, char** argv)
{
    Foo bar(0, 0, 0);
    std::vector<double> R(1, 0.1234);
    std::vector<double> S(1, 0.0);
    bar.increment(R.data(), S.data());

    // Output should be "0.1234, 0.2468"
    // Actual output with "icpc (ICC) 17.0.0 20160721" is "0.1234, 0.1234"
    std::cout << R[0] << ", " << S[0] << std::endl;
    return 0;
}

I found that any of the following were sufficient to eliminate the bug from this example:

  • Moving the definition of "increment" into the header file
  • Removing the declaration of the string "bar" from the "increment" function
  • Writing "increment" as a free function that takes the indices as additional arguments
  • Compiling with an older version of the compiler (e.g. 15.0)
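
For reference, the free-function variant from the list above might look roughly like the sketch below. The name increment_free and the parameter order are hypothetical, and it is written in the common subset of C and C++ so that it compiles as either. Whether this form avoids the problem on any particular compiler build would still need to be checked.

```c
#include <stddef.h>

/* Hypothetical free-function form of Foo::increment: the indices are
   passed as arguments instead of being read from class members. */
void increment_free(const double *R, double *S,
                    size_t rxn, size_t ic0, size_t ic1)
{
    S[ic0] += R[rxn];
    S[ic1] += R[rxn];  /* when ic1 == ic0 this must see the first update */
}
```

Called with rxn, ic0 and ic1 all zero, as in main.cpp, a correct build adds R[0] to S[0] twice.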