There seems to be a bug in Intel2017. Given the following code, compiled with "icc -std=c99 -O3 *.c" (with an Ivy Bridge processor):
// File: accumulate.c
void accumulate(int * offsets, double const * const restrict input, double * const restrict output)
{
    output[0] += input[0]; // first offset is always zero
    for(int i = 1; i < 4; i++)
        output[offsets[i]] += input[i];
}
// File: main.c
#include <stdio.h>

void accumulate(int * offsets, double const * const restrict input, double * const restrict output);

int main(void)
{
    int offsets[4] = {0, 0, 1, 1};
    double input[4] = {1.0, 2.0, 3.0, 4.0};
    double output[4] = {0.0, 0.0, 0.0, 0.0};
    accumulate(offsets, input, output);
    printf("Results: %12.6e %12.6e %12.6e %12.6e\n", output[0], output[1], output[2], output[3]);
    return 0;
}
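The resulting output is "1.000000e+00 7.000000e+00 0.000000e+00 0.000000e+00". The first value is incorrect: it should be 3.000000e+00, because offsets[1] is 0 and so both input[0] and input[1] are added to output[0]. The result is correct with Intel2016, or if I remove the restrict keywords.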
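To make the expected values concrete, here is a minimal sketch of a reference version without the restrict qualifiers. The name accumulate_reference is hypothetical and the assert is an added check of the "first offset is always zero" assumption; neither is part of the original code.

```c
#include <assert.h>

/* Hypothetical reference version of accumulate() without restrict.
   The updates are applied strictly in source order, so repeated
   offsets accumulate into the same element of output. */
void accumulate_reference(const int *offsets, const double *input, double *output)
{
    assert(offsets[0] == 0);            /* assumption stated in the original code */
    output[0] += input[0];
    for (int i = 1; i < 4; i++)
        output[offsets[i]] += input[i];
}
```

With the offsets {0, 0, 1, 1} and inputs {1.0, 2.0, 3.0, 4.0} from main.c, this gives 3.0, 7.0, 0.0, 0.0.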
Looking at the disassembled code for accumulate(), the compiler doesn't seem to realise that output[0] and output[offsets[i]] can be the same memory location. It reorders the instructions so that one store overwrites the result of the other, instead of completing the output[0] update first.
movsxd rax,DWORD PTR [rdi+0x4]       <---- rax = 0 in this case
movsxd rcx,DWORD PTR [rdi+0x8]
movsxd r8,DWORD PTR [rdi+0xc]
movsd  xmm1,QWORD PTR [rdx+rax*8]    <----
movsd  xmm0,QWORD PTR [rdx]          <---- [rdx] and [rdx+rax*8] are the same!
addsd  xmm1,QWORD PTR [rsi+0x8]
addsd  xmm0,QWORD PTR [rsi]
movsd  QWORD PTR [rdx+rax*8],xmm1    <---- puts results in output[0]
movsd  xmm2,QWORD PTR [rdx+rcx*8]
movsd  QWORD PTR [rdx],xmm0          <---- overwrites [rdx+rax*8]
addsd  xmm2,QWORD PTR [rsi+0x10]
movsd  QWORD PTR [rdx+rcx*8],xmm2
movsd  xmm3,QWORD PTR [rdx+r8*8]
addsd  xmm3,QWORD PTR [rsi+0x18]
movsd  QWORD PTR [rdx+r8*8],xmm3
ret
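The effect of this schedule can be replayed in plain C. The following is only a sketch that transliterates the instruction order above one statement per instruction; the function name accumulate_as_scheduled is hypothetical, the local variables are named after the registers they stand for, and the assert is an added sanity check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C transliteration of the disassembly: both loads of
   output happen before either store, so when offsets[1] == 0 the
   second store to output[0] discards the first update. */
void accumulate_as_scheduled(const int *offsets, const double *input, double *output)
{
    assert(offsets != NULL && input != NULL && output != NULL);

    long rax = offsets[1];
    long rcx = offsets[2];
    long r8  = offsets[3];

    double xmm1 = output[rax];   /* load output[offsets[1]] */
    double xmm0 = output[0];     /* load output[0] (same element if rax == 0) */
    xmm1 += input[1];
    xmm0 += input[0];
    output[rax] = xmm1;          /* store update for offsets[1] */
    double xmm2 = output[rcx];
    output[0] = xmm0;            /* stale value overwrites the store above */
    xmm2 += input[2];
    output[rcx] = xmm2;
    double xmm3 = output[r8];
    xmm3 += input[3];
    output[r8] = xmm3;
}
```

Called with the data from main.c, this replay gives 1.0, 7.0, 0.0, 0.0, matching the incorrect output reported above. With four distinct offsets such as {0, 1, 2, 3} the same sequence gives the correct result, so the problem only shows up when offsets[1] is 0.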
Hi,
I was able to reproduce the problem with our current development compiler and have submitted a defect in our internal bug tracking database under DPD200414377.
Thanks for reporting it and including a nice small example.
Judy
Hi,
I have encountered a similar bug, which manifests in ICPC 17.0.0 without the use of the "restrict" keyword. I don't know if the underlying cause is the same, so I'm providing the most minimal example that I was able to produce:
// Foo.h
#include <stddef.h>

class Foo
{
public:
    Foo(size_t rxn, size_t ic0, size_t ic1) : m_rxn(rxn), m_ic0(ic0), m_ic1(ic1) {}
    void increment(const double* R, double* S) const;
private:
    size_t m_rxn;
    size_t m_ic0, m_ic1;
};

// Foo.cpp
#include "Foo.h"
#include <string>

void Foo::increment(const double* R, double* S) const
{
    std::string bar; // removing this line eliminates the error
    S[m_ic0] += R[m_rxn];
    S[m_ic1] += R[m_rxn];
}

// main.cpp
#include "Foo.h"
#include <vector>
#include <iostream>

int main(int argc, char** argv)
{
    Foo bar(0, 0, 0);
    std::vector<double> R(1, 0.1234);
    std::vector<double> S(1, 0.0);
    bar.increment(R.data(), S.data());
    // Output should be "0.1234, 0.2468"
    // Actual output with "icpc (ICC) 17.0.0 20160721" is "0.1234, 0.1234"
    std::cout << R[0] << ", " << S[0] << std::endl;
    return 0;
}
I found that any of the following were sufficient to eliminate the bug from this example:
- Moving the definition of "increment" into the header file
- Removing the declaration of the string "bar" from the "increment" function
- Writing "increment" as a free function that takes the indices as additional arguments
- Compiling with an older version of the compiler (e.g. 15.0)
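As an illustration of the free-function variant in the list above, here is a minimal sketch written in plain C (it is also valid C++). The name increment_free and the argument order are hypothetical, and the sketch only shows the intended arithmetic; it has not been checked against icpc 17.0.0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-function form of Foo::increment: the indices are
   passed as arguments instead of being read from member variables.
   When ic0 == ic1 the same element must be incremented twice. */
void increment_free(const double *R, double *S, size_t rxn, size_t ic0, size_t ic1)
{
    assert(R != NULL && S != NULL);
    S[ic0] += R[rxn];
    S[ic1] += R[rxn];
}
```

With R[0] = 0.1234, S[0] = 0.0 and all three indices set to 0, S[0] ends up as twice R[0], which is the "0.2468" the example is expected to print.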