- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

```
// Example code in external library:
```

struct Vec { double x, y, z; inline const Vec& operator=(const Vec& that) { x = that.x; y = that.y; z = that.z; return *this; } inline double& operator[](int index) { return (&x)[index]; } } // Example usage in my binary: int index; Vec vec; ... vec = some_other_vec; double result = vec[index];

If the value of index is not known at compile time, compiler reorders instruction incorrectly, for instance like this (assuming index == 2):

vec.x = some_other_vec.x; result = (&vec.x)[2]; // this is loaded before vec.z is stored vec.y = some_other_vec.y; vec.z = some_other_vec.z;

Implementation of operator[] is probably violates strict aliasing rule. But it is in external library and I cannot modify the source code. I decided to build my binary with -fno-strict-aliasing. Operator [] is inlined to my binary, so it should fix it. But it seems, that -fno-strict-aliasing does not work. What flag could I use to disable incorrect instructions reordering? When I compile my code with -O0, everything works correctly.

I use this compiler: icpc version 15.0.1 (gcc version 4.6.0 compatibility), previous version of compiler works ok

Thank you, Pavel Jurkas.

Link Copied

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Can you insert a compiler fence in between the instructions?

If the header is available, then you could add an operator= which includes the fence.

This assumes the error does not appear within the external library.

Jim Dempsey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I am not able to reproduce the error using the following program (used 15.0,1 + GCC 4.6):

#include<iostream> #include<stdlib.h> struct Vec { double x, y, z; inline Vec() { x = -1.0; y = -2.0; z = -3.0; } inline Vec(double x1, double y1, double z1) { x = x1; y = y1; z = z1; } inline const Vec& operator=(const Vec& that) { x = that.x; y = that.y; z = that.z; return *this; } inline double& operator[](int index) { return (&x)[index]; } }; int main(int argc, char *argv[]){ if(argc != 2) { std::cout<<"Need the index as a command line argument [0, 1 or 2]\n"; return 0; } int index = atoi(argv[1]); Vec v1(1.0, 2.0, 3.0), v2; v2 = v1; double result = (&(v2.x))[index]; std::cout<<"result = "<<result<<"\n"; return 0; }

Please correct me if I am not capturing your problem right.

Thanks and Regards

Anoop

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Could you please try this example code (test.cpp):

#include <iostream> struct Vec { double x, y, z; const Vec& operator=(const Vec& that) { x = that.x; y = that.y; z = that.z; return *this; } double& operator[](int index) { return (&x)[index]; } }; int main() { const int size = 1000; // create 1000 vectors: (0,0,0), (1,1,1), ..., (999,999,999) Vec vec[size]; for (int i = 0; i < size; ++i) vec.x = vec.y = vec.z = i; // get index from user int index; std::cout << "enter index [0, 1, or 2]: "; std::cin >> index; // compute sum of the specified index from all vectors Vec v; double sum = 0.0; for (int i = 0; i < size; ++i) { v = vec; sum += v[index]; } // output result -- it should return always value 499500 std::cout << "result: " << sum << std::endl; }

Some results:

$ g++ -O3 test.cpp && ./a.out enter index [0, 1, or 2]: 2 result: 499500 $ icpc -O0 test.cpp && ./a.out enter index [0, 1, or 2]: 2 result: 499500 $ icpc-2013_sp1.1.106 -O1 test.cpp && ./a.out enter index [0, 1, or 2]: 2 result: 499500 $ icpc-2013_sp1.2.144 -O1 test.cpp && ./a.out enter index [0, 1, or 2]: 2 result: 499500 $ icpc-2015.1.133 -O1 test.cpp && ./a.out enter index [0, 1, or 2]: 2 result: 498501 $ icpc-2015.3.187 -O1 test.cpp && ./a.out enter index [0, 1, or 2]: 2 result: 498501 $ icpc-2016.0.109 -O1 test.cpp && ./a.out enter index [0, 1, or 2]: 2 result: 498501 $ icpc-2016.3.210 -O1 test.cpp && ./a.out enter index [0, 1, or 2]: 2 result: 0

I tried all versions of icpc, which I have. Versions >= 2015 returned incorrect values.

Memory fences fixes this issue, but i do not want to modify that external code, i would rather disable some optimization, which reoders it incorrectly.

Thank you, Pavel Jurkas

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

One more question: Is that code incorrect or is it a bug in compiler?

Thanks

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I have escalated this problem to compiler engineering for expert advise. In the meantime, as Jim mentioned you can use MFENCE as shown below:

#include <iostream> struct Vec { double x, y, z; const Vec& operator=(const Vec& that) { x = that.x; y = that.y; z = that.z; return *this; } double& operator[](int index) { return (&x)[index]; } }; int main() { const int size = 1000; // create 1000 vectors: (0,0,0), (1,1,1), ..., (999,999,999) Vec vec[size]; for (int i = 0; i < size; ++i) vec.x = vec.y = vec.z = i; // get index from user int index; std::cout << "enter index [0, 1, or 2]: "; std::cin >> index; // compute sum of the specified index from all vectors Vec v; double sum = 0.0; for (int i = 0; i < size; ++i) { v = vec; asm volatile ("mfence" ::: "memory"); sum += v[index]; } // output result -- it should return always value 499500 std::cout << "result: " << sum << std::endl; }

$ icpc test.cc -g -O1 $ ./a.out enter index [0, 1, or 2]: 2 result: 499500 $ g++ test.cc -O3 $ ./a.out enter index [0, 1, or 2]: 2 result: 499500

Thanks and Regards

Anoop

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Anoop,

I think

asm volatile("" ::: "memory"); // compiler fence only (no CPU fence required)

would be sufficient. Could you please verify this?

Jim Dempsey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Thank you, asm volatile("" ::: "memory") works for me.

Is this problem caused by a bug in Intel compiler?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

*>>Is this problem caused by a bug in Intel compiler?*

I think it is a case of a bug in the programmer. The programmer (you) told the compiler "I want speed". An almost universal means (one of) of accomplishing this is by instruction re-order *** provided it does not bypass the visible intent of the programmer expressed in source code. The aliasing used in your code makes the intent NOT visible.

For additional information see: http://preshing.com/20120625/memory-ordering-at-compile-time/

Jim Dempsey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Thank you for a nice explanation. I suspected more that external library code of being incorrect than the compiler. But some of my colleagues were convinced that it is a compiler bug.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I assume you are aware that the double& on the operator[] is producing the alias. If you change the return type to double (no &, and return the value not the reference), the compiler may generate the code you expect (I have not tested this).

Jim Dempsey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Non-alias version produces the same incorrect result

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page