Intel® C++ Compiler

Return Value Optimization bug with SSE4.2 and above

Thiago_I_

With icc 16.0.3 on Windows (is there a newer version I can use?), the following code triggers a compiler bug: transpose() returns a wrong matrix when the code is built with /arch:SSE4.2, but the correct one with /arch:SSE4.1.

 

// fails: icl -O2 /arch:SSE4.2 test.cpp
// works: icl -O2 /arch:SSE4.1 test.cpp

#include "stdio.h"

struct Matrix {
   int m[4][4];
};

void printMatrix(const Matrix &m) {
   printf("matrix:\n");
   for (int i = 0; i < 4; ++i) {
      for (int j = 0; j < 4; ++j)
         printf("%2d  ", m.m[i][j]);
      printf("\n");
   }
}

__declspec (noinline) Matrix transpose(const Matrix& a)
{
   Matrix b;
   for (int i = 0; i < 4; i++)
      for (int j = 0; j < 4; j++)
         b.m[i][j] = a.m[j][i];
   return b;
}

int main()
{
   Matrix m = { 1,  2,  3,  4,
                5,  6,  7,  8,
                9,  10, 11, 12,
                13, 14, 15, 16 };
   printMatrix(m);
   m = transpose(m);
   printMatrix(m);
   return 0;
}
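
A self-checking variant of the reproducer can make the failure easier to spot when switching between /arch settings, since it compares the result against the expected transpose instead of relying on reading the printed output. This is only a sketch: the helpers `isTransposeOf` and `selfCheck` and the `NOINLINE` macro are hypothetical names that do not appear in the original report, and it has not been confirmed that this variant triggers the same miscompilation.

```cpp
#include <cstdio>

struct Matrix {
   int m[4][4];
};

// Assumption: icl on Windows defines _MSC_VER; other compilers get the GNU attribute.
#if defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE __attribute__((noinline))
#endif

// Same function as in the report, with the indices written out.
NOINLINE Matrix transpose(const Matrix& a)
{
   Matrix b;
   for (int i = 0; i < 4; i++)
      for (int j = 0; j < 4; j++)
         b.m[i][j] = a.m[j][i];
   return b;
}

// Hypothetical helper: true if t is the transpose of src.
bool isTransposeOf(const Matrix& t, const Matrix& src)
{
   for (int i = 0; i < 4; i++)
      for (int j = 0; j < 4; j++)
         if (t.m[i][j] != src.m[j][i])
            return false;
   return true;
}

// Hypothetical helper: repeats the pattern from the report, where the
// argument and the assignment target are the same object.
bool selfCheck()
{
   const Matrix original = {{{ 1,  2,  3,  4},
                             { 5,  6,  7,  8},
                             { 9, 10, 11, 12},
                             {13, 14, 15, 16}}};
   Matrix m = original;
   m = transpose(m);
   const bool ok = isTransposeOf(m, original);
   printf("transpose %s\n", ok ? "OK" : "WRONG");
   return ok;
}
```

Calling `selfCheck()` from `main` and returning a nonzero exit code when it reports false lets a build script compare the /arch:SSE4.1 and /arch:SSE4.2 builds automatically.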

 

Anoop_M_Intel
Employee

I can reproduce this issue and have escalated it to the compiler engineering team. Thanks for reporting this bug.

Regards
Anoop
