italo1337
Beginner

Possible bug with OpenMP and optimizations in version 12.0.2.154

I am having some problems with C++ compiler version 12.0.2.154 (build 20110112). My code works fine on version 11.1.8.070 and on gcc, but I get wrong numeric results on the newest version.

On icl I use /fast /Qopenmp /fp:precise /fp:double.

On gcc I use -O3 -ffast-math -fopenmp.

I noticed that if I remove OpenMP it works fine. It also works if I use /O1.

The only place where I use OpenMP is in the following function:

void calc_curvature(big_vector_t *y, big_vector_t *z) {

int i, d;

#pragma omp parallel for private(i, d)

for ( i = 0 ; i < params->particle_count ; i++ ) {

for ( d = 0 ; d < 3 ; d++ ) {
z[i].v.a[d] -= calc_dW(y, i, d) / params->particle[i].m ;
}

}

}

Here, params is a global and calc_dW does not change anything in y.

It is a physical simulation. The summation is the force. In the newer compiler the force is much less than it should be. It is not noticeable in the first cycles, but as the simulation goes on, the results differ a lot.

P.S. This site is very very slow. I get several time-outs and server errors. Are you experiencing problems?


28 Replies
Om_S_Intel
Employee

It would be nice if you could provide a small but complete test case so that we can investigate the issue.

italo1337
Beginner

I found where the problem is. It is another function that uses OpenMP:

void calc_membrane(big_vector_t *y, big_vector_t *z) {

int i, d;
element_t e;
double dW[3][3];

#pragma omp parallel for private(i, d, e, dW)

for ( i = 0 ; i < params->triangle_count ; i++ ) {

e.X[0] = params->reference[params->triangle[i].node[0]].x;
e.X[1] = params->reference[params->triangle[i].node[1]].x;
e.X[2] = params->reference[params->triangle[i].node[2]].x;

e.x[0] = y[params->triangle[i].node[0]].x;
e.x[1] = y[params->triangle[i].node[1]].x;
e.x[2] = y[params->triangle[i].node[2]].x;

for ( d = 0 ; d < 3 ; d++ ) {
dW[0][d] = calc_dW(e,0,d) / params->particle[params->triangle[i].node[0]].m ;
dW[1][d] = calc_dW(e,1,d) / params->particle[params->triangle[i].node[1]].m ;
dW[2][d] = calc_dW(e,2,d) / params->particle[params->triangle[i].node[2]].m ;
}

#pragma omp critical

for ( d = 0 ; d < 3 ; d++ ) {
z[params->triangle[i].node[0]].v.a[d] -= dW[0][d] ;
z[params->triangle[i].node[1]].v.a[d] -= dW[1][d] ;
z[params->triangle[i].node[2]].v.a[d] -= dW[2][d] ;
}
}

}

}

If I remove these two pragmas, the code works fine. And, as I said before, it also works fine with icl 11.1, gcc or /O1.

I am sorry I can't provide a small complete test case. My code is quite big. The source code is over 100KB.

jimdempseyatthecove
Black Belt

Try this:

[cpp]void calc_membrane(big_vector_t *y, big_vector_t *z) {

#pragma omp parallel for

for (int i = 0 ; i < params->triangle_count ; i++ ) {

element_t e;
double dW[3][3];

e.X[0] = params->reference[params->triangle[i].node[0]].x;
e.X[1] = params->reference[params->triangle[i].node[1]].x;
e.X[2] = params->reference[params->triangle[i].node[2]].x;

e.x[0] = y[params->triangle[i].node[0]].x;
e.x[1] = y[params->triangle[i].node[1]].x;
e.x[2] = y[params->triangle[i].node[2]].x;

for (int d = 0 ; d < 3 ; d++ ) {
dW[0][d] = calc_dW(e,0,d) / params->particle[params->triangle[i].node[0]].m ;
dW[1][d] = calc_dW(e,1,d) / params->particle[params->triangle[i].node[1]].m ;
dW[2][d] = calc_dW(e,2,d) / params->particle[params->triangle[i].node[2]].m ;
}

#pragma omp critical
{
for (int d = 0 ; d < 3 ; d++ ) {
z[params->triangle[i].node[0]].v.a[d] -= dW[0][d] ;
z[params->triangle[i].node[1]].v.a[d] -= dW[1][d] ;
z[params->triangle[i].node[2]].v.a[d] -= dW[2][d] ;
}
}

}

}
[/cpp]
Jim Dempsey
italo1337
Beginner

I get compile errors:

[bash]membrane.c
membrane.c(509): error: expected an expression
    for (int i = 0 ; i < params->triangle_count ; i++ ) {
         ^

membrane.c(509): error: identifier "i" is undefined
    for (int i = 0 ; i < params->triangle_count ; i++ ) {
                     ^

membrane.c(522): error: expected an expression
      for (int d = 0 ; d < 3 ; d++ ) {
           ^

membrane.c(522): error: identifier "d" is undefined
      for (int d = 0 ; d < 3 ; d++ ) {
                       ^

membrane.c(530): error: expected an expression
      for (int d = 0 ; d < 3 ; d++ ) {
           ^

membrane.c(509): error: OpenMP for-init does not conform
    for (int i = 0 ; i < params->triangle_count ; i++ ) {
    ^

membrane.c(509): error: OpenMP for-test does not conform
    for (int i = 0 ; i < params->triangle_count ; i++ ) {
    ^

membrane.c(509): error: OpenMP for-incr does not conform
    for (int i = 0 ; i < params->triangle_count ; i++ ) {
    ^
[/bash]

TimP
Black Belt

Jim changed the code to take advantage of C99 features, so you'll need to use -std=c99
italo1337
Beginner

With /Qstd=c99 it compiles fine, but I still get the same results. It works on version 11.1 but not on 12.

The result is as if this function is not executed at all. Compiling it on 11.1 without this function yields the same (or very similar) results as compiling it on 12 with the function.

Edit: I just ran a test. I saved the vector "z" before and after calling the function "calc_membrane". On version 12 it is not being modified. As I said, it's like the function is not executed at all.

italo1337
Beginner

I made a test case. See the attached file.

It loads some data and runs my function. The contents of the vector z are saved before and after calling the function. Without openmp it works fine, but with it the vector remains unchanged.

There is also a batch file with the command line I am using.
Om_S_Intel
Employee

I get a compilation error with Intel compiler 12.0 when compiling the attached test case.

c:\>icl test.c /fast /Qopenmp /fp:precise /fp:double

Intel C++ Intel 64 Compiler XE for applications running on Intel 64, Version 12.0.1.127 Build 20101116

Copyright (C) 1985-2010 Intel Corporation. All rights reserved.

test.c

ipo: remark #11001: performing single-file optimizations

ipo: remark #11006: generating object file C:\Users\opsachan\AppData\Local\Temp\

ipo_772.obj

(0): internal error: backend signals

icl: error #10014: problem during multi-file optimization compilation (code 4)

italo1337
Beginner

It's not a problem with my code. It's an internal error. Seems to be a bug in the compiler.

I see you are using version 12.0.1.127. Try using the latest one: 12.0.2.154.

Om_S_Intel
Employee

Thanks for the clarification. I moved to 12.0.2.154. I can compile and run the test case with OpenMP and without it, but I do not see any visible difference between the two runs. Could you please let me know which values go wrong with OpenMP?

Om
italo1337
Beginner

Two files are created when you run the program. bug_00000.txt is all zeros, as it should be. bug_00001.txt should have many non-zero values, but it is also all zeros when you use OpenMP. This shows that the vector is not being modified when OpenMP is used.

BTW, I am using Windows 7 x64 on a Core i7 930.

Om_S_Intel
Employee

The private variables are constructed by each thread using the default constructor. The array dW[3][3] in the private list seems to be incorrect.

You may change your code as given below and try again.

// double dW[3][3];

//#pragma omp parallel for private(i, d, e, dW)

#pragma omp parallel for private(i, d, e)

for ( i = 0 ; i < params->triangle_count ; i++ ) {

double dW[3][3];
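A minimal sketch of the pattern suggested above, assuming a simplified scatter-add rather than the real membrane code. A scratch array declared inside the loop body is private to each iteration, so it does not need to appear in a private() clause. The names tri_t and scatter_add and the placeholder values are hypothetical and are not taken from the test case.

```c
#include <assert.h>

/* Hypothetical stand-in for the triangle data discussed in the thread. */
typedef struct { int node[3]; } tri_t;

/* Subtract one contribution per triangle node from out[].
   dW lives inside the loop body, so every iteration (and therefore every
   thread) works on its own copy. The loop variable i of a parallel for
   is private automatically. */
void scatter_add(const tri_t *tri, int tri_count, double *out, int out_count)
{
    int i;

    #pragma omp parallel for
    for (i = 0; i < tri_count; i++) {
        double dW[3];
        int k;

        for (k = 0; k < 3; k++) {
            assert(tri[i].node[k] >= 0 && tri[i].node[k] < out_count);
            dW[k] = 1.0 + k;   /* placeholder for the real computation */
        }

        /* Triangles share nodes, so the update of out[] is serialized. */
        #pragma omp critical
        {
            for (k = 0; k < 3; k++)
                out[tri[i].node[k]] -= dW[k];
        }
    }
}
```

The sketch compiles with or without OpenMP enabled; without it the pragmas are ignored and the loop runs serially, which gives a reference result to compare against.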

italo1337
Beginner

I tried your code. Same problem as before. "z" remains unchanged when I use OpenMP.

Once again, this code:

- works on version 11.1 with or without OpenMP

- works on GCC with or without OpenMP.

- works on version 12 without OpenMP.

- fails on version 12 with OpenMP.

And remember that it caused an "internal error" on an earlier release of the v12 compiler.

italo1337
Beginner

Small update:

- I tried the code with Microsoft Visual Studio 9.0. It works fine.

- With Intel v11.1 on Linux it also works fine.

- With Intel v12 on Linux I get the same problem.

italo1337
Beginner

Bug is still present in version 12.0.4.196.

I modified the test case (attached). It now shows on screen the differences between running the same code with and without openmp.

If I compile with version 11.1.070 I get:

calculating with openmp
zeroes: 57 non_zeroes: 1869
calculating without openmp
zeroes: 57 non_zeroes: 1869

Which is correct. With version 12.0.4.196 I get:

calculating with openmp
zeroes: 1926 non_zeroes: 0
calculating without openmp
zeroes: 57 non_zeroes: 1869

Which is wrong. Notice that with openmp the vector does not get updated at all. It remains all zero.

Any updates on the status of this issue?
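A comparison like the one reported above can be sketched roughly as follows. This is not the code from the attached test case; count_nonzero and max_abs_diff are hypothetical helpers, and the result vector is assumed to be available as a flat array of doubles.

```c
#include <stddef.h>

/* Count the entries that are not exactly zero. */
size_t count_nonzero(const double *v, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (v[i] != 0.0)
            count++;
    return count;
}

/* Largest absolute difference between two result vectors, for example
   one computed with OpenMP and one computed without it. */
double max_abs_diff(const double *a, const double *b, size_t n)
{
    size_t i;
    double worst = 0.0;

    for (i = 0; i < n; i++) {
        double diff = a[i] - b[i];
        if (diff < 0.0)
            diff = -diff;
        if (diff > worst)
            worst = diff;
    }
    return worst;
}
```

A small max_abs_diff together with equal non-zero counts would point to ordinary rounding differences, while a non-zero count of 0 in only one of the two runs means that run never updated the vector.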
mecej4
Black Belt

I tried the code in bug+test+new.zip using Microsoft C (16.00.30319.01 for x64) and found that different optimization levels gave different results.

This could indicate that there are problems with the code or with its usage of OpenMP directives.
italo1337
Beginner

When dealing with floating-point arithmetic, it is normal to get slightly different results when using different optimizations and OpenMP. If you just change the order of a sum you might get different results. But this is not the issue here. I am not getting slightly different results. I am getting all zeroes when using compiler v12. I tried this same code on GCC, Microsoft and Intel v11. They all give slightly different results, but not all zeroes.
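The point about summation order can be illustrated with a small sketch, assuming IEEE 754 double arithmetic and no value-changing optimizations such as -ffast-math. The function name sum_in_order and the sample values are made up for the illustration.

```c
#include <stddef.h>

/* Add the elements strictly from left to right. */
double sum_in_order(const double *v, size_t n)
{
    size_t i;
    double s = 0.0;

    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}
```

Summing { 1e20, 1.0, -1e20 } in that order gives 0.0, because the 1.0 is lost when it is added to 1e20, while summing the same three numbers as { 1e20, -1e20, 1.0 } gives 1.0. A parallel reduction or a reordering optimization can change the order in the same way, which explains small discrepancies but not a vector that stays entirely zero.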
mecej4
Black Belt

Perhaps I was too brief in my previous post, not being aware of the significance of the counts in your application. However, this is what I found, which is different from what you stated about Microsoft C.
[bash]s:\lang\Italo>cl /Od /openmp test.c /Fetest
Microsoft  C/C++ Optimizing Compiler Version 16.00.30319.01 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

test.c
Microsoft  Incremental Linker Version 10.00.30319.01
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:test.exe
test.obj

s:\lang\Italo>.\test
calculating with openmp
zeroes: 1926     non_zeroes: 0
calculating without openmp
zeroes: 1926     non_zeroes: 0
[/bash]
italo1337
Beginner

Strange. Are you loading the data file?

So, if Microsoft's compiler is correct, then Intel v11, with or without OpenMP, and v12 without OpenMP are wrong?