Beginner

Compiler doesn't serialize calls in OpenMP SIMD loop properly in some (Windows / complex) cases

Description:

When a function from the `<complex>` header is called in a SIMD loop, the Intel compiler uses the implementation provided by Microsoft. These functions are not declared SIMD, and the optimization report shows that some of the calls are serialized.

Stranger still, the issue occurs only when the `std::complex` object is constructed right before the function is called on it; see the attached minimal example. The example uses `std::exp(const std::complex&)`; I originally observed the problem when calculating something like `std::exp(std::complex<float>{0.0, array})` in a SIMD loop. When I removed the function call completely, the bug disappeared; when I substituted a different function (`std::sin(const std::complex&)`), it occurred again.
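For readers without the attachment, the pattern described above might look roughly like the sketch below. This is a hypothetical reconstruction, not the attached `exptest.cpp`: the names `exp_reference`, `exp_simd_construct` and `max_relative_error` are invented for illustration, and `double` is assumed as the element type.

```cpp
#include <algorithm>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical reconstruction; the real attachment is not shown in this thread.
// Reference version: plain scalar loop without any SIMD pragma.
std::vector<std::complex<double>> exp_reference(const std::vector<double>& phase) {
    std::vector<std::complex<double>> out(phase.size());
    for (std::size_t i = 0; i < phase.size(); ++i) {
        out[i] = std::exp(std::complex<double>{0.0, phase[i]});
    }
    return out;
}

// Suspect pattern: the complex value is constructed inside the SIMD loop,
// immediately before std::exp is called on it.
std::vector<std::complex<double>> exp_simd_construct(const std::vector<double>& phase) {
    std::vector<std::complex<double>> out(phase.size());
    const double* in = phase.data();
    std::complex<double>* res = out.data();
    const std::size_t n = phase.size();
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        const std::complex<double> z{0.0, in[i]};
        res[i] = std::exp(z);
    }
    return out;
}

// Largest relative deviation of `value` from `reference`, element by element.
// Safe for this use because |exp(i*x)| is 1, so the divisor is never zero.
double max_relative_error(const std::vector<std::complex<double>>& value,
                          const std::vector<std::complex<double>>& reference) {
    assert(value.size() == reference.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < value.size(); ++i) {
        worst = std::max(worst, std::abs(value[i] - reference[i]) / std::abs(reference[i]));
    }
    return worst;
}
```

On a compiler that handles the loop correctly, `max_relative_error(exp_simd_construct(p), exp_reference(p))` should stay near the precision of `double`; an error close to 2, as in the output below, would mean the results are wrong rather than merely rounded differently.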

The bug is Windows-specific and I wasn't able to reproduce it on GNU/Linux.

Compilation of the attached example:

icl /Qx:AVX /Qopenmp-simd /Qopt-report=3 exptest.cpp -o exptest.exe

Example output:

Copy + exp error:       0
SIMD exp error:         3.81745e-16
SIMD copy + exp error:  1.99676
Expanded formula error: 6.14074e-16

Environment:

Intel C++ Compiler Version 19.1.0.166 (but you can obtain the same results with 19.0 versions of the compiler)
Visual Studio 2017 Version 15.9.15 (Visual Studio 2019 was also tested at some point but I cannot remember the results)

 

6 Replies
Moderator

Hi,

We are getting the same output on Windows 10. We are checking with the relevant team and will get back to you.

Moderator

Hi Jakub, What are the expected outputs on your test case?

Beginner

Viet Hoang (Intel) wrote:

Hi Jakub, What are the expected outputs on your test case?

Hi,

All the functions in my example calculate the same thing (two of them differ only in the pragmas used). I therefore expect errors on the order of the precision of the datatype used: because of floating-point non-associativity and algorithm differences, around 10^-15 should be acceptable for double-precision numbers.
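To make that expectation concrete, here is a minimal sketch of an acceptance check expressed in units of machine epsilon (about 2.2e-16 for `double`). It assumes the "expanded formula" in the test output means Euler's formula, exp(i*x) = cos(x) + i*sin(x); that is a guess, and the names `exp_expanded`, `relative_error_in_epsilons` and `within_tolerance` are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <limits>

// Assumed "expanded formula": exp(i*x) = cos(x) + i*sin(x) (Euler's formula).
std::complex<double> exp_expanded(double x) {
    return {std::cos(x), std::sin(x)};
}

// Relative error of `value` against `reference`, in multiples of the
// machine epsilon of double. The reference must be non-zero.
double relative_error_in_epsilons(std::complex<double> value,
                                  std::complex<double> reference) {
    assert(std::abs(reference) > 0.0);
    const double relative = std::abs(value - reference) / std::abs(reference);
    return relative / std::numeric_limits<double>::epsilon();
}

// True when `value` agrees with `reference` to within `max_epsilons`
// multiples of machine epsilon.
bool within_tolerance(std::complex<double> value,
                      std::complex<double> reference,
                      double max_epsilons) {
    return relative_error_in_epsilons(value, reference) <= max_epsilons;
}
```

With such a check, errors like 3.8e-16 or 6.1e-16 amount to a few epsilons and pass, while an error of about 2 fails by roughly sixteen orders of magnitude.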

Moderator

Before digging into it further, here are the results of icl vs. cl.

 

C:\Temp>icl  /Qopenmp-simd exptest.cpp -o exptest.exe
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.0.166 Build 20191121
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.

exptest.cpp
Microsoft (R) Incremental Linker Version 14.16.27032.1
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:exptest.exe
exptest.obj

C:\Temp>exptest.exe
Copy + exp error:       0
SIMD exp error:         3.52271e-16
SIMD copy + exp error:  1.99703
Expanded formula error: 5.56892e-16

C:\Temp>cl  /openmp-simd exptest.cpp -o exptest.exe
Microsoft (R) C/C++ Optimizing Compiler Version 19.16.27032.1 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

cl : Command line warning D9035 : option 'o' has been deprecated and will be removed in a future release
cl : Command line warning D9035 : option 'o' has been deprecated and will be removed in a future release
exptest.cpp
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Tools\MSVC\14.16.27023\include\xlocale(319): warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
exptest.cpp(30): warning C4068: unknown pragma
exptest.cpp(43): warning C4068: unknown pragma
Microsoft (R) Incremental Linker Version 14.16.27032.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:exptest.exe
/out:penmp-simd.exe
/out:exptest.exe
exptest.obj

C:\Temp>exptest.exe
Copy + exp error:       0
SIMD exp error:         0
SIMD copy + exp error:  0
Expanded formula error: 4.96896e-11

C:\Temp>

Do you think cl gives the expected results?

Beginner

I'm not sure the comparison is fair. I suspect that the Microsoft compiler didn't use OpenMP at all in that case. Zero relative error means that the operations were not reordered at all, and there is no `/openmp-simd` compiler flag (at least no documented one); the `warning C4068: unknown pragma` lines and the `/out:penmp-simd.exe` linker argument in your output suggest that the flag was parsed as `/o` and the pragmas were ignored. Moreover, OpenMP support in MSVC is quite limited (2.0), and according to the documentation SIMD directives are only available with the Visual Studio 2019 toolchain when using the `/openmp:experimental` compiler flag.

But otherwise zero relative error means it is at least working (giving proper results).
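As a small illustration of why reordering matters (a sketch only, with made-up helper names `sum_forward` and `sum_backward`, and assuming IEEE 754 double arithmetic without fast-math style options): adding the same three values in opposite orders gives different results, so a vectorized loop that reorders operations is expected to differ slightly from the scalar one.

```cpp
#include <cassert>
#include <cstddef>

// Adds v[0], v[1], ..., v[n-1] in index order.
double sum_forward(const double* v, std::size_t n) {
    assert(v != nullptr);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += v[i];
    }
    return sum;
}

// Adds the same values starting from the last element.
double sum_backward(const double* v, std::size_t n) {
    assert(v != nullptr);
    double sum = 0.0;
    for (std::size_t i = n; i > 0; --i) {
        sum += v[i - 1];
    }
    return sum;
}
```

For the values `{1e16, -1e16, 1.0}`, the forward sum cancels the large terms first and keeps the 1.0, whereas the backward sum absorbs the 1.0 into -1e16 (neighbouring doubles are 2 apart at that magnitude) and loses it.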

I get the following on GNU/Linux with GCC 9.2:

g++ exptest.cpp -fopenmp -O3 -fopt-info-vec -march=core-avx2 -o exptest

Copy + exp error:       0
SIMD exp error:         0
SIMD copy + exp error:  0
Expanded formula error: 1.60191e-16

In this case I was unable to persuade GCC not to vectorize some of the loops, so either all or none of them are vectorized (yielding the same code), and thus the errors are zero. Not a fair comparison either.

With GNU/Linux, Intel C++ Compiler 19.0.3.199 I get the following:

icpc exptest.cpp -qopenmp-simd -O3 -march=core-avx2 -o exptest

Copy + exp error:       0
SIMD exp error:         2.20764e-16
SIMD copy + exp error:  2.20764e-16
Expanded formula error: 5.02641e-16

Which is exactly what I'd expect.

Moderator

I've reported this case to the developers (CMPLRIL0-32620). Thanks.
