ifort 18 and 19 produce wrong results for the following code when SSE4.2 vectorization is enabled.
When compiled with AVX, it produces the correct results.
Previous versions of the compiler seem to produce the correct result for this code with SSE4.2.
The outputs for the different compiler options are shown below. The executable runs on an Intel Xeon CPU E3-1240 v3 @ 3.40GHz.
corr.f
(removed by customer request - @gn164 , let me know if this is what you wanted. thanks! Mary T. intel.community@intel.com)
main.f
(removed by customer request)
$INTEL16_HOME/bin/ifort -O3 -xSSE4.2 -o sseTest main.f corr.f
67.00000 67.00000 67.00000 67.00000 67.00000
67.00000 67.00000 67.00000 67.00000 67.00000
67.00000
$INTEL19_HOME/bin/ifort -O3 -xSSE4.2 -o sseTest main.f corr.f
67.00000 67.00000 67.00000 67.00000 67.00000
67.00000 67.00000 59.00000 67.00000 59.00000
59.00000
$INTEL19_HOME/bin/ifort -O3 -xAVX -o sseTest main.f corr.f
(removed by customer request)
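To make the comparison above repeatable, the outputs of two builds can be checked against each other in a script. This is only a minimal sketch: the helper name `same_output` and the executable name `avxTest` are hypothetical, and it assumes each build prints its results to standard output as in the listings above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): run two builds of the
# same program and report whether their standard output is identical.
same_output() {
    # $1 and $2 are commands or paths to executables built with different options.
    out1=$("$1") || return 2   # exit status 2: the first program failed to run
    out2=$("$2") || return 2   # exit status 2: the second program failed to run
    if [ "$out1" = "$out2" ]; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Example use (assumes ifort is on PATH and main.f / corr.f are present):
#   ifort -O3 -xSSE4.2 -o sseTest main.f corr.f
#   ifort -O3 -xAVX    -o avxTest main.f corr.f
#   same_output ./sseTest ./avxTest
```

A textual comparison is enough here because the wrong values (59.00000 instead of 67.00000) differ visibly in the printed output.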
This bug is also seen with 18.0.5.274 and 2021.1.1.216 on Windows, with /QxSSE4.2 /O3.
The bug is not seen with 16.0.8.254 on Windows, using the same options.
This issue occurs only at O3, and only with SSE4.1 or SSE4.2, with the 18.x, 19.0.x, and 19.1.x compilers. I entered a bug report, CMPLRIL0-32474.
Another interesting tidbit: if you combine corr.f and main.f into one source file, the error goes away!
We'll get working on a fix.
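For anyone who wants to check which combinations are affected on their own installation, the optimization levels and instruction sets mentioned above can be enumerated as a small build matrix. This is only a sketch: the function name `print_build_matrix` and the output names such as `test_O3_SSE4.2` are hypothetical, and the script prints the ifort commands rather than running them.

```shell
#!/bin/sh
# Print (do not run) one ifort command per optimization level / ISA pair.
# The executable names (test_O3_SSE4.2, ...) are made up for illustration.
print_build_matrix() {
    for opt in O2 O3; do
        for isa in SSE4.1 SSE4.2 AVX; do
            echo "ifort -$opt -x$isa -o test_${opt}_${isa} main.f corr.f"
        done
    done
}

print_build_matrix
```

Piping the output to `sh` (with ifort on PATH and both source files in the current directory) would perform the six builds; per the description above, only the O3 builds with SSE4.1 or SSE4.2 would be expected to show the wrong values.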
Hi Ronald,
Thanks. The error also goes away if the array sizes and strides that are passed to the function are instead defined within the function scope.
I assume the generated code differs when these values are known to the vectorizer, and that combining the functions in one file also makes them visible, if some interprocedural optimization is applied by default within a translation unit.