Hello,
We have found an issue with ICC version 12.x involving OpenMP parallel for with the collapse clause. When compiling with -O2 or -O3, the results are not what we expect. However, the code works fine with GCC -O3, and also with ICC at -O0. In addition, we have found that hard-coding the limits of the outer loop with ICC -O3 (instead of holding the limits in variables) leads to a segmentation fault. We don't know where to report this compiler bug. We have already found another thread reporting a similar problem with the Fortran compiler (http://software.intel.com/en-us/forums/showthread.php?t=83135&wapkw=unexpected+behavior+for+openmp+collapse+clause).
regards
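The original code was not posted, so the following is only a minimal sketch of the kind of pattern described: a `parallel for` with `collapse(2)` whose loop limits are held in variables, checked against a serial version. The function names (`fill_collapsed`, `fill_serial`, `count_mismatches`) and the array layout are assumptions made for illustration, not the poster's code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical kernel: a[i*m + j] = i*m + j, with the two loops collapsed.
   n and m are runtime variables, as in the variable-limits case above.
   i and j are the iteration variables of the collapsed loops, so OpenMP
   treats both as private. */
void fill_collapsed(long *a, int n, int m)
{
    int i, j;
#pragma omp parallel for collapse(2)
    for (i = 0; i < n; i++)
        for (j = 0; j < m; j++)
            a[(size_t)i * m + j] = (long)i * m + j;
}

/* Serial reference with the same loop structure and no OpenMP. */
void fill_serial(long *a, int n, int m)
{
    int i, j;
    for (i = 0; i < n; i++)
        for (j = 0; j < m; j++)
            a[(size_t)i * m + j] = (long)i * m + j;
}

/* Returns the number of elements that differ between the two versions,
   or -1 if memory allocation fails. */
int count_mismatches(int n, int m)
{
    size_t total = (size_t)n * (size_t)m, idx;
    int bad = 0;
    long *par = malloc(total * sizeof *par);
    long *ref = malloc(total * sizeof *ref);
    if (par == NULL || ref == NULL) {
        free(par);
        free(ref);
        return -1;
    }
    fill_collapsed(par, n, m);
    fill_serial(ref, n, m);
    for (idx = 0; idx < total; idx++)
        if (par[idx] != ref[idx])
            bad++;
    free(par);
    free(ref);
    return bad;
}
```

Calling `count_mismatches(n, m)` from a small `main()` and printing the result gives a reproducer that can be built at different optimization levels, for example with `icc -openmp -O0`, `icc -openmp -O3`, and `gcc -fopenmp -O3`. A nonzero count in only one of those builds would be the kind of evidence worth attaching to a problem report.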
3 Replies
For a segmentation fault, you should check your user stack limit setting, and, for a large case, your thread stack limit (default 2MB for 32-bit, 4MB for 64-bit, adjustable via KMP_STACKSIZE).
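As a rough sketch of that check done from inside the program, the snippet below reads the soft stack limit with the POSIX `getrlimit` call and reports whether `KMP_STACKSIZE` is set in the environment. The helper names `get_stack_soft_limit` and `print_stack_settings` are made up for this example, and it assumes a POSIX system such as Linux.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Stores the soft limit on the user stack in *out.
   Returns 0 on success, -1 if getrlimit fails.
   A value of RLIM_INFINITY means the stack is unlimited. */
int get_stack_soft_limit(rlim_t *out)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}

/* Prints the user stack limit and the KMP_STACKSIZE environment variable. */
void print_stack_settings(void)
{
    rlim_t lim;
    const char *kmp = getenv("KMP_STACKSIZE");

    if (get_stack_soft_limit(&lim) != 0)
        printf("user stack limit: (getrlimit failed)\n");
    else if (lim == RLIM_INFINITY)
        printf("user stack limit: unlimited\n");
    else
        printf("user stack limit: %llu bytes\n", (unsigned long long)lim);

    printf("KMP_STACKSIZE: %s\n", kmp ? kmp : "(unset, runtime default applies)");
}
```

Calling `print_stack_settings()` at the start of `main()` shows the settings the failing run actually saw, which can differ from those of an interactive shell.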
The best way to report this is by filing a problem report with (if possible) a small reproducer on premier.intel.com. Registering your license automatically creates your support account. If you didn't register it, you can do so at https://registrationcenter.intel.com.
I have a stale bug report filed about a failure of collapse. Recently, one of my customers had some success with collapse, even in a case where the outer loop count isn't fixed at compile time and there is a vectorizable dot product inside the 2 outer collapsed loops.
Hi,
Yes, it would be better if you could provide us with a test case that reproduces the issue. Also, could you please check whether adding "-no-vec" solves the problem when compiling with -O2 or -O3 with icc?
e.g.: icc -O2 simple.c -no-vec
Thanks & Regards,
Sukruth H.V
There may be an implicit assumption here that your case auto-vectorizes with icc but that you didn't attempt vectorization with gcc.
In the case of 2 nested loops with the inner one vectorizable, collapse would rarely be useful, and the compiler might have difficulty with it.
Needless to say, it's important for collapse to work in the case of 2 outer collapsible loops and a 3rd inner vectorized loop. There is a possibility of difficulty with a total of 3 nested loops, if the compiler attempts unroll-and-jam on the 2 inner loops when you request collapse on the 2 outer loops. I don't expect icc to attempt that, but I don't know how to control it.
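To make the 3-loop case concrete, here is a minimal sketch, under my own assumptions about the data layout, of 2 outer loops marked with `collapse(2)` and a 3rd inner loop that is a plain dot product the compiler may choose to vectorize. The function name `row_dots_collapsed` and the row-major arrays are illustrative only and are not taken from any code in this thread.

```c
#include <stddef.h>

/* Hypothetical kernel: c[i*m + j] = dot(row i of a, row j of b),
   where a is n-by-k and b is m-by-k, both stored row-major.
   The two outer loops are collapsed and shared among threads;
   the inner loop over p is left to the compiler's vectorizer. */
void row_dots_collapsed(const double *a, const double *b, double *c,
                        int n, int m, int k)
{
    int i, j;
#pragma omp parallel for collapse(2)
    for (i = 0; i < n; i++)
        for (j = 0; j < m; j++) {
            /* sum and p are declared inside the collapsed body,
               so each iteration has its own copies. */
            double sum = 0.0;
            int p;
            for (p = 0; p < k; p++)
                sum += a[(size_t)i * k + p] * b[(size_t)j * k + p];
            c[(size_t)i * m + j] = sum;
        }
}
```

Comparing the output of a kernel like this with and without `-no-vec`, and with and without the `collapse(2)` clause, is one way to narrow down whether the vectorizer or the collapse transformation is involved in a wrong result.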