I have been able to reproduce the issue in this small example:
[cpp]
#include <iostream>
#include <stdexcept>
#include <omp.h>

#define USE_CRITICAL 0

// Throws for every fourth i when j is even, so the caller has to retry.
int doSomething(int i, int j)
{
    if (i % 4 == 0 && j % 2 == 0) {
        throw std::runtime_error("Error");
    }
    return i + j;
}

int main()
{
    int itCount = 500;
    int total = 0;

    #pragma omp parallel
    {
        int myTotal = 0;

        #pragma omp for
        for (int i = 0; i < itCount; i++) {
            bool success = false;
            int j = 0;
            while (!success) {
                try {
                    // This should never be true inside the worksharing loop.
                    if (i >= itCount) {
#if USE_CRITICAL
                        #pragma omp critical
#endif
                        {
                            std::cout << "Something wrong: thread " << omp_get_thread_num();
                            std::cout << " running iteration " << i << std::endl;
                        }
                    }
                    myTotal += doSomething(i, j);
                    success = true;
                    ++j;
                } catch (std::exception &) {
                    ++j;
                }
            }
        }

        #pragma omp atomic
        total += myTotal;
    }

    std::cout << "Total = " << total << std::endl;
    return 0;
}
[/cpp]
As you can see, exceptions are raised and caught within the worksharing construct, which should be fine according to the OpenMP specification. However, the code does not work as expected. In fact, I face two different problems depending on the value of USE_CRITICAL. If USE_CRITICAL == 0, the threads get stuck in an endless loop. If USE_CRITICAL == 1, the process crashes.
I attach the Visual Studio 2010 solution and the Linux makefile.
Regards,
Andrea
I was able to reproduce this issue and we are currently working with the Development Engineers to get it checked out. Will let you know if I can come up with some workaround for this issue. Thanks for bringing it to our notice.
Anoop
As an experiment, try:
[cpp]
#if USE_CRITICAL
if (i >= itCount) {
    #pragma omp critical
    {
        std::cout << "Something wrong: thread " << omp_get_thread_num();
        std::cout << " running iteration " << i << std::endl;
    }
}
#else
if (i >= itCount) {
    std::cout << "Something wrong: thread " << omp_get_thread_num();
    std::cout << " running iteration " << i << std::endl;
}
#endif
[/cpp]
If this corrects the problem it will give Intel support a starting point for looking for the bug.
Jim Dempsey
Dear Jim, your code does not correct the problem: still crashing if USE_CRITICAL == 1, stuck in an endless loop otherwise.
Ahh...
This is indicative that the catch circumvented the barrier.
See if you can place the try/catch inside a scope that is not the scope of the statement of the #pragma omp parallel... You may need to insert something innocuous so that the compiler optimizations do not remove what appears to be dead code:
#pragma omp parallel for
for (int i = 0; i < n; ++i)
{ // parallel for scope
    int dummy = 0;
    { // unnecessary scope
        try {
            ...
        } catch (...) {
            ...
            ++dummy;
        } // end catch
    } // end unnecessary scope
    if (dummy < 0) CanntHappen(); // unless you have > 2G threads with errors
} // end for
You can fix the syntax. What you are experiencing appears to be a compiler error. The above may be a work around.
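A related variation on the idea above, offered only as a sketch: move the try/catch out of the loop body entirely and into a small helper function, so the worksharing loop itself contains no exception-handling code. The names `tryDoSomething` and `computeTotal` are hypothetical and not from the thread. Whether this avoids the compiler problem is an assumption that has not been checked against the affected compiler, and inlining of the helper could defeat it.

```cpp
#include <stdexcept>

// Same test function as in the reproducer at the top of the thread.
int doSomething(int i, int j)
{
    if (i % 4 == 0 && j % 2 == 0) {
        throw std::runtime_error("Error");
    }
    return i + j;
}

// Hypothetical helper: catches the exception in its own function scope and
// reports success or failure through the return value.
bool tryDoSomething(int i, int j, int &result)
{
    try {
        result = doSomething(i, j);
        return true;
    } catch (std::exception &) {
        return false;
    }
}

// Hypothetical wrapper around the parallel region, so the result can be checked.
int computeTotal(int itCount)
{
    int total = 0;

    #pragma omp parallel
    {
        int myTotal = 0;

        #pragma omp for
        for (int i = 0; i < itCount; i++) {
            int value = 0;
            int j = 0;
            // Retry with the next j until the call succeeds.
            while (!tryDoSomething(i, j, value)) {
                ++j;
            }
            myTotal += value;
        }

        #pragma omp atomic
        total += myTotal;
    }

    return total;
}
```

With the reproducer's iteration count, `computeTotal(500)` should return 124875: the sum of 0..499 (124750) plus one extra for each of the 125 multiples of 4, which only succeed with j == 1.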
Jim Dempsey
To me, it looks like executing the catch block makes the conditional jump of the for loop always not-taken. The issue is likely to be a bit more complicated, given that introducing the critical region makes the application crash. Maybe the stack gets corrupted while entering or exiting the catch block.
After figuring out it was a compiler issue, I used a different workaround from the one you proposed, i.e. I just avoided using the worksharing construct and split the iteration count myself. This code works as expected, but is not suitable if you need dynamic scheduling.
[cpp]
#pragma omp parallel
{
    int id = omp_get_thread_num();
    int threadIt = itCount / omp_get_num_threads();  // iterations every thread gets
    int extraIt = itCount % omp_get_num_threads();   // leftovers, one each for the first threads
    int myIt = threadIt + (id < extraIt ? 1 : 0);
    int first = id * threadIt + std::min(id, extraIt);  // std::min requires <algorithm>

    // NO WORKSHARING CONSTRUCT HERE
    for (int i = first; i < first + myIt; ++i) {
        ...
        while (!...) {
            try {
                ...
            } catch (std::exception &) {
                ...
            }
        }
    }
}
[/cpp]
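If dynamic scheduling is needed without a worksharing construct, one possible approach is to let each thread claim the next unprocessed iteration from a shared counter. This is only a sketch under assumptions: `computeTotalDynamic` and `next` are hypothetical names, `#pragma omp atomic capture` requires OpenMP 3.1 (a `#pragma omp critical` around the increment would be the older alternative), and it has not been tested against the compiler version discussed here.

```cpp
#include <stdexcept>

// Same test function as in the reproducer at the top of the thread.
int doSomething(int i, int j)
{
    if (i % 4 == 0 && j % 2 == 0) {
        throw std::runtime_error("Error");
    }
    return i + j;
}

// Hypothetical sketch: hand-rolled dynamic scheduling with chunk size 1.
int computeTotalDynamic(int itCount)
{
    int total = 0;
    int next = 0;  // shared: index of the next unclaimed iteration

    #pragma omp parallel
    {
        int myTotal = 0;

        // Plain loop, not a worksharing construct.
        for (;;) {
            int i;
            // Atomically read the counter and advance it.
            #pragma omp atomic capture
            i = next++;

            if (i >= itCount) {
                break;  // nothing left to claim
            }

            bool success = false;
            int j = 0;
            while (!success) {
                try {
                    myTotal += doSomething(i, j);
                    success = true;
                } catch (std::exception &) {
                    ++j;
                }
            }
        }

        #pragma omp atomic
        total += myTotal;
    }

    return total;
}
```

Each iteration is claimed exactly once, so the result does not depend on how many threads run; the cost is one atomic operation per iteration, which claiming several indices at a time would reduce.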
[cpp]
template <typename T>
struct qtSlice
{
    T iBegin;
    T iEnd;
    T jBegin;
    T jEnd;
    T kBegin;
    T kEnd;

    qtSlice(T iThread, T nThreads, T _iBegin, T _iEnd)
    {
        T iStride = (_iEnd - _iBegin + nThreads - 1) / nThreads;
        if (iStride == 0)
            iStride = 1;
        iBegin = iStride * iThread + _iBegin;
        iEnd = iBegin + iStride;
        if (iEnd > _iEnd)
            iEnd = _iEnd;
        jBegin = jEnd = kBegin = kEnd = 0;
    }

    qtSlice(T iThread, T nThreads, T _iBegin, T _iEnd, T _jBegin, T _jEnd)
    {
        if (nThreads > 2) {
            if (iThread < nThreads/2) {
                // take the first half
                qtSlice slice(0, 2, _iBegin, _iEnd, _jBegin, _jEnd);
                qtSlice sliceOfSlice(iThread, nThreads/2,
                                     slice.iBegin, slice.iEnd, slice.jBegin, slice.jEnd);
                iBegin = sliceOfSlice.iBegin;
                iEnd = sliceOfSlice.iEnd;
                jBegin = sliceOfSlice.jBegin;
                jEnd = sliceOfSlice.jEnd;
                return;
            }
            // take the second half
            qtSlice slice(1, 2, _iBegin, _iEnd, _jBegin, _jEnd);
            qtSlice sliceOfSlice(iThread - nThreads/2, nThreads - nThreads/2,
                                 slice.iBegin, slice.iEnd, slice.jBegin, slice.jEnd);
            iBegin = sliceOfSlice.iBegin;
            iEnd = sliceOfSlice.iEnd;
            jBegin = sliceOfSlice.jBegin;
            jEnd = sliceOfSlice.jEnd;
            return;
        } // if(nThreads > 2)

        iBegin = _iBegin;
        iEnd = _iEnd;
        jBegin = _jBegin;
        jEnd = _jEnd;
        if (nThreads == 1)
            return;

        T ni = iEnd - iBegin; // number of i
        T nj = jEnd - jBegin; // number of j
        T aij = ni * nj;      // area of ij
        if (aij == 0)
            return; // empty area

        // try even split across one of the dimensions
        if (ni >= nj) {
            if ((ni >= nThreads) && (ni % nThreads == 0)) {
                T si = ni/nThreads;
                iBegin = iBegin + si * iThread;
                iEnd = iBegin + si;
                if (iEnd > _iEnd)
                    iEnd = _iEnd;
                return;
            }
        }
        if ((nj >= nThreads) && (nj % nThreads == 0)) {
            T sj = nj/nThreads;
            jBegin = jBegin + sj * iThread;
            jEnd = jBegin + sj;
            if (jEnd > _jEnd)
                jEnd = _jEnd;
            return;
        }
        if (ni >= nj) {
            T si = (ni + nThreads - 1)/nThreads;
            iBegin = iBegin + si * iThread;
            iEnd = iBegin + si;
            if (iEnd > _iEnd)
                iEnd = _iEnd;
            return;
        }
        T sj = (nj + nThreads - 1)/nThreads;
        jBegin = jBegin + sj * iThread;
        jEnd = jBegin + sj;
        if (jEnd > _jEnd)
            jEnd = _jEnd;
        return;
    }
};
[/cpp]
...
qtSlice<int> mySlice(omp_get_thread_num(), omp_get_num_threads(), iBegin, iEnd);
for (int i = mySlice.iBegin; i < mySlice.iEnd; ++i)
{ ...
Jim Dempsey
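To make the arithmetic of the one-dimensional split above easier to follow, here is a minimal stand-alone sketch of the same rounded-up stride calculation. `Range` and `sliceRange` are hypothetical names, not part of the posted code, and unlike the posted constructor this version also clamps the start index, so threads left without work get an empty range rather than one whose begin lies past its end.

```cpp
#include <algorithm>

// Hypothetical half-open range [begin, end) owned by one thread.
struct Range {
    int begin;
    int end;
};

// Split [begin, end) into nThreads contiguous slices using a rounded-up
// stride, as the 1D qtSlice constructor does.
Range sliceRange(int iThread, int nThreads, int begin, int end)
{
    int stride = (end - begin + nThreads - 1) / nThreads;  // ceiling division
    if (stride == 0) {
        stride = 1;
    }
    Range r;
    r.begin = std::min(begin + stride * iThread, end);  // clamp: differs from qtSlice
    r.end = std::min(r.begin + stride, end);
    return r;
}
```

For 500 iterations on 8 threads the stride is 63, so threads 0 to 6 each take 63 iterations and thread 7 takes the remaining 59, i.e. [441, 500). The last thread can therefore get noticeably less work than the others, which is the trade-off against the even split with distributed remainders used earlier in the thread.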
Andrea
The Development Engineers are currently working on the fix. The next update for Intel C++ Composer will be launched tomorrow, so the fix for this bug won't make it into this update. I will keep you posted on when you can expect the fix. Thanks for the follow-up.
Thanks and Regards
Anoop
The fix is provided in Intel C++ Compiler 12.1 Update 7 (the latest version). You can download it from the Intel Registration Center (https://registrationcenter.intel.com/regcenter/register.aspx). Please let us know if you have any issues.
Thanks and Regards
Anoop
