Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.
29277 Discussions

Use of IEEE exception flags in OpenMP parallel region

eos_pengwern
Beginner
1,341 Views

I'd like to use the following code:

[fortran]!$OMP PARALLEL DEFAULT(SHARED) PRIVATE(i,j, exception_found) !$OMP DO do j=1,m ! There is a risk that rounding errors may lead to an occasional negative ! value for the expression inside the square root; we look for the IEEE ! invalid flag to intercept and correct this case on the relatively rare ! occasions when it occurs: call ieee_set_flag(ieee_invalid, .false.) do i=1,n cos_phi(i,j,k) = sqrt(1d0 - sin_phi(i,j,k)**2) end do call ieee_get_flag(ieee_invalid, exception_found) if (exception_found) then do i=1,n if (sin_phi(i,j,k).gt.1d0) cos_phi(i,j,k) = 0d0 end do end if end do !$OMP END DO !$OMP END PARALLEL[/fortran]

However, I'm unsure of exactly how the IEEE exception flags are implemented: is there one set of flags per thread, or just one for the whole processor? If the latter, then an exception occurring in one thread will trigger unnecessary actions in all the other threads, which could quickly become very inefficient.

Does anyone know?

0 Kudos
1 Solution
jimdempseyatthecove
Honored Contributor III
1,341 Views
>>I meant one matrix element in 10,000, rather than one run through the code

That wasn't clear in the comments (at least when I first read it).

>>so in a typical case where n x m >> 10,000

Then the probability of error is ~1.0
When an error occurs, one of the threads will run through the corrector loop, which is only correcting cos_phi, it should also correct sin_phi. The thread(s) that run the repair loop will also delay the completion of the parallel region

Therefore the likely candidate for best performance is to assure sin_phi(i,j,k) is .le. 1.0d0 and .ge. -1.0d0.

I cannot guess as to if using a seperate loop conditioning sin_phi would be faster than incorporating MAX in the cos_phi calculation. However, when you generate sin_phi(i,j,k) if you do this conditioning, you won't need it in the cos_phi calculation loop (but you may also need to condition cos_phi to .le. 1.0d0 and .ge. -1.0d0.

GI-GO (Garbage In, Garbage Out)

When you create these tables for use by your application, it is your responsibility to assure that values are correct enough to no propagate unsuspecting errors later in your code.

Jim Dempsey

View solution in original post

0 Kudos
9 Replies
jimdempseyatthecove
Honored Contributor III
1,341 Views
Wouldn't it be better for your program to assure that sin_phi(i,j,k) is .le. 1.0d0?
IOW catch the error when producing sin_phi(i,j,k) as opposed to each place it is used?

Jim Dempsey
0 Kudos
eos_pengwern
Beginner
1,341 Views
The thing is, in my application, rounding errors where sin_phi(i,j,k).gt.1d0 are very rare - maybe one in 10,000 or less - and are of no consequence in any context other than the calculation of cos_phi. Checking every sin_phi value at source would add a fair bit of overhead, whilst by looking for IEEE exceptions I can (in theory)reduce the overhead by only handling the cases where there is actually a problem.
0 Kudos
jimdempseyatthecove
Honored Contributor III
1,341 Views
Your exception handler (code you provide and hooked using IEEE_HANDLER) can save the exception codes into Thread Local Storage (as opposed to, or in addition to, in global memory). If you do this then you can have a thread-by-thread detection of floating point errors.

Jim Dempsey
0 Kudos
jimdempseyatthecove
Honored Contributor III
1,341 Views
Consider:

[fortran]do ! There is a risk that rounding errors may lead to an occasional negative ! value for the expression inside the square root; we look for the IEEE ! invalid flag to intercept and correct this case on the relatively rare ! (less than 1 i 10,000) occasions when it occurs: call ieee_set_flag(ieee_invalid, .false.) !$OMP PARALLEL DEFAULT(SHARED) PRIVATE(i,j) !$OMP DO do j=1,m do i=1,n cos_phi(i,j,k) = sqrt(1d0 - sin_phi(i,j,k)**2) end do end do !$OMP END DO !$OMP END PARALLEL call ieee_get_flag(ieee_invalid, exception_found) if(.not. exception_found) exit !$OMP PARALLEL DEFAULT(SHARED) PRIVATE(i,j) !$OMP DO do j=1,m do i=1,n if (sin_phi(i,j,k).gt.1d0) sin_phi(i,j,k) = 1.0d0 end do end do !$OMP END DO !$OMP END PARALLEL end do[/fortran] Jim Dempsey
0 Kudos
eos_pengwern
Beginner
1,341 Views
Thank you Jim.

With respect to your mail at 25 minutes past the hour, isn't this what is accomplished by declaring 'exception_found' as a private variable in my original example?

The code in your mail at 37 minutes past the hour wouldn't work so well; when I said '1 time in 10,000' I meant one matrix element in 10,000, rather than one run through the code, so in a typical case where n x m >> 10,000 I'd just be running the second part of the code every single time. What I'd like to do is only to run through the i-loop once when a problem is found.
0 Kudos
jimdempseyatthecove
Honored Contributor III
1,342 Views
>>I meant one matrix element in 10,000, rather than one run through the code

That wasn't clear in the comments (at least when I first read it).

>>so in a typical case where n x m >> 10,000

Then the probability of error is ~1.0
When an error occurs, one of the threads will run through the corrector loop, which is only correcting cos_phi, it should also correct sin_phi. The thread(s) that run the repair loop will also delay the completion of the parallel region

Therefore the likely candidate for best performance is to assure sin_phi(i,j,k) is .le. 1.0d0 and .ge. -1.0d0.

I cannot guess as to if using a seperate loop conditioning sin_phi would be faster than incorporating MAX in the cos_phi calculation. However, when you generate sin_phi(i,j,k) if you do this conditioning, you won't need it in the cos_phi calculation loop (but you may also need to condition cos_phi to .le. 1.0d0 and .ge. -1.0d0.

GI-GO (Garbage In, Garbage Out)

When you create these tables for use by your application, it is your responsibility to assure that values are correct enough to no propagate unsuspecting errors later in your code.

Jim Dempsey
0 Kudos
eos_pengwern
Beginner
1,341 Views
Thanks again Jim.

I'm probably just trying to be too clever here, by trying to use IEEE exception handling when it's not necessarily the appropriate tool for the job.

For future reference, I'd still be curious to know how the exception handling is implemented when used in conjunction with OpenMP. I haven't been able to find any documentation on this, anywhere.
0 Kudos
jimdempseyatthecove
Honored Contributor III
1,341 Views
I covered this earlier.

You use the IEEE_HANDLER function to establish (set) an (your) external signal handling routine. I believe the handler address is global (but can easily be tested to verify this assumption). Therefor, any thread causing an exception will cause this subroutine to be called in the context of the thread causing the exception. You can put any code you want into this subroutine including writing the error conditions into Thread Local Storage. (You also need to write the updates to the global variables in a thread-safe manner. **) Your error detection routine, as used by each thread, can observe the error conditions as set in Thread Local Storage (as opposed to global error flags).

Your handler can store the error flags in both thread local and application global locations. In loops where you want all threads to bail-out you monitor the global flags, in loops such as your cos_phi you might want to monitor only thread local flags - your choice.

*** Note, the IEEE_HANDLER example shows CALL ABORT. You need not ABORT. When the error is not of abort type (e.g. not IEE_DIVIED_BY_ZERO). When not of abort type. you can set your indicator flag(s), using an atomic IOR for global or normal IOR for Thread Local Storage, then RETURN.

Jim Dempsey
0 Kudos
eos_pengwern
Beginner
1,341 Views
Aha, I see, I'd missed the significance of defining the IEEE_handler routine. Thank you.
0 Kudos
Reply