
OpenMP truncation error

chat1983
Beginner

Hi,
I'm quite new to OpenMP, and I found that the results of the serial and parallel versions of the following code differ in the last decimal place.

I use the Intel Fortran compiler on an Intel 64-bit machine under Fedora 10.

Can anybody please tell me how this difference can be eliminated?

----------------------------------------------------------------------------
program main
implicit none
integer i,j
double precision a(9999)

! initialisation
do i=1,9999
   a(i)=i**(.2)
enddo
----------------------------------------------------------------------------
!$omp parallel do default(shared) private(i,j)
do j=1,20
   do i=1,9999
      a(i)=a(i)+i/3.d0
   enddo
enddo
!$omp end parallel do
----------------------------------------------------------------------------

open(file='pxx',unit=30)
do i=1,9999
   write(30,*) i,a(i)
enddo

end program main
----------------------------------------------------------------------------


Here is a sample comparison (diff) of the serial and parallel output:
9959c9959
< 9959 66399.6377247174
---
> 9959 66399.6377247175
9973c9973
< 9973 66492.9728295009
---
> 9973 66492.9728295008
9980c9980
< 9980 66539.6403811772
---
> 9980 66539.6403811773
9985c9985
< 9985 66572.9743463199
---
> 9985 66572.9743463198

2 Replies
jonathandursi
Novice
If I run this, I can get answers which differ by much more than the last few places! For instance,

$ ifort -o omp omp.f90 -openmp
$ ./omp
$ mv pxx pxx.1
$ ./omp
$ mv pxx pxx.2
$ diff pxx.1 pxx.2
1c1
< 1 7.66666666666666
---
> 1 7.00000000000000
3c3
< 3 21.2457309961319
---
> 3 18.2457309961319
45,47c45,47
< 45 302.141127347946
< 46 308.817226807276
< 47 315.493163426717
---
> 45 272.141127347946
> 46 278.150560140610
> 47 268.493163426717
...

The problem here is in how you're updating the array a:
!$omp parallel do default(shared) private(i,j)
do j=1,20
   do i=1,9999
      a(i)=a(i)+i/3.d0
   enddo
enddo
!$omp end parallel do

The loop over j is being broken up into pieces and assigned to different threads; however, each thread then executes the entire 'i' loop for its share of the j iterations. This means that many different threads can potentially be updating the same a(i) at the same time - a classic example of a "race condition", one of the most common kinds of parallel programming error.
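
(As an aside: if the j-outer ordering really had to be kept, the shared update itself would have to be protected - for instance with an atomic update. The following is just a sketch of that idea; it removes the race, but serializing every update like this is usually far slower than simply restructuring the loops.)

!$omp parallel do default(shared) private(i,j)
do j=1,20
   do i=1,9999
!$omp atomic
      a(i)=a(i)+i/3.d0
   enddo
enddo
!$omp end parallel do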

In this particular example, there is a very simple solution: just flip the order of the two loops, so that the parallelized loop is the one over i:

!$omp parallel do default(shared) private(i,j)
do i=1,9999
   do j=1,20
      a(i)=a(i)+i/3.d0
   enddo
enddo
!$omp end parallel do

In this case the loop over *i* is split up over threads, so the different threads are now working on independent pieces of a. There is no longer a race condition, and the results are consistent from run to run:

$ ifort -o omp omp.f90 -openmp
$ ./omp
$ mv pxx pxx.1
$ ./omp
$ mv pxx pxx.2
$ diff pxx.1 pxx.2
$

Note, too, that running a case two times and getting the same answer both times doesn't prove there are no race conditions - some can be very subtle and only show up one time in a million. Intel has development tools available which can help you analyze your code and look for such issues.

Any time you're modifying a shared data structure, like the array a here, you must be quite sure that there is no way multiple threads can modify the same piece of it at the same time. (For just this reason, in fact, I prefer that my students not use default(shared) at all - default(none), much like implicit none in Fortran, is clearly the way to go, since it forces you to state explicitly how each variable is to be shared between threads. Here, having to list shared(a) would have made the issue very clear...)
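
For illustration, the corrected loop written with default(none) might look something like the sketch below - every variable referenced inside the parallel region now has to be given an explicit data-sharing attribute, and leaving a off the shared list would be a compile-time error rather than a silent default:

!$omp parallel do default(none) private(i,j) shared(a)
do i=1,9999
   do j=1,20
      a(i)=a(i)+i/3.d0
   enddo
enddo
!$omp end parallel do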



chat1983
Beginner
Thank you very much for your comprehensive answer. I was able to correct the error thanks to you. Many thanks for your advice as well.