Intel® Fortran Compiler

A problem with nested OpenMP parallelism and WORKSHARE

yeg001
Beginner
I encountered a "parallel workshare" problem.
While programming, I ran into a situation like the following example.
[cpp]subroutine test_nested_para(a)
!$ use omp_lib
    implicit none
    real :: a(2,2)
!$OMP PARALLEL WORKSHARE num_threads(2)
    a = 1.
!$OMP END PARALLEL WORKSHARE
end subroutine test_nested_para

program main
!$ use omp_lib
    implicit none
    real :: a(2,2)
    integer :: TID
!$OMP parallel num_threads(2) private(a)
    call test_nested_para(a)
!$OMP critical
    write(*, *) "Tid:", omp_get_thread_num()
    write(*, 100) a
100 format (2f8.3)
!$OMP end critical
!$OMP end parallel
    stop
end[/cpp]

IVF 11.0.081 for Linux gives me this output:
[cpp] Tid:           1
   0.000   0.000
   0.000   0.000
 Tid:           0
   1.000   1.000
   1.000   1.000[/cpp]

I am new to OpenMP, and I think all of the values should be 1.000.
I modified the code slightly:
[cpp]!$OMP PARALLEL num_threads(2)
!$OMP WORKSHARE
    a = 1.
!$OMP END WORKSHARE
!$OMP END PARALLEL[/cpp]
and got:
[cpp] Tid:           0
   1.000   1.000
   1.000   1.000
 Tid:           1
   1.000   1.000
   1.000   1.000[/cpp]

The problem does not seem to be related to the subroutine, but to the nested combined construct "parallel workshare". Is it a bug?
pbkenned1
Employee

It's a compiler defect with !$OMP PARALLEL WORKSHARE, since the combined form of the directive is semantically equivalent to the general form (which produces the correct output):
!$OMP PARALLEL num_threads(2)
!$OMP WORKSHARE
... Some code
!$OMP END WORKSHARE
!$OMP END PARALLEL
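
For anyone who wants to check this without comparing printed arrays by eye, here is a minimal self-checking sketch of the test case using the general (split) form. The names fill_ones, check_workshare and nbad are made up for this illustration and are not from the original test case. It assumes compilation with OpenMP enabled (for example with -openmp as used in this thread); a conforming implementation is expected to leave nbad at 0.

```fortran
! Sketch only: split PARALLEL / WORKSHARE form, called from an outer
! parallel region where the array is private to each thread.
subroutine fill_ones(a)
    implicit none
    real, intent(out) :: a(2,2)
!$OMP PARALLEL num_threads(2)
!$OMP WORKSHARE
    a = 1.
!$OMP END WORKSHARE
!$OMP END PARALLEL
end subroutine fill_ones

program check_workshare
    implicit none
    real :: a(2,2)
    integer :: nbad          ! number of threads whose copy was not filled

    nbad = 0
!$OMP PARALLEL num_threads(2) private(a) reduction(+:nbad)
    a = 0.                   ! give each private copy a defined start value
    call fill_ones(a)
    if (any(a /= 1.)) nbad = nbad + 1
!$OMP END PARALLEL

    if (nbad == 0) then
        write(*, *) "all private copies were set to 1.0"
    else
        write(*, *) "threads with wrong values:", nbad
        stop 1
    end if
end program check_workshare
```

Swapping the two directive pairs in fill_ones for the combined PARALLEL WORKSHARE form gives a direct comparison between the two variants on a given compiler.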

I thought the bug might be due to the nested parallelism, but by default nested parallelism is not enabled, and enabling it makes no difference with the original test case (that uses the combined directive).
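
As a rough sketch of what enabling nested parallelism looks like, assuming an OpenMP 3.0 era runtime: omp_set_nested and omp_get_nested are the omp_lib routines for this, and setting the OMP_NESTED environment variable to true is the equivalent external switch. The program name nested_demo is invented for the illustration.

```fortran
! Sketch only: query and enable nested parallelism, then report the size
! of the inner team seen by each thread.
program nested_demo
    use omp_lib
    implicit none

    write(*, *) "nested enabled at startup: ", omp_get_nested()
    call omp_set_nested(.true.)

!$OMP PARALLEL num_threads(2)
!$OMP PARALLEL num_threads(2)
!$OMP CRITICAL
    ! With nesting disabled the inner team has one thread; with nesting
    ! enabled the runtime may give each outer thread a team of two.
    write(*, *) "nesting level:", omp_get_level(), &
                " inner team size:", omp_get_num_threads()
!$OMP END CRITICAL
!$OMP END PARALLEL
!$OMP END PARALLEL
end program nested_demo
```

The number of lines printed depends on whether the runtime actually grants the nested teams, so no particular output is claimed here.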

Further, if I remove the nesting, and just put the WORKSHARE directive in the dynamic OpenMP extent, the bug is also present:

$ cat foo-unnested.f90
subroutine test_nested_para(a)
    use omp_lib
    implicit none
    real :: a(2,2)
!$OMP WORKSHARE
    a = 1.
!$OMP END WORKSHARE
end subroutine test_nested_para

program main
    use omp_lib
    implicit none
    real :: a(2,2)
    integer :: TID
!$OMP parallel num_threads(2) private(a)
    call test_nested_para(a)
!$OMP critical
    write(*, *) "Tid:", omp_get_thread_num()
    write(*, 100) a
100 format (2f8.3)
!$OMP end critical
!$OMP end parallel
    stop
end
$ ifort -V
Intel Fortran Intel 64 Compiler for applications running on Intel 64, Version Mainline Beta Build20090414

$ ifort -openmp foo-unnested.f90 && ./a.out
Tid: 0
1.000 1.000
1.000 1.000
Tid: 1
0.000 0.000
0.000 0.000
$

Normally I would ask you to file a bug report using Intel Premier, but I'll do that for this case.

Thanks for reporting this defect!

Patrick Kennedy
Intel Compiler Lab
TimP
Honored Contributor III
As WORKSHARE is not fully implemented in current compilers, there seems to be limited benefit in trying to fix this. Another thread mentioned that implementing WORKSHARE as more than a SINGLE region is under consideration for a future compiler.
In case it's of any interest, the beginning of support for WORKSHARE as more than a SINGLE region went into gfortran 4.5 yesterday, so the situation with released gfortran is similar (no useful support for WORKSHARE).
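
To illustrate what "implemented as a SINGLE region" means in practice, here is a minimal sketch that writes the SINGLE construct out by hand. It assumes a compiler that treats WORKSHARE this way and is not a statement about any particular compiler's internals; the program name and the counter nexec are invented for the illustration.

```fortran
! Sketch only: a SINGLE region standing in for a WORKSHARE region.
! One thread of the team executes the whole array assignment; the
! others wait at the implicit barrier at END SINGLE.
program single_like_workshare
    implicit none
    real :: a(4)
    integer :: nexec         ! how many threads ran the assignment

    a = 0.
    nexec = 0
!$OMP PARALLEL num_threads(2) shared(a, nexec)
!$OMP SINGLE
    a = 1.                   ! not divided among the threads
    nexec = nexec + 1        ! safe: only one thread is in this block
!$OMP END SINGLE
!$OMP END PARALLEL

    write(*, *) "threads that executed the assignment:", nexec
    write(*, *) a
end program single_like_workshare
```

The shared array ends up fully assigned and nexec is 1, so the result is correct but nothing is gained from the extra threads, which is the sense in which such support is "not useful".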