Hi,
I tried using the reduction clause in a parallel region of an OpenMP-enabled Fortran code.
I get the following error even though variable_x is globally defined and is shared at the beginning of the parallel region.
fortcom: Error: xxx.f, line 52: Variables that appear on the FIRSTPRIVATE, LASTPRIVATE, and REDUCTION clauses on a work-sharing directive must have shared scope in the enclosing region [variable_x].
Has anyone dealt with such problems in ifort 10?
Is it a compiler bug?
Thanks,
Amit
Quoting - amit
Hi,
I tried using the reduction clause in a parallel region of an OpenMP-enabled Fortran code.
I get the following error even though variable_x is globally defined and is shared at the beginning of the parallel region.
fortcom: Error: xxx.f, line 52: Variables that appear on the FIRSTPRIVATE, LASTPRIVATE, and REDUCTION clauses on a work-sharing directive must have shared scope in the enclosing region [variable_x].
Has anyone dealt with such problems in ifort 10?
Is it a compiler bug?
Thanks,
Amit
Amit,
Can you supply a code snippet, reduced to where/how variable_x is declared plus all the !$OMP statements (include the loop control statement on the parallel DO)? We won't need the computational part of your loop.
Jim Dempsey
Quoting - jimdempseyatthecove
Amit,
Can you supply a code snippet, reduced to where/how variable_x is declared plus all the !$OMP statements (include the loop control statement on the parallel DO)? We won't need the computational part of your loop.
Jim Dempsey
Jim,
Here is a dummy code. I call this dummy subroutine frequently from the main code. The variable_x is summ_temp. The variable summ_temp is declared in a module. This is the parallel-region implementation of the code. The code is written in fixed-form f77 and I use Fortran compiler 10.0.
The compiler flags are -c -r8 -openmp -O3.
module PARALLEL_REGION
real summ_temp
end module PARALLEL_REGION
program main_prog
USE PARALLEL_REGION
c$omp parallel private(m,i,j,k) shared(summ_temp)
call sum(v,total)
c$omp end parallel
end program main_prog
subroutine sum(phi,summ)
USE PARALLEL_REGION
real,dimension(:,:,:,:),intent(in)::phi
real summ
integer m
c$omp critical
summ_temp = 0
c$omp end critical
c$omp do private(m,i,j,k) reduction(+:summ_temp)
do m = 1, m_blk(id)
do k = k_b(m), k_e(m)
do j = j_b(m), j_e(m)
do i = i_b(m), i_e(m)
summ_temp = summ_temp + phi(i,j,k,m)
enddo
enddo
enddo
enddo
c$omp enddo
end subroutine sum
Thanks,
A~
Quoting - amit
Quoting - jimdempseyatthecove
Jim,
Here is a dummy code. I call this dummy subroutine frequently from the main code. The variable_x is summ_temp. The variable summ_temp is declared in a module. This is the parallel-region implementation of the code. The code is written in fixed-form f77 and I use Fortran compiler 10.0.
The compiler flags are -c -r8 -openmp -O3.
module PARALLEL_REGION
real summ_temp
end module PARALLEL_REGION
program main_prog
USE PARALLEL_REGION
c$omp parallel private(m,i,j,k) shared(summ_temp)
call sum(v,total)
c$omp end parallel
end program main_prog
subroutine sum(phi,summ)
USE PARALLEL_REGION
real,dimension(:,:,:,:),intent(in)::phi
real summ
integer m
c$omp critical
summ_temp = 0
c$omp end critical
c$omp do private(m,i,j,k) reduction(+:summ_temp)
do m = 1, m_blk(id)
do k = k_b(m), k_e(m)
do j = j_b(m), j_e(m)
do i = i_b(m), i_e(m)
summ_temp = summ_temp + phi(i,j,k,m)
enddo
enddo
enddo
enddo
c$omp enddo
end subroutine sum
Thanks,
A~
I question your use of the critical region which seems to duplicate functionality of sum reduction.
Did you mean to use nested parallel regions? It looks like you haven't carried the idea through, at least I'm not certain I see how you intended it to work.
Also, I suppose you could use a private sum variable for the inner 3 loops and add it into the sum reduction in the outer parallel loop.
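For what it's worth, a rough sketch of that private-partial-sum idea against the posted subroutine might look like this (untested, and it still relies on the reduction on the module variable summ_temp that ifort 10 is complaining about; the modules are assumed to supply m_blk, id and the loop bounds as in the original code):
[cpp]
c Sketch only: private partial sum for the inner three loops,
c added into the reduction variable once per outer iteration.
      subroutine sum(phi,summ)
      USE PARALLEL_REGION
      real,dimension(:,:,:,:),intent(in)::phi
      real summ
      real summ_local
      integer i,j,k,m
c$omp do private(m,i,j,k,summ_local) reduction(+:summ_temp)
      do m = 1, m_blk(id)
        summ_local = 0
        do k = k_b(m), k_e(m)
          do j = j_b(m), j_e(m)
            do i = i_b(m), i_e(m)
              summ_local = summ_local + phi(i,j,k,m)
            enddo
          enddo
        enddo
        summ_temp = summ_temp + summ_local
      enddo
c$omp enddo
      end subroutine sum
[/cpp]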
Quoting - tim18
I question your use of the critical region which seems to duplicate functionality of sum reduction.
Did you mean to use nested parallel regions? It looks like you haven't carried the idea through, at least I'm not certain I see how you intended it to work.
Also, I suppose you could use a private sum variable for the inner 3 loops and add it into the sum reduction in the outer parallel loop.
You are right. In this example case I don't need to use the critical region.
I do not intend to use nested parallel regions.
The parallel directive is used in main_prog, which puts the subroutine call, and hence the do loop, inside a parallel region.
I just intend to calculate the sum of all the values of the 4-D array 'phi' across all the threads, using the reduction clause on an orphaned work-sharing construct inside a parallel region, and ifort won't allow me to do that.
I did try a workaround of updating a variable locally on each thread and then updating a global variable inside a critical section, thus eliminating the reduction clause; this seems to work, but slowly, as it involves serial execution.
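For reference, that workaround looks roughly like this (a sketch only, not my exact code; it assumes the same modules as before and that summ_temp has been zeroed beforehand; summ_local is a thread-private local of the routine called inside the parallel region, and only the final per-thread update is serialized):
[cpp]
c Sketch of the critical-section workaround described above
c (illustrative only; no reduction clause is used).
      subroutine sum(phi,summ)
      USE PARALLEL_REGION
      real,dimension(:,:,:,:),intent(in)::phi
      real summ
      real summ_local
      integer i,j,k,m
      summ_local = 0
c$omp do private(m,i,j,k)
      do m = 1, m_blk(id)
        do k = k_b(m), k_e(m)
          do j = j_b(m), j_e(m)
            do i = i_b(m), i_e(m)
              summ_local = summ_local + phi(i,j,k,m)
            enddo
          enddo
        enddo
      enddo
c$omp enddo
c$omp critical
      summ_temp = summ_temp + summ_local
c$omp end critical
      end subroutine sum
[/cpp]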
A~
Try
[cpp]
module PARALLEL_REGION
real summ_temp
end module PARALLEL_REGION
program main_prog
USE PARALLEL_REGION
C *** drop shared as it is meaningless (variable in module)
C *** drop private as m,i,j,k not in scope
C *** if total in module no change, else add shared(total)
C *** if v in module no change, else add shared(v)
c$omp parallel
call sum(v,total)
c$omp end parallel
end program main_prog
subroutine sum(phi,summ)
USE PARALLEL_REGION
real,dimension(:,:,:,:),intent(in)::phi
real summ
C *** add other integers
integer i,j,k,m
C *** done as initializer of reduction $omp critical
C *** done as initializer of reduction summ_temp = 0
C *** done as initializer of reduction c$omp end critical
c$omp do private(m,i,j,k) reduction(+:summ_temp)
do m = 1, m_blk(id)
do k = k_b(m), k_e(m)
do j = j_b(m), j_e(m)
do i = i_b(m), i_e(m)
summ_temp = summ_temp + phi(i,j,k,m)
enddo
enddo
enddo
enddo
c$omp enddo
end subroutine sum
[/cpp]
Jim
Quoting - jimdempseyatthecove
Try
[cpp]module PARALLEL_REGION
real summ_temp
end module PARALLEL_REGION
program main_prog
USE PARALLEL_REGION
C *** drop shared as it is meaningless (variable in module)
C *** drop private as m,i,j,k not in scope
C *** if total in module no change, else add shared(total)
C *** if v in module no change, else add shared(v)
c$omp parallel
call sum(v,total)
c$omp end parallel
end program main_prog
subroutine sum(phi,summ)
USE PARALLEL_REGION
real,dimension(:,:,:,:),intent(in)::phi
real summ
C *** add other integers
integer i,j,k,m
C *** done as initializer of reduction $omp critical
C *** done as initializer of reduction summ_temp = 0
C *** done as initializer of reduction c$omp end critical
c$omp do private(m,i,j,k) reduction(+:summ_temp)
do m = 1, m_blk(id)
do k = k_b(m), k_e(m)
do j = j_b(m), j_e(m)
do i = i_b(m), i_e(m)
summ_temp = summ_temp + phi(i,j,k,m)
enddo
enddo
enddo
enddo
c$omp enddo
end subroutine sum
[/cpp]
Jim
Jim,
I have tried the modifications that you suggested, but I still can't get the code to compile and I get the same error...
I am not sure if I understand the OpenMP implementation correctly or if there is something wrong with the implementation itself in ifort.
Amit
Quoting - amit
I can't get the code to compile and get the same error...
Is this fragment the entire code? OpenMP can't compensate for undefined variables.
[cpp]
program main_prog
c$omp parallel
call sum(v,total)
c$omp end parallel
end program main_prog
subroutine sum(phi,summ)
real,dimension(:,:,:,:),intent(in)::phi
real summ
C *** add other integers
integer i,j,k,m
c$omp do private(m,i,j,k) reduction(+:summ)
do m = 1, m_blk(id)
do k = k_b(m), k_e(m)
do j = j_b(m), j_e(m)
do i = i_b(m), i_e(m)
summ = summ + phi(i,j,k,m)
enddo
enddo
enddo
enddo
c$omp enddo
end subroutine sum
[/cpp]
Jim
Quoting - tim18
Is this fragment the entire code? OpenMP can't compensate for undefined variables.
Tim,
No, this dummy fragment is just a very small part of the code, and I don't have any undefined variables. For that matter, the code runs fine when the reduction clause is eliminated and replaced by serial execution, but that slows the execution considerably.
Amit
Quoting - jimdempseyatthecove
[cpp]
program main_prog
c$omp parallel
call sum(v,total)
c$omp end parallel
end program main_prog
subroutine sum(phi,summ)
real,dimension(:,:,:,:),intent(in)::phi
real summ
C *** add other integers
integer i,j,k,m
c$omp do private(m,i,j,k) reduction(+:summ)
do m = 1, m_blk(id)
do k = k_b(m), k_e(m)
do j = j_b(m), j_e(m)
do i = i_b(m), i_e(m)
summ = summ + phi(i,j,k,m)
enddo
enddo
enddo
enddo
c$omp enddo
end subroutine sum
[/cpp]
Jim
Jim,
Unfortunately, ifort does allow such compilation when the variable in the reduction clause 'summ' is locally defined, and that seems like an incorrect implementation of the OpenMP clause, since in the parallel region the variable summ is local to each thread and there is no global memory location into which it can be accumulated across all the threads when the reduction clause is used.
Amit
Quoting - amit
Jim,
Unfortunately, ifort does allow such compilation when the variable in the reduction clause 'summ' is locally defined, and that seems like an incorrect implementation of the OpenMP clause, since in the parallel region the variable summ is local to each thread and there is no global memory location into which it can be accumulated across all the threads when the reduction clause is used.
Amit
summ is (was) the dummy argument in subroutine sum, which references total in the caller. total in the caller is either a local variable, outside the scope of c$omp parallel in the program, or in a module or common, also outside the scope of c$omp parallel.
Inside subroutine sum, the c$omp do should permit a reduction clause on summ, making a thread copy of summ on the stack inside the loop, then reducing into summ (total) on exit of the loop.
I will test compile here in a few minutes.
Jim
Quoting - jimdempseyatthecove
summ is (was) the dummy argument in subroutine sum, which references total in the caller. total in the caller is either a local variable, outside the scope of c$omp parallel in the program, or in a module or common, also outside the scope of c$omp parallel.
Inside subroutine sum, the c$omp do should permit a reduction clause on summ, making a thread copy of summ on the stack inside the loop, then reducing into summ (total) on exit of the loop.
I will test compile here in a few minutes.
Jim
This compiles on my system
[cpp]
module foo
integer :: m_blk(100), k_b(100), j_b(100), i_b(100)
integer :: k_e(100), j_e(100), i_e(100)
end module foo
program main_prog
interface
subroutine dosum(phi,summ)
real,dimension(:,:,:,:),intent(in)::phi
real summ
end subroutine
end interface
real, dimension(10,10,10,10) :: v
real :: total
v = 123.456
!$omp parallel
call dosum(v,total)
!$omp end parallel
end program main_prog
subroutine dosum(phi,summ)
use foo
real,dimension(:,:,:,:),intent(in)::phi
real summ
! *** add other integers
integer i,j,k,m
!$omp do private(m,i,j,k) reduction(+:summ)
do m = 1, m_blk(id)
do k = k_b(m), k_e(m)
do j = j_b(m), j_e(m)
do i = i_b(m), i_e(m)
summ = summ + phi(i,j,k,m)
enddo
enddo
enddo
enddo
!$omp enddo
end subroutine dosum
[/cpp]
Note, I had to rename the subroutine, since SUM is an intrinsic function.
Jim
Quoting - jimdempseyatthecove
summ is (was) the dummy argument in subroutine sum, which references total in the caller. total in the caller is either a local variable, outside the scope of c$omp parallel in the program, or in a module or common, also outside the scope of c$omp parallel.
Inside subroutine sum, the c$omp do should permit a reduction clause on summ, making a thread copy of summ on the stack inside the loop, then reducing into summ (total) on exit of the loop.
I will test compile here in a few minutes.
Jim
Sorry, I didn't mention this before as I just realized it: the problem with the implementation you showed is that, in the actual code, different variables are passed to the subroutine, so the scope of the variables passed to the subroutine inside the parallel region varies between shared and private and is not fixed.
This is why in my code I am trying to introduce a different global variable.
Amit
Quoting - amit
Sorry, I didn't mention this before as I just realized it: the problem with the implementation you showed is that, in the actual code, different variables are passed to the subroutine, so the scope of the variables passed to the subroutine inside the parallel region varies between shared and private and is not fixed.
This is why in my code I am trying to introduce a different global variable.
Amit
I am not quite sure I understand.
A variation on the prior post:
[cpp]
module foo
type phiType
real, dimension(10,10,10,10) :: v
integer :: m_blk(100), k_b(100), j_b(100), i_b(100)
integer :: k_e(100), j_e(100), i_e(100)
end type phiType
type(phiType) :: mod_phi
end module foo
program main_prog
use foo
interface
subroutine dosum(phi, summ)
use foo
type(phiType) :: phi
real summ
end subroutine
end interface
type(phiType) :: local_phi
real :: total
mod_phi%v = 123.456
local_phi%v = 987.654
!$omp parallel
call dosum(mod_phi,total)
!$omp end parallel
!$omp parallel
call dosum(local_phi,total)
!$omp end parallel
end program main_prog
subroutine dosum(phi,summ)
use foo
type(phiType) :: phi
real summ
! *** add other integers
integer i,j,k,m
!$omp do private(m,i,j,k) reduction(+:summ)
do m = 1, phi%m_blk(id)
do k = phi%k_b(m), phi%k_e(m)
do j = phi%j_b(m), phi%j_e(m)
do i = phi%i_b(m), phi%i_e(m)
summ = summ + phi%v(i,j,k,m)
enddo
enddo
enddo
enddo
!$omp enddo
end subroutine dosum
[/cpp]
Note: the dimensions in the sample code above are not workable; I will let you fix that.
Jim Dempsey
Also, you can separate the values array v from the bounds:
subroutine dosum(values, bounds, result)
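For example, that split interface could be sketched like this (untested; boundsType, bounds_mod and nblk are hypothetical names, not from the original code, and the caller still needs an explicit interface for the assumed-shape array, as in the earlier example):
[cpp]
! Sketch only: pass the values array and the loop bounds separately.
module bounds_mod
  type boundsType
    integer :: nblk
    integer :: k_b(100), k_e(100)
    integer :: j_b(100), j_e(100)
    integer :: i_b(100), i_e(100)
  end type boundsType
end module bounds_mod
subroutine dosum(values, bounds, result)
  use bounds_mod
  real, dimension(:,:,:,:), intent(in) :: values
  type(boundsType), intent(in) :: bounds
  real :: result
  integer :: i, j, k, m
!$omp do private(m,i,j,k) reduction(+:result)
  do m = 1, bounds%nblk
    do k = bounds%k_b(m), bounds%k_e(m)
      do j = bounds%j_b(m), bounds%j_e(m)
        do i = bounds%i_b(m), bounds%i_e(m)
          result = result + values(i,j,k,m)
        enddo
      enddo
    enddo
  enddo
!$omp enddo
end subroutine dosum
[/cpp]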
Jim Dempsey
Quoting - jimdempseyatthecove
summ is (was) the dummy argument in subroutine sum, which references total in the caller. total in the caller is either a local variable, outside the scope of c$omp parallel in the program, or in a module or common, also outside the scope of c$omp parallel.
Inside subroutine sum, the c$omp do should permit a reduction clause on summ, making a thread copy of summ on the stack inside the loop, then reducing into summ (total) on exit of the loop.
I will test compile here in a few minutes.
Jim
This example compiles for me too.
When I try to do the same in my code, the code compiles, but when I run it I get inconsistent results.
That is, for every repeated run with all the parameters remaining the same (nothing at all changes), the reduction clause produces different results.
I am not sure why this is happening. Is it because I have ifort 10.0?
Quoting - amit
Quoting - jimdempseyatthecove
summ is (was) the dummy argument in subroutine sum, which references total in the caller. total in the caller is either a local variable, outside the scope of c$omp parallel in the program, or in a module or common, also outside the scope of c$omp parallel.
Inside subroutine sum, the c$omp do should permit a reduction clause on summ, making a thread copy of summ on the stack inside the loop, then reducing into summ (total) on exit of the loop.
I will test compile here in a few minutes.
Jim
This example compiles for me too.
When I try to do the same in my code, the code compiles, but when I run it I get inconsistent results.
That is, for every repeated run with all the parameters remaining the same (nothing at all changes), the reduction clause produces different results.
I am not sure why this is happening. Is it because I have ifort 10.0?
Very possible that 10.0 is causing the non-reproducibility. There was a change in the ifort 11.0 compiler that fixes the global stack address, which affects alignment of data. Linux allows the global stack starting address to vary between processes. There are two possible fixes: rebuilding the Linux kernel after tweaking a kernel parameter, OR having the compiler fix the global stack at a fixed address. Ifort 11.0 chose the second option.
Thus I highly encourage moving to ifort 11.0. Keep in mind, THIS MAY NOT FIX WHAT YOU ARE SEEING. There may be something else going on. But by moving to 11.0, you remove one free variable.
ron
I think that Intel compiler 11.0.084 has a problem with regard to the reduction clause in an OpenMP parallel region.
Consider the sample code below. Logically this should not compile, but with the Intel compiler it does. This seems like a bug to me. (I do set OMP_NUM_THREADS greater than 1.)
[cpp]
module data
real,dimension(:,:,:,:),allocatable :: phi
end module data
program test
USE data
real counter
integer i,j,k,m
allocate(phi(10,10,10,10))
counter = 1.0
c$omp parallel
c$omp do private(i,j,k,m)
do m=1,10
do k=1,10
do j=1,10
do i=1,10
phi(i,j,k,m)=counter
enddo
enddo
enddo
enddo
c$omp enddo
call tester
c$omp end parallel
end program test
subroutine tester
real summ_temp
interface
subroutine summation(summ)
real summ
end subroutine summation
end interface
summ_temp = 1.0
call summation(summ_temp)
write(*,*)'sum is:',summ_temp
end subroutine tester
subroutine summation(summ)
USE data
real summ
c$omp do private(i,j,k,m)
c$omp+ reduction(+:summ)
do m=1,10
do k=1,10
do j=1,10
do i=1,10
summ = summ + phi(i,j,k,m)
enddo
enddo
enddo
enddo
c$omp enddo
end subroutine summation
[/cpp]
