Intel® Fortran Compiler

common block problem in openmp Fortran

s_f_
My code is:

 program
 ...
 ! Loop which I want to parallelize
 !$OMP parallel DO
 DO I = 1, N
 ...
 call FORD(i,j)
  ...
 END DO
 !$OMP END parallel DO
 end program

  subroutine FORD(i,j)
  logical servo,genflg,lapflg
  dimension c(nvarc)
  dimension zl(3),zg(3),v1(3,3),v2(3,3),rn(3),
 .          rcg1(3),rcg2(3),ru1(3),ru2(3),
 .          rott1(3),rott2(3),velr(3),dt(3),
 .          dfs(3),ftu(3),fnr(3),amomet(3)
  common /contact/ iab11,iab22,xx2,yy2,zz2,
 .                 ra1,rb1,rc1,ra2,rb2,rc2,
 .                 v1,v2,
 .                 xg1,yg1,zg1,xg2,yg2,zg2
  common /ellip/ b1,c1,f1,g1,h1,d1,
 .               b2,c2,f2,g2,h2,p2,q2,r2,d2
  common /root/ root1,root2
  common /tab1/
 .       itype(ndim1),nconti(5),nvarc,
 .       nconta,nconta1
  common /bal1/
 .       ra(5),rb(5),rc(5),
 .       amomen(ndim),fwall(6),press(3),wmomet(6,2),
 .       rot(ndim),ttheta(ndim*3),rstp(ndim*3),forces(ndim),
 .       ssampl(3,3),edserv(3,3),tdisp(ndim),adisp(ndim),vel(ndim),
 .       del(3),xmax(3)

CALL CONDACT(genflg,lapflg)
return
end subroutine

SUBROUTINE CONDACT(genflg,lapflg)
implicit double precision (a-h,o-z)
logical rflag,dflag,error,gmvflg,grvflg,ctrlflg,depflg
  parameter (ndim1 = 20002)
  parameter (ndim = 3*ndim1)
  parameter (nkmm = 9000000)
  parameter (nkwall = 50000)
  character*4 hed
  logical genflg,lapflg,fast
  dimension v1(3,3),v2(3,3)
  common /contact/ iab11,iab22,xx2,yy2,zz2,
 .                 ra1,rb1,rc1,ra2,rb2,rc2,
 .                 v1,v2,
 .                 xg1,yg1,zg1,xg2,yg2,zg2
  common /ellip/ b1,c1,f1,g1,h1,d1,b2,c2,f2,g2,h2,p2,q2,r2,d2
  common /switch/ nk
  common /root/ root1,root2
  common /nroot/ rt(5),nrt
  common /bal2/xmax(3)

 value = f(x)
 C
 C...... 
 C

 RETURN
 END

  function f(x)
  implicit double precision (a-h,o-z)
  common /contact/ iab11,iab22,xx2,yy2,zz2,
 .                 ra1,rb1,rc1,ra2,rb2,rc2,
 .                 v1,v2,
 .                 xg1,yg1,zg1,xg2,yg2,zg2
  common /ellip/ b1,c1,f1,g1,h1,d1,
 .               b2,c2,f2,g2,h2,p2,q2,r2,d2
  common /switch/ nk
  common /nroot/ rt(5),nrt
  dimension a(3,3),b(3),v1(3,3),v2(3,3)
  ..
  ..
  ..
  ..
  end

 

My question: inside the parallel loop, are all variables in each subroutine (whether inside or outside a common block) private?
1. If not, should I declare the common blocks threadprivate and mark the remaining variables in each subroutine private after their declarations?
2. Each thread passes through two subroutines and one function, and these share some of the same common blocks and variables. If I declare the common blocks threadprivate in each subroutine, how are the variable values passed through the whole call chain for a single thread?
Any help will be appreciated. Thanks.

3 Replies
TimP
As written, the common blocks will be shared among all threads. Declaring them threadprivate is a possibility, but simply turning them into local arrays and making the called procedures internal (with CONTAINS) looks more appealing. Then either declaring the procedures RECURSIVE or compiling with the -qopenmp option will make those local arrays private to each call.
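A minimal sketch of the local-array approach, under stated assumptions: `ford_local` and `condact_local` are hypothetical stand-ins for FORD and CONDACT, and only `v1` and `root1` are carried over from the original common blocks. The data live in the enclosing procedure, and the internal procedure reaches them through host association, so each call works on its own copy.

```fortran
module contact_demo
  implicit none
contains

  ! Hypothetical stand-in for FORD. The former common-block data are
  ! local variables here, so every call (and hence every thread) has
  ! its own copy on its own stack frame.
  recursive function ford_local(i) result(s)
    integer, intent(in) :: i
    double precision :: s
    double precision :: v1(3,3), root1   ! formerly in /contact/ and /root/

    v1 = dble(i)
    call condact_local()
    s = root1
  contains
    ! Internal procedure: sees v1 and root1 of the enclosing call
    ! through host association, so no common block is needed.
    subroutine condact_local()
      root1 = sum(v1)
    end subroutine condact_local
  end function ford_local

end module contact_demo

program demo_internal
  use contact_demo
  implicit none
  integer, parameter :: n = 100
  integer :: i
  double precision :: res(n)

  !$omp parallel do
  do i = 1, n
     res(i) = ford_local(i)      ! each iteration writes its own element
  end do
  !$omp end parallel do

  ! sum of nine copies of i is 9*i, exact in double precision
  if (any(res /= [(9.0d0*i, i = 1, n)])) error stop 'unexpected result'
  print *, 'ok'
end program demo_internal
```

Compile with OpenMP enabled (for example -qopenmp with the Intel compiler, or -fopenmp with gfortran). Large local arrays increase the per-thread stack requirement, so this layout suits small work arrays best.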
jimdempseyatthecove

>> if I use threadprivate common blocks for each subroutine, how do the variable values pass through the entire program for a single thread

Threadprivate variables are instantiated in a per-thread context block of memory. How this is done is left to the implementation. From the programmer's viewpoint, threadprivate variables have an automatic (hidden) base/structure pointer that is used when dereferencing the address of the variable. The main thread has a pointer to the area constructed by the linker, whereas the other threads have pointers elsewhere. Think of this as a system-wide "this" pointer. Some implementations on IA processors use the FS or GS segment/selector register (faster access), while others use a more complex means (you can see which by looking at the Disassembly window while debugging).

Threadprivate can be a good migration path, but care needs to be taken over what must be shared and what must be private, as well as over how you do reductions (e.g. summation, min, max, found, etc.).
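As a rough illustration of the points above, here is a small sketch with a threadprivate common block and a reduction. The routines `set_state` and `use_state` are hypothetical; only the block name `/root/` and its members come from the original post. A value written by one routine is visible to the next routine called on the same thread, because both resolve `/root/` to that thread's copy, while the shared total is combined through a `reduction` clause.

```fortran
module root_demo
  implicit none
contains

  ! Writes into this thread's copy of /root/.
  subroutine set_state(i)
    integer, intent(in) :: i
    double precision :: root1, root2
    common /root/ root1, root2
    !$omp threadprivate(/root/)
    root1 = dble(i)
    root2 = 2.0d0*dble(i)
  end subroutine set_state

  ! Reads the same thread's copy of /root/ that set_state just wrote.
  function use_state() result(s)
    double precision :: s
    double precision :: root1, root2
    common /root/ root1, root2
    !$omp threadprivate(/root/)
    s = root1 + root2
  end function use_state

end module root_demo

program demo_threadprivate
  use root_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: i
  double precision :: total

  total = 0.0d0
  !$omp parallel do reduction(+:total)
  do i = 1, n
     call set_state(i)
     total = total + use_state()   ! per-thread partial sums, combined at the end
  end do
  !$omp end parallel do

  ! sum of 3*i for i = 1..n is 1.5*n*(n+1); all terms are exact integers
  if (total /= 1.5d0*n*(n+1)) error stop 'unexpected total'
  print *, 'ok'
end program demo_threadprivate
```

Note that the `threadprivate` directive has to be repeated in every program unit that declares the common block, after the COMMON statement.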

An alternative is to create a user defined type containing the context that you wish to remain private, and then adding this to the subroutine and function calls (after initialization at thread start).
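One way this could look, as a sketch only: the type name `contact_t` and the routines `ford_ctx`, `condact_ctx` and `f_ctx` are hypothetical, and only a few members of the original common blocks are shown. Each thread owns one context variable, initializes it once at thread start, and hands it down the call chain as an argument.

```fortran
module contact_ctx
  implicit none

  ! Hypothetical context type holding what used to live in common blocks.
  type :: contact_t
     double precision :: v1(3,3), v2(3,3)
     double precision :: root1, root2
  end type contact_t

contains

  subroutine ford_ctx(ctx, i)
    type(contact_t), intent(inout) :: ctx
    integer, intent(in) :: i
    ctx%v1 = dble(i)
    call condact_ctx(ctx)
  end subroutine ford_ctx

  subroutine condact_ctx(ctx)
    type(contact_t), intent(inout) :: ctx
    ctx%root1 = f_ctx(ctx)
  end subroutine condact_ctx

  function f_ctx(ctx) result(s)
    type(contact_t), intent(in) :: ctx
    double precision :: s
    s = sum(ctx%v1)
  end function f_ctx

end module contact_ctx

program demo_ctx
  use contact_ctx
  implicit none
  integer, parameter :: n = 100
  integer :: i
  double precision :: res(n)
  type(contact_t) :: ctx

  !$omp parallel private(ctx)
  ! Initialization at thread start: each thread sets up its own context.
  ctx%v1 = 0.0d0
  ctx%v2 = 0.0d0
  ctx%root1 = 0.0d0
  ctx%root2 = 0.0d0
  !$omp do
  do i = 1, n
     call ford_ctx(ctx, i)
     res(i) = ctx%root1
  end do
  !$omp end do
  !$omp end parallel

  if (any(res /= [(9.0d0*i, i = 1, n)])) error stop 'unexpected result'
  print *, 'ok'
end program demo_ctx
```

Compared with threadprivate, the data flow is visible in every argument list, at the cost of changing each procedure's interface.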

Jim Dempsey

s_f_

Thanks a lot, Tim Prince and Jim Dempsey. I have attached my program to this thread. I tried declaring as threadprivate the common blocks used in the subroutines called from the parallel loop, but that gave me a segmentation fault. (Do threadprivate common blocks use a large amount of memory or stack space?)

I am stuck at this point. I have no idea how to proceed. Any help will be appreciated. 
