problem in serial region of openmp problem

roddur · ‎02-16-2009

Following up on my last post: I have found the problematic area, and I hope you have not become irritated with me and this problem.
Here is the part of my code that I think is relevant:

!$omp parallel do default(shared) &
!$omp private(il,ienrp,istart,iend,iie,xxa,yya) &
!$omp private(xa,ya,e)
orbital: do il=1,lorbit-2,2
  xa=0.00;ya=0.00
  write(*,'(1x,"Starting orbital loop",1x,i2,$)') il
  ienrp=0
  e=emin-de
  istart=1
  iend=lorbit
! do iend=9,63,lorbit
111 continue
  e=e+de
  iie=0
  do ie=istart,iend
    iie=iie+1
    p2(iie)=ap2(ie)
    p3(iie)=ap3(ie)
    p4(iie)=ap4(ie)
  enddo
  ienrp=ienrp+1
  call hop(il,e,ienrp,map,srl,ap1, &
           ap6,ap7,ap8,ap9,ap10,ap11, &
           ap12,ap13,p2,p3,p4,xa,ya)
  istart=iend+1
  iend=iend+lorbit
  write(*,'(".",$)')
! end do
  if (iend<=npn)goto 111
  call fit(xa,ya,seed,xxa,yya,temp,il)
  call spectral(il,xxa,yya,temp,spec)
  write(*,'("done")')
end do orbital
!$omp end parallel do
call tdos(spec,ipq,nsp,dos, &
          sorb,porb,dorb,osd1,osd2)
!***********************************************!
! Writing DOS to the files !
!***********************************************!
! call ldos(dos_aa,dos_b,dos_adn,dos_bdn,temp,T_dos)
call fermi(temp,T_dos,ef)
call band(temp,T_dos,ef,e_band)
call pardos(temp,sorb,porb,dorb,ef,nit,e_band, &
            qtot_a,qtot_b,qtot_a1,qtot_b1)

It calls the subroutine pardos:

!*******************************************************!
! This program calculates the partial dos and
! MAGMOM and written by Kartickda.
! I will update it latter!
! No documentation meanwhile
!*******************************************************!
subroutine pardos(temp,sorb,porb,dorb,ef,nit,e_band, &
                  qtot_a,qtot_b,qtot_a1,qtot_b1)
  use parameters
  implicit double precision(a-h,o-z)
  dimension temp(nfn+1),sorb(2,nfn+1,2), &
            porb(2,nfn+1,2), &
            dorb(2,nfn+1,2),a(3)
  do nsp=1,2
    ia=-1
    ib=-1
    qtot_a=0.0d0
    qtot_b=0.0d0
    do ij=1,2
      if(ij.eq.1)then
        yx=x
        open(1,file='POTPAR_A',status='old')
        read(1,*)a(1)
        read(1,*)a(2)
        read(1,*)a(3)
        if(nsp.eq.2)then
          read(1,*)a(1)
          read(1,*)a(2)
          read(1,*)a(3)
        endif
        close(1)
      else
        yx=y
        open(1,file='POTPAR_B',status='old')
        read(1,*)a(1)
        read(1,*)a(2)
        read(1,*)a(3)
        if(nsp.eq.2)then
          read(1,*)a(1)
          read(1,*)a(2)
          read(1,*)a(3)
        endif
        close(1)
      endif
      in=1
      do ii=1,3
! s=dreal(ii)
        call mom(temp,sorb,a,ii,ef,ij,in,yx,ia,ib,qtot_a,qtot_b, &
                 qtot_a1,qtot_b1,e_band,nit,nsp)
      enddo
      in=in+1
      do ii=1,3
! s=dreal(ii)
        call mom(temp,porb,a,ii,ef,ij,in,yx,ia,ib,qtot_a,qtot_b, &
                 qtot_a1,qtot_b1,e_band,nit,nsp)
      enddo
      in=in+1
      do ii=1,3
!c s=dreal(ii)
        call mom(temp,dorb,a,ii,ef,ij,in,yx,ia,ib,qtot_a,qtot_b, &
                 qtot_a1,qtot_b1,e_band, nit,nsp)
      enddo
!c in=in+1
!c do ii=1,3
!c s=dreal(ii)
!c call mom(temp,forb,a,ii,ef,ij,in,yx,ia,ib,qtot_a,qtot_b,
!c . qtot_a1,qtot_b1,
!c . e_band, nit,nsp)
!c enddo
    enddo
  enddo
end subroutine pardos
! CONTAINS
subroutine mom(temp,dos1,a,ii,ef,ij,in,yx,ia,ib,qtot_a, &
               qtot_b,qtot_a1,qtot_b1,e_band,nit,nsp)
  use parameters
  implicit double precision(a-h,o-z)
  dimension temp(nfn+1),a(3),dos1(2,nfn+1,2),A_MOM(0:8,2), &
            B_MOM(0:8,2),ECG_A(3,2),ECG_B(3,2)
  de=temp(2)-temp(1)
  de=dabs(de)
  sum1=0.0d0
  sum11=0.0d0
!*****************Two calculate ECG*************************************
!***************** ECG=(band energy) /(charge in that band)*****************
  if(ii.eq.3)then
    do i=1,nfn+1
      sum11=sum11+yx*dos1(ij,i,nsp)*de*temp(i)
      if(temp(i).gt.ef) then
        y1=dos1(ij,i-1,nsp)
        y2=dos1(ij,i,nsp)
        x1=temp(i-1)
        x2=temp(i)
        xx=ef
        yb=y1+((xx-x1)*(y2-y1))/(x2-x1)
        de=dabs(ef-temp(i-1))
        sum11=sum00+yx*yb*de*xx ! here de must be different from previous
        goto 1000
      endif
      sum00=sum11
    enddo
  endif
1000 continue
!*********************************************************************
  do i=1,nfn+1
    sum1=sum1+yx*dos1(ij,i,nsp)*de*(temp(i)-a(in))**(ii-1)
    if(temp(i).gt.ef) then
      y1=dos1(ij,i-1,nsp)
      y2=dos1(ij,i,nsp)
      x1=temp(i-1)
      x2=temp(i)
      xx=ef
      yb=y1+((xx-x1)*(y2-y1))/(x2-x1)
      de=dabs(ef-temp(i-1))
      sum1=sum0+yx*yb*de*(xx-a(in))**(ii-1) ! here de must be different from previous
      goto 100
    endif
    sum0=sum1
  enddo
100 continue
  value=sum1
  if(ij.eq.1.and.ii.eq.3)ECG_A(in,nsp)=(1.0d0/x)*sum11
  if(ij.eq.2.and.ii.eq.3)ECG_B(in,nsp)=(1.0d0/y)*sum11
  write(*,'(4(f8.3),1x,i1,1x,i1)') ECG_A(in,nsp),x,sum11,in,nsp
  if(ij.eq.1)then
    ia=ia+1
    if(ia/3*3.eq.ia)qtot_a=qtot_a+value
    A_MOM(ia,nsp)=(1.0d0/x)*value
  endif
  if(ij.eq.2)then
    ib=ib+1
    B_MOM(ib,nsp)=(1.0d0/y)*value
    if(ib/3*3.eq.ib)qtot_b=qtot_b+value
  endif
!c if(ij.eq.2.and.ib.eq.11.and.nsp.eq.2)then
  if(ij.eq.2.and.ib.eq.8.and.nsp.eq.2)then
    open(2,file='/home/rudra/Recursion/LMTO_A/MAINA/EB_A',status='unknown')
    do ii=1,2
      do ikk=1,3
        write(2,*)ECG_A(ikk,ii)
      enddo
    enddo
    close(2)
    open(2,file='/home/rudra/Recursion/LMTO_B/MAINA/EB_B',status='unknown')
    do ii=1,2
      do ikk=1,3
        write(2,*)ECG_B(ikk,ii)
      enddo
    enddo
    close(2)
    ECG_A(1,1)=ECG_A(1,1)/A_MOM(0,1)
    ECG_A(2,1)=ECG_A(2,1)/A_MOM(3,1)
    ECG_A(3,1)=ECG_A(3,1)/A_MOM(6,1)
    ECG_B(1,1)=ECG_B(1,1)/B_MOM(0,1)
    ECG_B(2,1)=ECG_B(2,1)/B_MOM(3,1)
    ECG_B(3,1)=ECG_B(3,1)/B_MOM(6,1)
    ECG_A(1,2)=ECG_A(1,2)/A_MOM(0,2)
    ECG_A(2,2)=ECG_A(2,2)/A_MOM(3,2)
    ECG_A(3,2)=ECG_A(3,2)/A_MOM(6,2)
    ECG_B(1,2)=ECG_B(1,2)/B_MOM(0,2)
    ECG_B(2,2)=ECG_B(2,2)/B_MOM(3,2)
    ECG_B(3,2)=ECG_B(3,2)/B_MOM(6,2)
    qtot_a=A_MOM(0,1)+A_MOM(3,1)+A_MOM(6,1)
    qtot_a=qtot_a-A_MOM(0,2)-A_MOM(3,2)-A_MOM(6,2)
    qtot_b=B_MOM(0,1)+B_MOM(3,1)+B_MOM(6,1)
    qtot_b=qtot_b-B_MOM(0,2)-B_MOM(3,2)-B_MOM(6,2)
    dmag=x*qtot_a+y*qtot_b
    qtot_aa=A_MOM(0,1)+A_MOM(3,1)+A_MOM(6,1)
    qtot_aa=qtot_aa+A_MOM(0,2)+A_MOM(3,2)+A_MOM(6,2)
    qtot_bb=B_MOM(0,1)+B_MOM(3,1)+B_MOM(6,1)
    qtot_bb=qtot_bb+B_MOM(0,2)+B_MOM(3,2)+B_MOM(6,2)
    a_mag=qtot_a
    b_mag=qtot_b
    open(1,file='data_ef',status='unknown',access='append')
    write(1,79)nit,ef,x*qtot_a,y*qtot_b,dmag,x*qtot_aa,y*qtot_bb
    close(1)
79  format(i4,2x,6(f19.13,2x))
    qtot_a=A_MOM(0,1)+A_MOM(3,1)+A_MOM(6,1)
    qtot_a=qtot_a+A_MOM(0,2)+A_MOM(3,2)+A_MOM(6,2)
    qtot_b=B_MOM(0,1)+B_MOM(3,1)+B_MOM(6,1)
    qtot_b=qtot_b+B_MOM(0,2)+B_MOM(3,2)+B_MOM(6,2)
    qtot_a=x*(qtot_a+c_a-z_a)
    qtot_b=y*(qtot_b+c_b-z_b)
    w_a=2.5952955d0
    w_b=w_a
    e_mad_A=x*y*ruban*2.0d0*qtot_a*qtot_a/w_a
    e_mad_B=e_mad_A
    open(1,file='MAD',status='unknown',access='append')
    write(1,7979)nit,e_mad_A
    close(1)
7979 format(i4,2x,f19.13)
    vm_a=qtot_a*ruban/w_a
    vm_b=qtot_b*ruban/w_b
    if(nit.eq.0)then
      qtot_a1=0.0d0
      qtot_b1=0.0d0
    endif
    vm_a1=(amix*qtot_a1+(1.0d0-amix)*qtot_a)*ruban/w_a
    vm_b1=(amix*qtot_b1+(1.0d0-amix)*qtot_b)*ruban/w_b
!c vm_a1=amix*vm_a+(1.0d0-amix)*qtot_a1
!c vm_b1=amix*vm_b+(1.0d0-amix)*qtot_b1
    qtot_a1=qtot_a
    qtot_b1=qtot_b
!c qtot_a1=vm_a
!c qtot_b1=vm_b
    open(1,file='/home/rudra/Recursion/LMTO_A/ATOM/CH_A',status='unknown')
    open(2,file='/home/rudra/Recursion/LMTO_B/ATOM/CH_B',status='unknown')
    write(1,*)a_mag
    write(1,*)c_a
    write(1,*)vm_a
    write(1,*)vm_a1
    write(2,*)b_mag
    write(2,*)c_b
    write(2,*)vm_b
    write(2,*)vm_b1
    close(1)
    close(2)
    open(1,file='/home/rudra/Recursion/LMTO_A/BNDASA/ASM_A',status='unknown')
    do ii=1,2
      do ik=0,8
        write(1,*)A_MOM(ik,ii)
      enddo
    enddo
    close(1)
    open(1,file='C_A',status='unknown',access='append')
    write(1,790)nit,A_MOM(0,1),A_MOM(3,1),A_MOM(6,1), &
                A_MOM(0,2),A_MOM(3,2),A_MOM(6,2), &
                B_MOM(0,1),B_MOM(3,1),B_MOM(6,1), &
                B_MOM(0,2),B_MOM(3,2),B_MOM(6,2)
    close(1)
    open(1,file='/home/rudra/Recursion/LMTO_B/BNDASA/ASM_B',status='unknown')
    do ii=1,2
      do ik=0,8
        write(1,*)B_MOM(ik,ii)
      enddo
    enddo
    close(1)
!c open(1,file='C_B',status='unknown',access='append')
!c write(1,790)nit,(B_MOM(ik),ik=0,8)
!c close(1)
    open(1,file='/home/rudra/Recursion/LMTO_A/MAINA/ECG_A',status='unknown')
    do ii=1,2
      do ik=1,3
        write(1,*)ECG_A(ik,ii)
        write(*,*)ECG_A(ik,ii),'ECG_A'
      enddo
    enddo
    close(1)
    open(1,file='/home/rudra/Recursion/LMTO_B/MAINA/ECG_B',status='unknown')
    do ii=1,2
      do ik=1,3
        write(1,*)ECG_B(ik,ii)
        write(*,*)ECG_B(ik,ii),'ECG_B'
      enddo
    enddo
    close(1)
  endif
790 format(i5,2x,17(f19.13,2x))
  return
end subroutine mom

The confusion is with the line (in bold) "write(*,'(4(f8.3),1x,i1,1x,i1)') ECG_A(in,nsp),x,sum11,in,nsp". On the line before it, I calculate ECG_A as sum11/x. In the serial and parallel runs, x and sum11 are the same, but ECG_A comes out wrong in the parallel run. Since pardos is called outside the parallel region and ECG_A is *NOT* used in the parallel part, this confuses me. I am also giving the log file showing the values of ECG_A, x and sum11 as printed by that line.
Here is the output of the parallel run:

And that for the serial run is:

0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
-0.166231192935238 0.400000005960464 -6.649247816491034E-002
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
-0.129261884867856 0.400000005960464 -5.170475471760318E-002
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
-0.699020494933910 0.400000005960464 -0.279608202140051
-0.166231192935238 0.400000005960464 0.000000000000000E+000
-0.166231192935238 0.400000005960464 0.000000000000000E+000
-0.166231192935238 0.400000005960464 -7.268663541871814E-002
-0.129261884867856 0.400000005960464 0.000000000000000E+000
-0.129261884867856 0.400000005960464 0.000000000000000E+000
-0.129261884867856 0.400000005960464 -5.664037525321141E-002
-0.699020494933910 0.400000005960464 0.000000000000000E+000
-0.699020494933910 0.400000005960464 0.000000000000000E+000
-0.699020494933910 0.400000005960464 -0.290722991209875
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
-0.175131874417321 0.400000005960464 -7.005275081079569E-002
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
-0.135994161904373 0.400000005960464 -5.439766557233750E-002
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
0.000000000000000E+000 0.400000005960464 0.000000000000000E+000
-1.27376705297917 0.400000005960464 -0.509506828783912
-0.175131874417321 0.400000005960464 0.000000000000000E+000
-0.175131874417321 0.400000005960464 0.000000000000000E+000
-0.175131874417321 0.400000005960464 -7.264237921621851E-002
-0.135994161904373 0.400000005960464 0.000000000000000E+000
-0.135994161904373 0.400000005960464 0.000000000000000E+000
-0.135994161904373 0.400000005960464 -5.762860036718714E-002
-1.27376705297917 0.400000005960464 0.000000000000000E+000
-1.27376705297917 0.400000005960464 0.000000000000000E+000
-1.27376705297917 0.400000005960464 -0.290264316339248

(I can't manage the formatting, please forgive me!) The first column is ECG_A, which depends on the 2nd and 3rd columns, but it differs between the serial and parallel runs even though the 2nd and 3rd columns are identical.
Can you suggest something?

jimdempseyatthecove · ‎02-17-2009

At the start of the program, set

ECG_A = -1.0
ECG_B = -1.0

and see if you get something different.

The result 2.710823505304238E-312 is approximately equivalent to TINY() (strictly, it is a denormal, smaller than TINY() for double precision).

You may have an uninitialized variable issue.

Jim Dempsey
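
A minimal Fortran sketch of the sentinel idea above. It assumes the array lives inside a routine (as ECG_A does in the posted mom), so the sentinel is applied through an initialized local. The names `sentinel_demo`, `ecg_slot` and `sentinel` are hypothetical and not from the original code. Any entry that still reads back as the sentinel was never assigned.

```fortran
program sentinel_demo
  implicit none
  double precision, parameter :: sentinel = -1.0d0   ! hypothetical marker value
  double precision :: v
  integer :: in, nsp, nbad

  ! Assign only the nsp=1 entries, leaving nsp=2 untouched on purpose.
  do in = 1, 3
     v = 0.1d0 * in
     call ecg_slot(in, 1, v, .true.)
  end do

  ! Read every entry back and report the ones that were never assigned.
  nbad = 0
  do nsp = 1, 2
     do in = 1, 3
        call ecg_slot(in, nsp, v, .false.)
        if (v == sentinel) then
           nbad = nbad + 1
           print '(a,i0,a,i0,a)', 'ecg(', in, ',', nsp, ') was never assigned'
        end if
     end do
  end do
  print '(a,i0)', 'unassigned entries: ', nbad

contains

  ! Hypothetical stand-in for a routine that keeps results in a local array.
  subroutine ecg_slot(in, nsp, value, store)
    integer, intent(in) :: in, nsp
    double precision, intent(inout) :: value
    logical, intent(in) :: store
    ! Initialization in the declaration implies SAVE: the array starts as
    ! the sentinel everywhere and keeps its contents between calls.
    double precision, save :: ecg(3, 2) = sentinel

    if (store) then
       ecg(in, nsp) = value
    else
       value = ecg(in, nsp)
    end if
  end subroutine ecg_slot

end program sentinel_demo
```

Run as written, this reports the three nsp=2 entries as unassigned. Note also that in the posted mom, ECG_A, ECG_B, A_MOM and B_MOM are plain locals that are filled over several calls and read in a later one. The Fortran standard does not guarantee that unsaved locals keep their values between calls, and Intel's documentation for the OpenMP option notes that it makes local variables automatic, so a missing SAVE may be worth checking. This is only a guess from the posted fragment.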

rreis · ‎02-17-2009

In that case, wouldn't compiling with

-check uninit

catch the culprit?

TimP · ‎02-17-2009

Quoting - rreis

In that case, wouldn't compiling with

-check uninit

catch the culprit?

You and I wish. Have you been able to use -check uninit within a parallel region (without too many false positives)? Thread checker may catch some private uninitialized variables.
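
For readers unsure what an uninitialized private variable looks like, here is a small hedged sketch (the program and variable names are hypothetical). A `private` variable has an undefined value in each thread on entry to the parallel region, so it must be assigned before it is read; `firstprivate` copies in the value from before the region.

```fortran
program firstprivate_demo
  implicit none
  integer :: i
  double precision :: e0, e, acc(4)

  e0 = 2.0d0
  acc = 0.0d0

  ! e is private: each thread's copy is undefined on entry, so it is
  ! assigned inside the loop before use.
  ! e0 is firstprivate: each thread's copy starts with the value set above.
  !$omp parallel do default(none) shared(acc) private(e) firstprivate(e0)
  do i = 1, 4
     e = e0 + dble(i)
     acc(i) = e
  end do
  !$omp end parallel do

  print '(4f6.1)', acc
end program firstprivate_demo
```

The sketch gives the same result with or without OpenMP enabled, because every private value is written before it is read. Reading `e` before the assignment inside the loop would be the kind of bug discussed here.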

roddur · ‎02-17-2009

#FC=gfortran
FC=/opt/intel/Compiler/11.0/074/bin/intel64/ifort

ifeq ($(FC),gfortran)
CC=gcc
FFLAGS =-O3
FPAR=#-fopenmp
LFFLAG=-O3
CFLAG =-O3
else
ifeq ($(FC),/opt/intel/Compiler/11.0/074/bin/intel64/ifort)
CC=gcc
FFLAGS = -O3 -check uninit
LFFLAG= -O3 -static
FPAR=-openmp
CFLAG =#-O3

These are my compiler options. -check uninit is not giving any additional information.

jimdempseyatthecove · ‎02-18-2009

roddur,

I notice that in your parallel do you use ie as a loop control variable. ie has not been declared as private.
This may present a problem.

[cpp]iie=0
do ie=istart,iend ! *** ie is shared ***
  iie=iie+1
  p2(iie)=ap2(ie)
  p3(iie)=ap3(ie)
  p4(iie)=ap4(ie)
enddo
 
! ---- consider using instead ----

iie = iend-istart+1
p2(1:iie) = ap2(istart:iend)
p3(1:iie) = ap3(istart:iend)
p4(1:iie) = ap4(istart:iend)

[/cpp]

Jim
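
A self-contained sketch of the suggestion above, under stated assumptions: the array sizes, the `total` accumulator and the program name are hypothetical, and only `p2` is shown. It lists the inner loop index and the scratch array in the `private` clause and uses an array-section copy. In the posted loop, p2, p3 and p4 are not in the private list either; if they are per-iteration scratch arrays, every thread writes to the same shared copies, which may also matter.

```fortran
program private_scratch_demo
  implicit none
  integer, parameter :: lorbit = 4, npn = 16   ! hypothetical sizes
  double precision :: ap2(npn), p2(lorbit), total(lorbit)
  integer :: il, ie, istart, iend, iie

  do ie = 1, npn
     ap2(ie) = dble(ie)
  end do
  total = 0.0d0

  ! Every variable a thread writes to is either private or indexed by il.
  !$omp parallel do default(shared) private(ie, iie, istart, iend, p2)
  do il = 1, lorbit
     istart = 1
     iend = lorbit
     do while (iend <= npn)
        ! Array-section copy replaces the explicit ie loop.
        iie = iend - istart + 1
        p2(1:iie) = ap2(istart:iend)
        total(il) = total(il) + sum(p2(1:iie)) * il
        istart = iend + 1
        iend = iend + lorbit
     end do
  end do
  !$omp end parallel do

  print '(4f8.1)', total
end program private_scratch_demo
```

Because each thread owns its copy of `p2` and writes only its own `total(il)`, the printed values do not depend on the number of threads.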

roddur · ‎02-24-2009

Hello Jim,
I tried that one as well, but it did not help much. Can I send you all the files? Hopefully then you can suggest something.
I don't want to put the whole code here, so will you please allow me to send it to you personally?

jimdempseyatthecove · ‎02-24-2009

Quoting - roddur

Hello Jim,
I tried that one as well, but it did not help much. Can I send you all the files? Hopefully then you can suggest something.
I don't want to put the whole code here, so will you please allow me to send it to you personally?

I don't mind taking a quick look, assuming it compiles and includes any input files. I develop on the Windows platform, so there may be some issues in getting your makefile to work. If you are using 3rd party libraries we may have a problem. I do have an Ubuntu Linux system here, but I haven't gotten around to learning how to develop on it (documentation sucks).

Send to jim_dempsey@ameritech.net

Put "problem in serial region of openmp problem" in subject line.

Jim

Ron_Green · ‎02-24-2009

Jim,

I've updated a KB article on using Ubuntu with the Intel Compilers. The trick is that Ubuntu installs a rather standard Desktop linux environment. To get it ready for development, you have to load up gcc, g++ and the compatibility libs. The specific commands to do this are listed in the KB:

http://software.intel.com/en-us/articles/using-intel-compilers-for-linux-with-ubuntu/

jimdempseyatthecove · ‎02-24-2009

Quoting - Ronald Green (Intel)

Jim,

I've updated a KB article on using Ubuntu with the Intel Compilers. The trick is that Ubuntu installs a rather standard Desktop linux environment. To get it ready for development, you have to load up gcc, g++ and the compatibility libs. The specific commands to do this are listed in the KB:

http://software.intel.com/en-us/articles/using-intel-compilers-for-linux-with-ubuntu/

The problem is in trying to understand the IDE (placement of files etc... oops, shouldn't use etc).

Coming from Windows development, much of what I see in the IDE doesn't make sense.

When working with the analog of a solution with multiple projects, the IDE seems to want to stick everything in one folder, and it won't let you create a folder on the fly. I am probably missing something, but this could be cleared up with documentation that includes a tutorial on "How to use the IDE to create a 'Hello World' program", followed by multi-project solutions.

Jim