Solved: OpenMP fortran code seems to work but perhaps shouldn't

Vahid9 · ‎12-26-2023

Hi,

The following code gives identical output serially and with OpenMP (my first OpenMP code). I am trying to understand if this is pure luck or not. The code does not have CRITICAL or ATOMIC constructs. Should such constructs be used in this code to avoid data racing? The agreement between the serial and parallel versions suggests no data racing but I have a feeling that I am missing something here.

Are constructs such as CRITICAL or ATOMIC needed here?

Thanks,

Vahid

!       program SEE.f90

        use omp_lib

        implicit none
                
        INTEGER, PARAMETER :: DP = selected_real_kind(14,200)
        REAL(DP), PARAMETER :: zero   = 0.d0
        REAL(KIND=DP), parameter :: kb=8.6173324d-02 ! meV/K
        INTEGER, PARAMETER :: Nk=9936
        INTEGER, PARAMETER :: Nk_BZ=216000
        INTEGER, PARAMETER :: nbnd=4
        INTEGER ::  i,j,dummy,m,n, eqv(Nk_BZ)
        REAL(KIND=DP)    :: deltaE,tot_delta,delta, inv_tau_delta, tot_inv_tau_delta
        REAL(KIND=DP)    :: temp,  degaussw, E, kT
        REAL(KIND=DP), allocatable   :: SKE(:,:,:), ek(:,:)
        REAL(KIND=DP), allocatable    :: SEE(:,:)
        REAL, EXTERNAL :: w0gauss

        temp=300.0
        kT=kb*temp ! meV
        
        allocate (SKE(Nk,nbnd,0:3100))
        allocate (ek(Nk,nbnd))
        allocate (SEE(0:3100,0:3100))

        OPEN(10,file="SKE_new.dat")
        OPEN(15,file="EPW_scatt_rates.dat")
        OPEN(14,file="bz2wedge_k.dat")
        OPEN(20,file="SEE.dat")

        ! Skip header
        READ(10,*)
        READ(14,*)

        DO i=1,Nk
        DO j=1,nbnd
        DO m=0,3100
          READ(10,*) dummy,SKE(i,j,m)
        ENDDO
          READ(15,*) ek(i,j), dummy
        ENDDO
        ENDDO

        ! Read the mapping between loop index and eqv(i)
        DO i=1,Nk_BZ
          READ(14,*) dummy, eqv(i)
        ENDDO

        degaussw=0.01

        deltaE=1.0d-3
        
        SEE(:,:)=zero

             !$omp parallel do shared (SEE,degaussw,SKE,ek,eqv) private (i,j,n,m,E,delta,inv_tau_delta,tot_delta,tot_inv_tau_delta)

        DO m=0,3100
         DO n=0,3100
           
           tot_delta=zero
           tot_inv_tau_delta=zero
           E=n*DeltaE

              DO i=1,Nk_BZ
               DO j=1,nbnd
               delta = w0gauss((E-ek(eqv(i),j))/degaussw)/degaussw
               inv_tau_delta=SKE(eqv(i),j,m)*delta
               tot_delta=tot_delta+delta
               tot_inv_tau_delta=tot_inv_tau_delta+inv_tau_delta
               ENDDO ! j
              ENDDO ! i

              SEE(n,m)=tot_inv_tau_delta/tot_delta

          ENDDO ! n
         ENDDO ! m
 
        !$omp end parallel do
        
         DO n=0,3100
          DO m=0,3100
           WRITE(20,"(2I8,ES18.8E3)") n, m,  SEE(n,m)
         
          ENDDO ! n
         ENDDO ! m

         deallocate(SEE)
         deallocate(SKE)
         deallocate(ek)
        
         CLOSE(10)
         CLOSE(14)
         CLOSE(15)
         CLOSE(20)
         
    STOP
    END

FUNCTION w0gauss (x)
  IMPLICIT NONE
  INTEGER, PARAMETER :: DP = selected_real_kind(14,200)
  REAL(DP), PARAMETER :: sqrtpi = 1.77245385090551602729_DP
  REAL(DP), PARAMETER :: sqrtpm1= 1.0_DP / sqrtpi
  REAL(DP) :: w0gauss, x, arg

  arg = min (200.d0, x**2)
  w0gauss = exp ( - arg) * sqrtpm1
  return
END FUNCTION

jimdempseyatthecove · ‎12-26-2023

Of the shared variables (SEE,degaussw,SKE,ek,eqv), only the array SEE is written to, .AND. one of its subscripts is the loop control variable of the parallel loop. Therefore, each thread writes to different locations (elements) of the array SEE. IOW, the parallel loop is thread safe.

The function w0gauss is also thread-safe.

Jim Dempsey

View solution in original post

jimdempseyatthecove · ‎12-26-2023

Of the shared variables (SEE,degaussw,SKE,ek,eqv), only the array SEE is written to, .AND. one of its subscripts is the loop control variable of the parallel loop. Therefore, each thread writes to different locations (elements) of the array SEE. IOW, the parallel loop is thread safe.

The function w0gauss is also thread-safe.

Jim Dempsey

Vahid9 · ‎12-26-2023

Thanks Jim. I now understand why the code works.

Best,

Vahid