
OpenMP Fortran Windows 10

I have Fortran code that works fine when I compile it for 32-bit Windows 10, but it does not work when I compile it for 64-bit Windows 10. On a 64-bit computer it just stops at:   !$OMP DO SCHEDULE(STATIC,chunk)

These are the switches I use:

ifort a.f90 libiomp5md.lib /heap-arrays /assume:byterecl /assume:buffered_io /Qip- /Ob0 /Qopenmp /auto-scalar /exe:a.exe 


subroutine colsol(a,v,ColTop,ColDONE,maxa,nn,kkk,na,nn1,ierr)
!   **************************************************************
!   *   Cholesky  Factorisation
!   ***************************************************************    
      implicit none
      real*8 a(na),v(nn),b,c
      integer*4 maxa(nn1),nn,l,n,kk,ic,nd,ki,j,k,nn1,na,kh,kl,kn,i_cnt,i_cnt_old, klt,ku,kkk,ierr
      real*8 sum1, amaxak
      integer *4  ColTop(nn),ColDONE(nn)     !...Cholesky
      integer *4  i,TOPij, chunk
      integer *4  iperct,iperct1, maxai, maxaj
       !...prepare 'ColTop'   
       do i = 1, nn
           ColTop(i) = i - (maxa(i + 1) - maxa(i)) + 1
       end do  
       !...Columns Done
       do i = 1, nn
           ColDONE(i) = 0   !... mark all columns as not done '0'
       end do  
       !...factorisation  (Skyline)
        a(1) = dSqrt(a(1))
        ColDONE(1) = 1   !...column 1 is done
!$OMP PARALLEL PRIVATE (i,j,k,maxaj,maxai,sum1,amaxak,TOPij) 
!$OMP DO SCHEDULE(STATIC,chunk)        
        do j = 2, nn                   !...loop for COL from 2 to nn
           maxaj=maxa(j) + j 
           do i = ColTop(j), j - 1     !...loop for ROW from top going down to diagonal

                !...wait until column 'i' is done
                do while(ColDONE(i) .ne. 1)
                end do
                sum1 = 0.0d0
                TOPij = Max(ColTop(i), ColTop(j))     !...find min column height for dot product
                maxai=maxa(i) + i 
                do k = TOPij, i - 1
                    sum1 = sum1 + a(maxai- k) * a(maxaj - k)
                end do
                a(maxaj - i) = (a(maxaj - i) - sum1) / a(maxai - i)
            end do
            ! diagonal term J separately
            sum1 = 0.0d0
            do k = ColTop(j), j-1
                amaxak=a(maxaj - k)
                sum1 = sum1 + amaxak * amaxak
            end do
            a(maxa(j)) = dSqrt(a(maxa(j)) - sum1)
            ColDONE(j) = 1    !...column 'j' is done
        end do



2 Replies
New Contributor II

The problem could lie either with the compiler options you are now using or with changes to the compiler's approach to optimisation for these options.

ifort could be optimising away the empty DO WHILE loop, as nothing inside the loop changes ColDONE(i).
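One way to make such a wait loop visible to the compiler is to read and write the flag with OpenMP atomic constructs and to flush around them. The following is only a minimal sketch under stated assumptions, not the poster's solver: the program name `spin_wait_demo`, the arrays `done` and `val`, and the literal chunk size of 1 are all made up for illustration (in the posted excerpt `chunk` is never assigned a value). It assumes a compiler that supports OpenMP 3.1 or later.

```fortran
program spin_wait_demo
   implicit none
   integer, parameter :: n = 8
   integer :: done(n), val(n)      ! hypothetical stand-ins for ColDONE and the matrix data
   integer :: j, flag

   done = 0
   val  = 0
   val(1)  = 1
   done(1) = 1                     ! "column" 1 is finished before the parallel loop

   !$omp parallel do schedule(static,1) private(flag)
   do j = 2, n
      ! Spin until column j-1 is marked done. The atomic read forces a
      ! fresh load on every pass, so the loop cannot be hoisted or removed.
      do
         !$omp atomic read
         flag = done(j-1)
         if (flag == 1) exit
      end do
      !$omp flush                  ! make the data written by the other thread visible

      val(j) = val(j-1) + 1        ! work that depends on column j-1

      !$omp flush                  ! publish val(j) before raising the flag
      !$omp atomic write
      done(j) = 1
   end do
   !$omp end parallel do

   print *, 'val(n) =', val(n)
   if (val(n) /= n) error stop 'dependency chain broken'
end program spin_wait_demo
```

Each thread works through its iterations in increasing order under a static schedule, so the owner of the lowest unfinished column can always proceed and the waits do not deadlock. Compiled without OpenMP, the directives are plain comments and the loop runs sequentially with the same result.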

You may be better off selecting a lower optimisation level and replacing the inner loops with dot_product or an optimised vector routine.
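As a sketch of the dot_product suggestion, the inner `k` loop of the posted routine can be written with two array sections of `a`. The index names (`i`, `topij`, `maxai`, `maxaj`) follow the post, but the array size and the test values below are hypothetical and chosen only so that the two forms can be compared.

```fortran
program dot_product_demo
   implicit none
   integer, parameter :: na = 20
   double precision :: a(na), sum_loop, sum_dot
   integer :: k, i, topij, maxai, maxaj

   ! Arbitrary test data and indices (assumptions, not from the post)
   do k = 1, na
      a(k) = dble(k)
   end do
   i = 5
   topij = 2
   maxai = 10
   maxaj = 18

   ! Loop form, as in the posted routine
   sum_loop = 0.0d0
   do k = topij, i - 1
      sum_loop = sum_loop + a(maxai - k) * a(maxaj - k)
   end do

   ! Same sum with the intrinsic: k = i-1 maps to the first element of each
   ! section and k = topij to the last. If topij > i-1 the sections have
   ! zero size and dot_product returns 0, matching the empty loop.
   sum_dot = dot_product(a(maxai-i+1 : maxai-topij), a(maxaj-i+1 : maxaj-topij))

   print *, sum_loop, sum_dot
   if (abs(sum_loop - sum_dot) > 1.0d-12) error stop 'loop and dot_product differ'
end program dot_product_demo
```

Because the storage runs backwards in `k`, both sections are reversed together, which leaves the pairing of terms unchanged.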

This is an interesting approach to COLSOL / omp. Why use Cholesky, given that "a(maxa(j)) = dSqrt(a(maxa(j)) - sum1)" requires a positive definite matrix, while other COLSOL approaches do not? I would be interested to know the history of this routine, as it has a backwards storage order for A.
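To illustrate the positive definite requirement, here is a small sketch that factorises a dense 2x2 matrix by hand and checks the quantity under the square root before taking it. The matrix, the program name, and the error code value 2 are hypothetical; the posted routine has an `ierr` argument, but the excerpt does not show how it is used.

```fortran
program cholesky_pivot_check
   implicit none
   double precision :: a(2,2), pivot
   integer :: ierr

   ! Hypothetical symmetric but indefinite matrix [[1, 2], [2, 1]]
   a = reshape([1.0d0, 2.0d0, 2.0d0, 1.0d0], [2, 2])

   ierr = 0
   a(1,1) = sqrt(a(1,1))            ! first diagonal term of the factor
   a(2,1) = a(2,1) / a(1,1)         ! off-diagonal term
   pivot  = a(2,2) - a(2,1)**2      ! value that would go under the square root

   if (pivot <= 0.0d0) then
      ierr = 2                      ! not positive definite: the square root would fail
   else
      a(2,2) = sqrt(pivot)
   end if

   print *, 'pivot =', pivot, ' ierr =', ierr
end program cholesky_pivot_check
```

For this matrix the second pivot is 1 - 4 = -3, so the factorisation is rejected instead of producing a NaN.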


New Contributor II

I adapted your approach to a COLSOL - Crout solver and found a problem with your DO WHILE loop being optimised away. I did get it to work with a more complex wait loop and included a timer call for the first wait cycle:

      DO Jeq = JB,JT
!       Wait until this column is complete
         iw = 0
         DO
           if ( NA_done(JEQ) ) exit
           call small_delay (iw)
           iw = iw + 1
         END DO

  subroutine small_delay (iw)
!     Calls the timer on the first wait cycle only
      integer*4 :: iw
      integer*8 :: tick
      integer*8 QueryPerformance_tick
      external  QueryPerformance_tick
      if ( iw==0 ) tick = QueryPerformance_tick ()
  end subroutine small_delay

Your OMP solver approach works well for small problems but becomes constrained by a cache-memory bottleneck for larger problems.
