I have Fortran code that works fine when I compile it for 32-bit Windows 10, but it does not work when I compile it for 64-bit Windows 10. On the 64-bit computer it just stops at: !$OMP DO SCHEDULE(STATIC,chunk)
These are the switches I use:
ifort a.f90 libiomp5md.lib /heap-arrays /assume:byterecl /assume:buffered_io /Qip- /Ob0 /Qopenmp /auto-scalar /exe:a.exe
subroutine colsol(a,v,ColTop,ColDONE,maxa,nn,kkk,na,nn1,ierr)
! **************************************************************
! * Cholesky Factorisation
! ***************************************************************
implicit none
real*8 a(na),v(nn),b,c
integer*4 maxa(nn1),nn,l,n,kk,ic,nd,ki,j,k,nn1,na,kh,kl,kn,i_cnt,i_cnt_old, klt,ku,kkk,ierr
real*8 sum1, amaxak
integer *4 ColTop(nn),ColDONE(nn) !...Cholesky
integer *4 i,TOPij, chunk
integer *4 iperct,iperct1, maxai, maxaj
!-----------------------------------------------------------
ierr=0
iperct=0
iperct1=0
chunk=1
!...prepare 'ColTop'
do i = 1, nn
   ColTop(i) = i - (maxa(i + 1) - maxa(i)) + 1
end do
!...Columns Done
do i = 1, nn
   ColDONE(i) = 0 !... mark all columns as not done '0'
end do
!---------------------------------------------------------------------------------
!...factorisation (Skyline)
a(1) = dSqrt(a(1))
ColDONE(1) = 1 !...colum 1 is done
!$OMP PARALLEL PRIVATE (i,j,k,maxaj,maxai,sum1,amaxak,TOPij)
!$OMP DO SCHEDULE(STATIC,chunk)
do j = 2, nn !...loop for COL from 2 to nn
   maxaj=maxa(j) + j
   do i = ColTop(j), j - 1 !...loop for ROW from top going down to diagonal
      !...wait intill Colum 'i' is done
      do while(ColDONE(i) .ne. 1)
      end do
      sum1 = 0.0d0
      TOPij = Max(ColTop(i), ColTop(j)) !...find min column height for dot product
      maxai=maxa(i) + i
      do k = TOPij, i - 1
         sum1 = sum1 + a(maxai- k) * a(maxaj - k)
      end do
      a(maxaj - i) = (a(maxaj - i) - sum1) / a(maxai - i)
   end do
   !...do diagonal term J separatelly
   sum1 = 0.0d0
   do k = ColTop(j), j-1
      amaxak=a(maxaj - k)
      sum1 = sum1 + amaxak * amaxak
   end do
   a(maxa(j)) = dSqrt(a(maxa(j)) - sum1)
   ColDONE(j) = 1 !...colum 'j' is done
end do
!$OMP END DO
!$OMP END PARALLEL
return
end
The problem could be either with the compiler options you are now using or changes to the compiler's approach to optimisation for these options.
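ifort could be optimising the DO WHILE loop, as nothing inside the loop changes ColDONE(i): the compiler may assume the value cannot change, read it once, and never see the update made by another thread.
You may be better off selecting a lower optimisation level and replacing the inner loops with dot_product or an optimised vector routine.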
This is an interesting approach to COLSOL / omp. Why use Cholesky, as "a(maxa(j)) = dSqrt(a(maxa(j)) - sum1)" requires a positive definite matrix, while other COLSOL approaches do not? I would be interested to know the history of this routine, as it has a backwards storage order for A.
John
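The dot_product suggestion above can be sketched as follows. This is only an illustration, not the original poster's routine: the array contents and the values of i, TOPij, maxai and maxaj are hypothetical, chosen to show that the explicit k loop and the array-slice form address the same elements of the backwards-stored array.

```fortran
program dot_product_sketch
   implicit none
   ! Hypothetical sizes and index values, for illustration only.
   integer, parameter :: na = 12
   double precision :: a(na), sum_loop, sum_dot
   integer :: k, i, TOPij, maxai, maxaj

   do k = 1, na
      a(k) = dble(k)
   end do
   i = 4
   TOPij = 2
   maxai = 6
   maxaj = 11

   ! Explicit inner loop, as written in the posted colsol routine.
   sum_loop = 0.0d0
   do k = TOPij, i - 1
      sum_loop = sum_loop + a(maxai - k) * a(maxaj - k)
   end do

   ! Same sum with dot_product: k = TOPij..i-1 maps to the index
   ! ranges maxai-i+1..maxai-TOPij and maxaj-i+1..maxaj-TOPij.
   ! If TOPij > i-1 both slices are zero-sized and the result is 0.
   sum_dot = dot_product(a(maxai - i + 1 : maxai - TOPij), &
                         a(maxaj - i + 1 : maxaj - TOPij))

   print *, sum_loop, sum_dot
end program dot_product_sketch
```

Both printed sums should agree. Whether the slice form is actually faster would depend on the compiler and optimisation level, so it is worth timing on the real problem.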
I adapted your approach to a COLSOL - Crout solver and found a problem with your DO WHILE loop being optimised away. I did get it to work with a more complex wait loop and included a timer call for the first wait cycle:
DO Jeq = JB,JT
!
!  Wait until this column is complete
   iw = 0
   DO
      if ( NA_done(JEQ) ) exit
      call small_delay (iw)
      iw = iw + 1
   END DO
   ...

subroutine small_delay (iw)
   integer*4 :: iw
   integer*8 :: tick
   integer*8 QueryPerformance_tick
   external QueryPerformance_tick
!
   if ( iw==0 ) tick = QueryPerformance_tick ()
end subroutine small_delay
Your OMP solver approach works well for small problems but becomes constrained by a cache - memory bottleneck for larger problems.
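As a possible alternative to the timer-based wait loop above, here is a minimal sketch of a spin-wait that uses OpenMP atomic read/write plus flush, so the compiler cannot hoist the flag load out of the loop. It assumes a compiler supporting OpenMP 3.1 or later. The program, the arrays done and x, and the simple chain dependency (each entry waits only on its predecessor) are hypothetical stand-ins, not the skyline solver itself.

```fortran
program spin_wait_sketch
   implicit none
   ! Hypothetical problem: x(j) depends on x(j-1), mimicking
   ! "column j must wait until an earlier column is done".
   integer, parameter :: n = 1000
   integer :: done(n), x(n)
   integer :: j, flag

   done = 0
   x = 0
   x(1) = 1
   done(1) = 1

   !$omp parallel do schedule(static,1) private(j, flag)
   do j = 2, n
      ! Spin until entry j-1 is published. The atomic read forces
      ! a fresh load of the flag on every pass of the loop.
      do
         !$omp atomic read
         flag = done(j-1)
         if (flag == 1) exit
      end do
      ! Make the producer's data visible before using it.
      !$omp flush
      x(j) = x(j-1) + 1
      ! Make x(j) visible before publishing the flag.
      !$omp flush
      !$omp atomic write
      done(j) = 1
   end do
   !$omp end parallel do

   print *, 'x(n) =', x(n)
end program spin_wait_sketch
```

With correct synchronisation x(n) should equal n, the same result as a serial run. As in the posted solver, forward progress relies on schedule(static,1) handing each thread its iterations in increasing order, so the lowest unfinished entry is always being worked on.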