Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

ifx and ifort comparison for do concurrent

Fortran10
Beginner

Below is my do concurrent code.

Question 1) Why does the code not run in parallel with ifx? I do not see any error or warning.

program dc_test
  implicit none
  
  !====
  integer(4),parameter::Nx=300,Ny=300
  real(8) :: x=0.03d0,y=0.03d0,t_final
  real(8), parameter :: one=1.0,two=2.0,four=4.0,half=0.5
  integer(4) ::steps,i,j,ip,im,jp,jm,t_start,t_end,rate
  real(8):: dt=1.0d-4,tau=0.0003d0,ep=0.01d0,ka=1.8d0,seed=5.0d0,ppo,t1,t2,th,m
  real(8):: del=0.02d0,an=6.0d0,al=0.9d0,ga=10.0d0,te=1.0d0,t0=0.2d0,pix=four*atan(one)
  real(8),dimension(Nx,Ny) :: pp,tt,lpp,ltt,ppx,ppy,eps,deps
  
  !====
  pp = 0.0
  tt = 0.0
  
  do i = 1, Nx
    do j = 1, Ny
      if ((i-Nx/two)*(i-Nx/two)+(j-Ny/two)*(j-Ny/two)<seed)pp(i,j)=one
    end do
  end do
  
  !====
  call system_clock (count=t_start, count_rate=rate)
  
  do steps = 1,1000
    
    do concurrent (integer::j=1:Ny,i=1:Nx) default (none)             &
      local ( ip,im,jp,jm,th )                                        &
      shared ( x,y,lpp,ltt,pp,tt,ppx,ppy,eps,deps,t0,an,del,ep )              
      
      jp = j + 1
      jm = j - 1
      ip = i + 1
      im = i - 1
      
      if ( im == 0 ) im = Nx
      if ( ip == ( Nx + 1) ) ip = 1
      if ( jm == 0 ) jm = Ny
      if ( jp == ( Ny + 1) ) jp = 1
      
      lpp(i,j) = (pp(ip,j)+pp(im,j)+pp(i,jm)+pp(i,jp)-four*pp(i,j))/(x*y)
      ltt(i,j) = (tt(ip,j)+tt(im,j)+tt(i,jm)+tt(i,jp)-four*tt(i,j))/(x*y)
      
      ppx(i,j) = (pp(ip,j) - pp(im,j))/x
      ppy(i,j) = (pp(i,jp) - pp(i,jm))/y
      
      th  = atan2( ppy(i,j),ppx(i,j) )
      
      eps(i,j)  =  ep*(one+del*cos(an*(th-t0)))
      deps(i,j) = -ep*an*del*sin(an*(th-t0))
      
    end do
    
    do concurrent (integer::j=1:Ny,i=1:Nx) default (none)            &
      local( i,j,ip,im,jp,jm,ppo,t1,t2,m)                                 &
      shared(x,y,pp,tt,eps,deps,ppx,ppy,lpp,ltt,al,pix,te,ga,dt,tau,ka)
      
      jp = j + 1
      jm = j - 1
      ip = i + 1
      im = i - 1
      
      if ( im == 0 ) im = Nx
      if ( ip == ( Nx + 1) ) ip = 1
      if ( jm == 0 ) jm = Ny
      if ( jp == ( Ny + 1) ) jp = 1
      
      ppo = pp(i,j)
      
      t1 =  ( eps(i,jp)*deps(i,jp)*ppx(i,jp) - eps(i,jm)*deps(i,jm)*ppx(i,jm) ) / y
      t2 = -( eps(ip,j)*deps(ip,j)*ppy(ip,j) - eps(im,j)*deps(im,j)*ppy(im,j) ) / x
      
      m = al/pix*atan(ga*(te-tt(i,j)))
      
      pp(i,j) = pp(i,j)+(dt/tau)*(t1+t2+eps(i,j)**2*lpp(i,j) ) &
      +  ppo*(one-ppo)*(ppo-half+m)
      tt(i,j) = tt(i,j)+dt*ltt(i,j)+ka*(pp(i,j)-ppo)
      
    end do  
    
  end do
  
  call system_clock (count=t_end)
  t_final = real(max(t_end-t_start,1_8))/real(rate)
  
  print*, t_final  
end program dc_test

ifx test

The run time shows no sign of parallelization.

>ifx main_dc.f90 /Qopenmp /F500000000
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2024.2.0 Build 20240602
Copyright (C) 1985-2024 Intel Corporation. All rights reserved.

Microsoft (R) Incremental Linker Version 14.32.31332.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:main_dc.exe
-subsystem:console
-stack:500000000
-defaultlib:libiomp5md.lib
-nodefaultlib:vcomp.lib
-nodefaultlib:vcompd.lib
main_dc.obj

>main_dc
   4.00899982452393

ifort test

The run time confirms parallelization.

>ifort main_dc.f90 /Qopenmp /F500000000
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.13.0 Build 20240602_000000
Copyright (C) 1985-2024 Intel Corporation. All rights reserved.

ifort: remark #10448: Intel(R) Fortran Compiler Classic (ifort) is now deprecated and will be discontinued late 2024. Intel recommends that customers transition now to using the LLVM-based Intel(R) Fortran Compiler (ifx) for continued Windows* and Linux* support, new language support, new language features, and optimizations. Use '/Qdiag-disable:10448' to disable this message.
Microsoft (R) Incremental Linker Version 14.32.31332.0
Copyright (C) Microsoft Corporation. All rights reserved.

-out:main_dc.exe
-subsystem:console
-stack:500000000
-defaultlib:libiomp5md.lib
-nodefaultlib:vcomp.lib
-nodefaultlib:vcompd.lib
main_dc.obj

>main_dc
0.282000005245209

Question 2) According to the Intel documentation, one of the rules for variable-name in a locality-spec is: "variable-name can not be the same as index-name of the same do concurrent statement." So I expected the compiler to issue an error or warning for the second loop, because in the do concurrent statement

      do concurrent (integer::j=1:Ny,i=1:Nx) default (none) &
      local( i,j,ip,im,jp,jm,ppo,t1,t2,m) 
      ...

 

i and j are index-names, I think, yet they also appear in the local list.
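For comparison, here is a minimal sketch of a header that follows the quoted rule: the index-name is declared in the concurrent header and left out of the local list. The program name, the array names a and b, the size n, and the self-check at the end are made up for illustration; only ip and im mirror the code above. It says nothing about whether ifx or ifort should diagnose the other form.

```fortran
program dc_locality_sketch
  implicit none
  integer, parameter :: n = 8          ! hypothetical size, not from the post
  integer :: ip, im, k
  real(8) :: a(n), b(n), expected(n)

  a = [(real(k, 8), k = 1, n)]         ! a = 1, 2, ..., n

  ! i is the index-name: it is declared in the header and is NOT listed in LOCAL.
  ! ip and im are scratch scalars, so each iteration gets its own copy via LOCAL.
  do concurrent (integer :: i = 1:n) default (none) &
    local (ip, im) shared (a, b)
    ip = i + 1
    im = i - 1
    if (im == 0) im = n                ! periodic wrap, as in the original loops
    if (ip == n + 1) ip = 1
    b(i) = a(ip) - a(im)
  end do

  ! Interior points give (i+1)-(i-1) = 2; the two wrapped ends give -6 for n = 8.
  expected = 2.0d0
  expected(1) = -6.0d0
  expected(n) = -6.0d0
  if (any(b /= expected)) error stop 'unexpected result'
  print *, 'locality check passed'
end program dc_locality_sketch
```

Because Nx and Ny in the original code are named constants rather than variables, they need no locality-spec under default (none); n is used the same way here.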

 

Question 3) The option /Qopt-report-phase:openmp works with ifort but not with ifx. How can I get a report to check whether ifx has successfully parallelized the code?

4 Replies
Andrew_Smith
Valued Contributor I

The amount of work inside the do concurrent loop is probably too small compared to the overhead of running in parallel, so it actually runs slower. You did not show comparative serial and parallel times to support your assumption.

Fortran10
Beginner

The output shows the time for both tests; it is the last line of each run. Here they are again:

ifort time = 0.282000005245209
ifx time   = 4.00899982452393

Andrew_Smith
Valued Contributor I

I used Task Manager and confirmed that without the /Qopenmp flag, DO CONCURRENT does not produce parallel code.

I had to increase the stack size when using /Qopenmp. I have no idea why, since the thread data should be small: only scalar variables are declared as LOCAL.

These are my results:

        Serial        Parallel (/Qopenmp)
IFX     2.707999945   4.796
IFORT   2.092999935   0.81


Task manager confirmed that the IFX parallel version did not run in parallel but the IFORT version did.
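The same check can be approximated from inside the program by comparing CPU time with wall-clock time. The sketch below is not the original test case: the program name, the array a, the size n, and the loop body are invented for illustration. It assumes that cpu_time reports CPU time summed over all threads of the process, which is processor-dependent and should be confirmed for the compiler in use. Under that assumption, a cpu/wall ratio near 1 suggests serial execution, and a ratio well above 1 suggests several threads were busy.

```fortran
program dc_cpu_vs_wall
  implicit none
  integer, parameter :: n = 2000       ! hypothetical problem size
  integer(8) :: c0, c1, rate
  real(8) :: cpu0, cpu1, wall, cpu
  real(8), allocatable :: a(:,:)       ! allocatable, so no large stack is needed
  integer :: rep

  allocate(a(n, n))
  a = 0.0d0

  call system_clock(count=c0, count_rate=rate)
  call cpu_time(cpu0)

  do rep = 1, 20
    ! Every iteration touches only its own element, so the iterations are independent.
    do concurrent (integer :: j = 1:n, i = 1:n)
      a(i, j) = a(i, j) + sin(real(i, 8)) * cos(real(j, 8))
    end do
  end do

  call cpu_time(cpu1)
  call system_clock(count=c1)

  wall = real(c1 - c0, 8) / real(rate, 8)
  cpu  = cpu1 - cpu0

  print *, 'wall time (s): ', wall
  print *, 'cpu time  (s): ', cpu
  print *, 'cpu/wall ratio:', cpu / max(wall, tiny(wall))
  print *, 'checksum:      ', sum(a)   ! keeps the loop from being optimized away
end program dc_cpu_vs_wall
```

Building it once with and once without /Qopenmp, for both ifx and ifort, gives four ratios that can be set against the table above.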

Devorah_H_Intel
Moderator

For Question 3, see the Intel Fortran Compiler Porting Guide, which also contains other useful information on ifx.
