Intel® Fortran Compiler

Does ifx support data transfer management for "DO CONCURRENT" now?

cu238
Novice
236 Views

I recently rewrote a 3D numerical model as GPU offload code, for both ifx and nvfortran. The ifx version uses OpenMP directives. The nvfortran version uses standard Fortran "DO CONCURRENT". On an Intel B580 GPU the model runs about 10x faster than the serial code on an Intel Core i5-12400F CPU, while on an Nvidia A100 GPU it runs about 25x faster.

 

I didn't use "DO CONCURRENT" with ifx because it is extremely slow: data is transferred between host and device all the time. To solve this issue, nvfortran offers 2 useful options: A. "-stdpar=gpu"; B. "-stdpar=gpu -acc=gpu -gpu=nomanaged". Both options are fast. Option A uses unified memory and manages data transfer automatically, on demand. Option B avoids most of the data transfer.

 

So does ifx support data transfer management for "DO CONCURRENT" now? An equivalent of either option above would be great. It would save a lot of coding time and keep the ifx and nvfortran versions similar. I think option B should not be hard for a compiler to support.

 

I hope to see it soon. Thanks!

 

2 Replies
Ron_Green
Moderator
165 Views

Yes. See the option:

-fopenmp-do-concurrent-maptype-modifier[=modifier]

 

You will be responsible for the OpenMP data mapping yourself. See the example included in the documentation for this option.
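As a rough illustration of what "being responsible for the data mapping" can look like, here is a minimal free-form sketch, not the example from the documentation. The program name, array names, and sizes are hypothetical. It assumes DO CONCURRENT offload is enabled (for example with -fiopenmp -fopenmp-targets=spir64 -fopenmp-target-do-concurrent) and that the modifier is given as "present"; please check the accepted modifier values against the documentation for your ifx version.

```fortran
! Sketch only: dc_map_sketch, a, b, n and nsteps are hypothetical names.
! Assumed (unverified) build line:
!   ifx -fiopenmp -fopenmp-targets=spir64 -fopenmp-target-do-concurrent
!       -fopenmp-do-concurrent-maptype-modifier=present dc_map_sketch.f90
program dc_map_sketch
  implicit none
  integer, parameter :: n = 1000, nsteps = 10
  real, allocatable :: a(:), b(:)
  integer :: i, step

  allocate(a(n), b(n))
  a = 1.0
  b = 0.0

  ! Copy the arrays to the device once, before the time loop.
  !$omp target enter data map(to: a, b)

  do step = 1, nsteps
    ! With offload enabled, this loop is expected to reuse the device
    ! copies made above instead of moving a and b on every pass.
    do concurrent (i = 1:n)
      b(i) = b(i) + a(i)
    end do
  end do

  ! Copy the result back and release the device copies.
  !$omp target exit data map(from: b) map(delete: a)

  ! Every element of b should now hold 10.0 (nsteps additions of 1.0).
  print *, 'b(1) =', b(1), ' b(n) =', b(n)
end program dc_map_sketch
```

Without any OpenMP options the directives are treated as comments and the loop runs on the host, so the same source can be used to check results before offloading.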

cu238
Novice
94 Views

Great to see this happening!

 

I tested my code with every loop in "DO CONCURRENT" form and it fails. After some debugging I narrowed the failure down to the single loop below. With this loop, "forrtl: severe (157): Program Exception - access violation" follows " -> INFO_EXCH_GPU.0".

      ELSE
      print*,"-> INFO_EXCH_GPU.0"
          DO concurrent (J = 1:8,I = 1:N_CTRD_EVG)
            MASK4VAR(J,I) = MASK(NAG_EVG(J,I))
          ENDDO
      ENDIF
      print*,"-> INFO_EXCH_GPU.1"

After changing this part back to the original OpenMP version below, "forrtl: severe (157): Program Exception - access violation" follows " -> INFO_EXCH_GPU.1" instead.

      ELSE
      print*,"-> INFO_EXCH_GPU.0"
c          DO concurrent (J = 1:8,I = 1:N_CTRD_EVG)
c            MASK4VAR(J,I) = MASK(NAG_EVG(J,I))
c          ENDDO
!$OMP TARGET DEFAULTMAP(present: allocatable)
!$OMP TEAMS DISTRIBUTE PARALLEL DO COLLAPSE(2)
        DO I = 1,N_CTRD_EVG
          DO J = 1,8
            MASK4VAR(J,I) = MASK(NAG_EVG(J,I))
          ENDDO
        ENDDO
!$OMP END TARGET
      ENDIF
      print*,"-> INFO_EXCH_GPU.1"

This part of the code is in SUBROUTINE INFO_EXCH_GPU. The "DO CONCURRENT" loops that run before this subroutine is called report no errors. The difference is that SUBROUTINE INFO_EXCH_GPU receives data used on the GPU through its dummy arguments. These arguments include MASK and NAG_EVG, shown below.

	SUBROUTINE INFO_EXCH_GPU(VTYPE,KIN,VA,MASK,
     *NAG_EVG,WT_EVG,NAG_IVG,WT_IVG)
  
      USE MOD_GLOBAL
      IMPLICIT NONE
      INTEGER I,J,KIN
	INTEGER VTYPE
      INTEGER NAG_EVG(NMAX,N_CTRD_EVG),NAG_IVG(NMAX,N_CTRD_IVG)
      REAL WEIALL, SUMALL
      REAL VAMAX
      REAL VA(N_CTRDP1)
!      REAL VA_AG(N_CTRD_AG) !,VA_EVG(N_CTRD_EVG),VA_IVG(N_CTRD_IVG)
      REAL MASK(N_CTRDP1),L4IE !, L4IE(N_CTRD)

The "!$OMP TARGET" form seems to understand what these arguments really are, while the "DO CONCURRENT" form fails.
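To help isolate this, a stand-alone reproducer with the same shape (explicit-shape dummy arguments, sizes taken from a module, an indirect index inside "DO CONCURRENT") might look like the free-form sketch below. Everything in it is hypothetical: mod_sizes only stands in for MOD_GLOBAL, gather_mask for INFO_EXCH_GPU, and the sizes and fill values are made up. It has not been shown to trigger the access violation; it is only a starting point for a bug report.

```fortran
! Hypothetical reproducer; mod_sizes stands in for MOD_GLOBAL.
module mod_sizes
  implicit none
  integer, parameter :: nmax = 8
  integer :: n_ctrd_evg = 0, n_ctrdp1 = 0
  real, allocatable :: mask4var(:,:)
end module mod_sizes

! External subroutine with explicit-shape dummies, as in INFO_EXCH_GPU.
subroutine gather_mask(mask, nag_evg)
  use mod_sizes
  implicit none
  real    :: mask(n_ctrdp1)
  integer :: nag_evg(nmax, n_ctrd_evg)
  integer :: i, j

  do concurrent (j = 1:nmax, i = 1:n_ctrd_evg)
    mask4var(j, i) = mask(nag_evg(j, i))
  end do
end subroutine gather_mask

program repro
  use mod_sizes
  implicit none
  real, allocatable    :: mask(:)
  integer, allocatable :: nag(:,:)
  integer :: i, j

  n_ctrd_evg = 100
  n_ctrdp1   = 101
  allocate(mask(n_ctrdp1), nag(nmax, n_ctrd_evg))
  allocate(mask4var(nmax, n_ctrd_evg))

  ! mask(k) = k, so a gathered value equals the index it came from.
  do i = 1, n_ctrdp1
    mask(i) = real(i)
  end do
  do i = 1, n_ctrd_evg
    do j = 1, nmax
      nag(j, i) = mod(i + j, n_ctrdp1) + 1   ! always in 1..n_ctrdp1
    end do
  end do
  mask4var = 0.0

  ! Place the data on the device before the call, then fetch the result.
  !$omp target enter data map(to: mask, nag) map(alloc: mask4var)
  call gather_mask(mask, nag)
  !$omp target exit data map(from: mask4var) map(delete: mask, nag)

  print *, 'gather ok: ', all(mask4var == real(nag))
end program repro
```

Built without offload options, the program runs on the host and should report that the gather is correct, which gives a baseline to compare against the offloaded build.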

 

Detailed error report:

------ BEGIN NUMERICAL INTEGRATION ------
-> 1
-> 2
-> 3
-> 4
-> 4.UV
-> INFO_EXCH_GPU.0
-> INFO_EXCH_GPU.1
forrtl: severe (157): Program Exception - access violation
Image PC Routine Line Source
ze_intel_gpu64.dl 00007FFA3F008B4A Unknown Unknown Unknown
ze_intel_gpu64.dl 00007FFA3ED6ACA6 Unknown Unknown Unknown
ze_intel_gpu64.dl 00007FFA3ED6B260 Unknown Unknown Unknown
ze_intel_gpu64.dl 00007FFA3ECF3B91 Unknown Unknown Unknown
ze_loader.dll 00007FFA438A6D45 Unknown Unknown Unknown
omptarget.rtl.lev 00007FF9FA8446CC Unknown Unknown Unknown
omptarget.dll 00007FF9FAB88B9B Unknown Unknown Unknown
omptarget.dll 00007FF9FABA2A4E Unknown Unknown Unknown
omptarget.dll 00007FF9FAB8E1F5 Unknown Unknown Unknown
UFDE.exe 00007FF69255C0AC Unknown Unknown Unknown
UFDE.exe 00007FF69261DDFA Unknown Unknown Unknown
UFDE.exe 00007FF69269906B Unknown Unknown Unknown
UFDE.exe 00007FF692732AA0 Unknown Unknown Unknown
KERNEL32.DLL 00007FFB166EE8D7 Unknown Unknown Unknown
ntdll.dll 00007FFB1777BF6C Unknown Unknown Unknown


I think it's a bug. 

 

Thanks!

 
