I'm updating some of my long-standing code to take advantage of the latest OpenMP 4.0 syntax, but I've come unstuck at the first attempt.
I decided to start with a very simple, short subroutine that calculates a vector product. One of the vectors, 'normal', is unique and is represented by its scalar components 'normal_x', 'normal_y' and 'normal_z', whilst the other vector, 'rzero', is typically one of a large number of vectors represented by arrays of 'rzero_x', 'rzero_y' and 'rzero_z' components. Taking the vector product of the normal and 'rzero' gives 'es', and the magnitude of 'es' gives the sine of the angle between 'rzero' and the normal (assuming both are unit vectors), viz:
    elemental subroutine Calculate_Sines(normal_x, normal_y, normal_z, &
                                         rzero_x, rzero_y, rzero_z,    &
                                         es_x, es_y, es_z, sin_theta)
    !$omp declare simd(Calculate_Sines) uniform(normal_x, normal_y, normal_z)
        real(kind(1d0)), intent(in)  :: normal_x, normal_y, normal_z
        real(kind(1d0)), intent(in)  :: rzero_x, rzero_y, rzero_z
        real(kind(1d0)), intent(out) :: es_x, es_y, es_z, sin_theta

        es_x = normal_z * rzero_y - normal_y * rzero_z
        es_y = normal_x * rzero_z - normal_z * rzero_x
        es_z = normal_y * rzero_x - normal_x * rzero_y

        sin_theta = sqrt( es_x**2 + es_y**2 + es_z**2 )
    end subroutine Calculate_Sines
If I unit-test this by putting it into a program all on its own, then it works just as I'd expect. If I compile with /Qopt-report:2, I am told:
15301: FUNCTION WAS VECTORIZED
...just as I'd hope. However, if I now take exactly the same code snippet and paste it into the module where I actually want to use it (but as yet without having it actually invoked from anywhere else in the application), and then compile with exactly the same relevant switches (/O2 /Qparallel /Qopenm /standard-semantics /Qopt-report:2), I'm told:
warning #13397: vector function was not vectorized
warning #13401: vector function was emulated
The compiler persistently refuses to give me any clues as to what may be preventing it from vectorising the function, even if I bump /Qopt-report all the way up to 5.
So, what is going on here?
What version of the compiler are you using? From the Intel Fortran Compiler version 15.0 and onwards, it is sufficient to incorporate a SIMD-enabled procedure within a module. Then any calling routine that USE's the module has access to the interface and can see that a SIMD-enabled version of the function is available. If you are using 14.0, you'll need to upgrade for this to work.
>>>However, if I now take exactly the same code snippet and past it into the module where I actually want to use it (but as yet without having it actually invoked from anywhere else in the application), then compile with exactly the same relevant switches (/O2 /Qparallel /Qopenm
I did just that with 15.0.3. I'll assume /Qopenm is just a typo.
C:\ISN_Forums\U558125>ifort -c -Qopenmp U558125-module.f90 -Qopt-report=5 -Qopt-report-file=stdout -Qopt-report-phase=openmp,vec
Intel(R) Visual Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.3.208 Build 20150407
Copyright (C) 1985-2015 Intel Corporation. All rights reserved.
Begin optimization report for: SINES::CALCULATE_SINES.._simdsimd3__xmm2nuuuvvvvvvv.J
Report from: Vector optimizations [vec]
remark #15301: FUNCTION WAS VECTORIZED [ C:\ISN_Forums\U558125\U558125-module.f90(5,22) ]
===========================================================================
Begin optimization report for: SINES::CALCULATE_SINES.._simdsimd3__xmm2muuuvvvvvvv.J
Report from: Vector optimizations [vec]
remark #15301: FUNCTION WAS VECTORIZED [ C:\ISN_Forums\U558125\U558125-module.f90(5,22) ]
===========================================================================
C:\ISN_Forums\U558125>ifort -Qopenmp U558125-mod-use.f90 U558125-module.obj
Intel(R) Visual Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.3.208 Build 20150407
Copyright (C) 1985-2015 Intel Corporation. All rights reserved.
Microsoft (R) Incremental Linker Version 12.00.21005.1
Copyright (C) Microsoft Corporation. All rights reserved.
-out:U558125-mod-use.exe
-subsystem:console
-defaultlib:libiomp5md.lib
-nodefaultlib:vcomp.lib
-nodefaultlib:vcompd.lib
U558125-mod-use.obj
U558125-module.obj
C:\ISN_Forums\U558125>U558125-mod-use.exe
sin_theta= 0.000000000000000E+000
C:\ISN_Forums\U558125>
If the compiler version is not the issue, perhaps it's due to something in your module source inhibiting vectorization.
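The sources of U558125-module.f90 and U558125-mod-use.f90 are not shown in this thread. As a rough sketch of the arrangement described above, the following single file puts the elemental subroutine from the original post in a module and calls it from a loop marked with `!$omp simd`. The module name `sines` is taken from the optimization report; the program name, array length and test values are assumptions made purely for illustration.

```fortran
! Minimal sketch: a SIMD-enabled elemental subroutine in a module,
! called from an explicitly vectorized loop in a program that USEs it.
module sines
    implicit none
    integer, parameter :: dp = kind(1d0)
contains
    elemental subroutine Calculate_Sines(normal_x, normal_y, normal_z, &
                                         rzero_x, rzero_y, rzero_z,    &
                                         es_x, es_y, es_z, sin_theta)
    !$omp declare simd(Calculate_Sines) uniform(normal_x, normal_y, normal_z)
        real(dp), intent(in)  :: normal_x, normal_y, normal_z
        real(dp), intent(in)  :: rzero_x, rzero_y, rzero_z
        real(dp), intent(out) :: es_x, es_y, es_z, sin_theta

        es_x = normal_z * rzero_y - normal_y * rzero_z
        es_y = normal_x * rzero_z - normal_z * rzero_x
        es_z = normal_y * rzero_x - normal_x * rzero_y

        sin_theta = sqrt( es_x**2 + es_y**2 + es_z**2 )
    end subroutine Calculate_Sines
end module sines

program mod_use                      ! hypothetical driver, not the original
    use sines
    implicit none
    integer, parameter :: n = 8      ! assumed array length
    real(dp) :: rzero_x(n), rzero_y(n), rzero_z(n)
    real(dp) :: es_x(n), es_y(n), es_z(n), sin_theta(n)
    integer  :: i

    ! Every rzero is the unit vector along x; the normal is the unit
    ! vector along z, so each sine should come out as 1.
    rzero_x = 1.0_dp
    rzero_y = 0.0_dp
    rzero_z = 0.0_dp

    !$omp simd
    do i = 1, n
        call Calculate_Sines(0.0_dp, 0.0_dp, 1.0_dp,              &
                             rzero_x(i), rzero_y(i), rzero_z(i),  &
                             es_x(i), es_y(i), es_z(i), sin_theta(i))
    end do

    print *, 'sin_theta=', sin_theta(1)
end program mod_use
```

Because the caller sees the explicit interface through the USE statement, the compiler knows a SIMD variant exists and may call it from the vectorized loop. Whether it actually does so still depends on the compiler version and switches, so the optimization report remains the thing to check.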
Patrick
Thank you. I hadn't realised that there was so much additional information in the .optrpt file. The contents of that file shed a little more light on things. In my module, the subroutine (including comments) occupies lines 1769 - 1786. Here is the relevant part of the .optrpt file:
Begin optimization report for: WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4nuuuvvvvvvv.J
Report from: Interprocedural optimizations [ipo]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4nuuuvvvvvvv.J) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END
Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):remark #15301: FUNCTION WAS VECTORIZED
<compiler generated>:remark #15399: vectorization support: unroll factor set to 2
===========================================================================
Begin optimization report for: WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4nuuuvvvvvvv
Report from: Interprocedural optimizations [ipo]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4nuuuvvvvvvv) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END
===========================================================================
Begin optimization report for: WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4muuuvvvvvvv.J
Report from: Interprocedural optimizations [ipo]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4muuuvvvvvvv.J) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END
Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):remark #15326: function was not vectorized: implied FP exception model prevents vectorization. Consider changing compiler flags and/or directives in the source to enable fast FP model and to mask FP exceptions
C:\Users\..\FortranSource\Workspaces.F90(1769,26):remark #13397: vector function was not vectorized
===========================================================================
Begin optimization report for: WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4muuuvvvvvvv
Report from: Interprocedural optimizations [ipo]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4muuuvvvvvvv) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END
===========================================================================
Begin optimization report for: WORKSPACES::CALCULATE_SINES
Report from: Interprocedural optimizations [ipo]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END
===========================================================================
I assume that the function is being compiled in several different ways to accommodate the various ways it may be called (scalar, do-loop, index-range etc.), and that the compiler succeeds in vectorising it in some cases but not in others. In the case that fails, the report cites the floating-point exception model as the stumbling block. I was compiling with /fp:source and /fp:except, so I tried turning off exceptions (/fp:except-) and, sure enough, the vectorisation succeeded.
Refer to this new post for some important updated discussion on the treatment of the elemental subroutine in the original post for this thread by the 16.0 compiler.