Beginner
110 Views

## A question about $omp declare simd

I'm updating some of my long-standing code to take advantage of the latest OpenMP 4.0 syntax, but I've come unstuck at the first attempt. I decided to take a very simple, short subroutine that calculates a vector product: one of the vectors, 'normal', is unique and is represented by its scalar components 'normal_x', 'normal_y' and 'normal_z', whilst the other vector, 'rzero', is typically one of a large number of vectors represented by arrays of 'rzero_x', 'rzero_y' and 'rzero_z' components. Taking the vector product of the normal and 'rzero' gives 'es', and the magnitude of 'es' gives us the sine of the angle between 'rzero' and the normal, viz:

elemental subroutine Calculate_Sines(normal_x, normal_y, normal_z, &
rzero_x, rzero_y, rzero_z, &
es_x, es_y, es_z, sin_theta)
!$omp declare simd(Calculate_Sines) uniform(normal_x, normal_y, normal_z)
real(kind(1d0)), intent(in) :: normal_x, normal_y, normal_z
real(kind(1d0)), intent(in) :: rzero_x, rzero_y, rzero_z
real(kind(1d0)), intent(out) :: es_x, es_y, es_z, sin_theta

es_x = normal_z * rzero_y - normal_y * rzero_z
es_y = normal_x * rzero_z - normal_z * rzero_x
es_z = normal_y * rzero_x - normal_x * rzero_y

sin_theta = sqrt( es_x**2 + es_y**2 + es_z**2 )

end subroutine Calculate_Sines 

If I unit-test this by putting it into a program all on its own, then it works just as I'd expect: if I compile with /Qopt-report:2, then I am told:

    15301: FUNCTION WAS VECTORIZED

...just as I'd hope. However, if I now take exactly the same code snippet and paste it into the module where I actually want to use it (but as yet without having it actually invoked from anywhere else in the application), then compile with exactly the same relevant switches (/O2 /Qparallel /Qopenm /standard-semantics /Qopt-report:2) then I'm told:

    warning #13397: vector function was not vectorized
    warning #13401: vector function was emulated

The compiler persistently refuses to give me any clues as to what may be preventing it from vectorising the function, even if I bump Qopt-report all the way up to 5.

So, what is going on here?

3 Replies
Employee

What version of the compiler are you using? From Intel Fortran Compiler version 15.0 onwards, it is sufficient to place a SIMD-enabled procedure within a module. Any calling routine that USEs the module then has access to the interface and can see that a SIMD-enabled version of the procedure is available. If you are using 14.0, you'll need to upgrade for this to work.
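As a rough illustration of that arrangement, here is a minimal sketch of a module holding the SIMD-enabled subroutine and a small program that USEs it. The module name `sines`, the program name `use_sines`, the kind parameter `dp`, and the test vectors are all assumptions made for this example; they are not the contents of the test files compiled further down.

```fortran
! Sketch: a SIMD-enabled elemental subroutine in a module, plus a caller
! that USEs it. Module/program names and test data are illustrative only.
module sines
  implicit none
  integer, parameter :: dp = kind(1d0)
contains
  elemental subroutine Calculate_Sines(normal_x, normal_y, normal_z, &
                                       rzero_x, rzero_y, rzero_z,    &
                                       es_x, es_y, es_z, sin_theta)
    !$omp declare simd(Calculate_Sines) uniform(normal_x, normal_y, normal_z)
    real(dp), intent(in)  :: normal_x, normal_y, normal_z
    real(dp), intent(in)  :: rzero_x, rzero_y, rzero_z
    real(dp), intent(out) :: es_x, es_y, es_z, sin_theta

    es_x = normal_z * rzero_y - normal_y * rzero_z
    es_y = normal_x * rzero_z - normal_z * rzero_x
    es_z = normal_y * rzero_x - normal_x * rzero_y
    sin_theta = sqrt(es_x**2 + es_y**2 + es_z**2)
  end subroutine Calculate_Sines
end module sines

program use_sines
  use sines, only: dp, Calculate_Sines
  implicit none
  integer, parameter :: n = 4
  real(dp) :: rx(n), ry(n), rz(n), ex(n), ey(n), ez(n), s(n)
  integer :: i

  ! Unit vectors tested against the unit normal (0, 0, 1):
  ! parallel, perpendicular, perpendicular, antiparallel.
  rx = [0.0_dp, 1.0_dp, 0.0_dp,  0.0_dp]
  ry = [0.0_dp, 0.0_dp, 1.0_dp,  0.0_dp]
  rz = [1.0_dp, 0.0_dp, 0.0_dp, -1.0_dp]

  ! The explicit interface from the module lets the compiler see the
  ! declare simd directive when it vectorizes this loop.
  !$omp simd
  do i = 1, n
    call Calculate_Sines(0.0_dp, 0.0_dp, 1.0_dp, rx(i), ry(i), rz(i), &
                         ex(i), ey(i), ez(i), s(i))
  end do

  print '(a, 4f8.4)', 'sin_theta =', s
end program use_sines
```

With these inputs the parallel and antiparallel vectors should give a sine of 0 and the two perpendicular ones a sine of 1. Whether the loop actually calls the vector variant is something the optimization report has to confirm.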

>>>However, if I now take exactly the same code snippet and paste it into the module where I actually want to use it (but as yet without having it actually invoked from anywhere else in the application), then compile with exactly the same relevant switches (/O2 /Qparallel /Qopenm

I did just that with 15.0.3.  I'll assume /Qopenm is just a typo.

C:\ISN_Forums\U558125>ifort -c -Qopenmp U558125-module.f90 -Qopt-report=5 -Qopt-report-file=stdout -Qopt-report-phase=openmp,vec
Intel(R) Visual Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.3.208 Build 20150407

Begin optimization report for: SINES::CALCULATE_SINES.._simdsimd3__xmm2nuuuvvvvvvv.J

Report from: Vector optimizations [vec]

remark #15301: FUNCTION WAS VECTORIZED   [ C:\ISN_Forums\U558125\U558125-module.f90(5,22) ]
===========================================================================

Begin optimization report for: SINES::CALCULATE_SINES.._simdsimd3__xmm2muuuvvvvvvv.J

Report from: Vector optimizations [vec]

remark #15301: FUNCTION WAS VECTORIZED   [ C:\ISN_Forums\U558125\U558125-module.f90(5,22) ]
===========================================================================

C:\ISN_Forums\U558125>ifort  -Qopenmp U558125-mod-use.f90  U558125-module.obj
Intel(R) Visual Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.3.208 Build 20150407

Microsoft (R) Incremental Linker Version 12.00.21005.1

-out:U558125-mod-use.exe
-subsystem:console
-defaultlib:libiomp5md.lib
-nodefaultlib:vcomp.lib
-nodefaultlib:vcompd.lib
U558125-mod-use.obj
U558125-module.obj

C:\ISN_Forums\U558125>U558125-mod-use.exe
sin_theta=  0.000000000000000E+000

C:\ISN_Forums\U558125>

If the compiler version is not the issue, perhaps it's due to something in your module source inhibiting vectorization.

Patrick

Beginner

Thank you. I hadn't realised that there was so much additional information in the .optrpt file. The contents of that file shed a little more light on things. In my module, the subroutine (including comments) occupies lines 1769 - 1786. Here is the relevant part of the .optrpt file:

Begin optimization report for: WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4nuuuvvvvvvv.J

Report from: Interprocedural optimizations [ipo]

C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4nuuuvvvvvvv.J) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END

Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

C:\Users\..\FortranSource\Workspaces.F90(1769,26):remark #15301: FUNCTION WAS VECTORIZED
<compiler generated>:remark #15399: vectorization support: unroll factor set to 2
===========================================================================

Begin optimization report for: WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4nuuuvvvvvvv

Report from: Interprocedural optimizations [ipo]

C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4nuuuvvvvvvv) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END
===========================================================================

Begin optimization report for: WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4muuuvvvvvvv.J

Report from: Interprocedural optimizations [ipo]

C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4muuuvvvvvvv.J) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END

Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

C:\Users\..\FortranSource\Workspaces.F90(1769,26):remark #15326: function was not vectorized: implied FP exception model prevents vectorization. Consider changing compiler flags and/or directives in the source to enable fast FP model and to mask FP exceptions
C:\Users\..\FortranSource\Workspaces.F90(1769,26):remark #13397: vector function was not vectorized
===========================================================================

Begin optimization report for: WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4muuuvvvvvvv

Report from: Interprocedural optimizations [ipo]

C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES.._simdsimd3__xmm4muuuvvvvvvv) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END
===========================================================================

Begin optimization report for: WORKSPACES::CALCULATE_SINES

Report from: Interprocedural optimizations [ipo]

C:\Users\..\FortranSource\Workspaces.F90(1769,26):INLINE REPORT START:(WORKSPACES::CALCULATE_SINES) [20/31=64.5%]
C:\Users\..\FortranSource\Workspaces.F90(1769,26):(1)-> EXTERN: ___for_ieee_store_env_
C:\Users\..\FortranSource\Workspaces.F90(1786,5):(1)-> EXTERN: ___for_ieee_restore_env_
INLINE REPORT END
===========================================================================

I assume that the function is being compiled in several variants to accommodate the various ways it may be called (scalar, do-loop, index-range etc.) and that the compiler succeeds in vectorising some of them but not others. For the variant that fails, the report cites the floating-point exception model as the stumbling block. I was compiling with /fp:source and /fp:except, so I tried turning off exceptions (/fp:except-) and, sure enough, the vectorisation succeeded.
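One hedged reading of the report above: the variant names differ only in an `n` or an `m` after `xmm4`, which appears to follow the vector-function naming convention in which `n` marks an unmasked variant and `m` a masked one (for calls made under a condition inside a SIMD loop), and it is the `m` variant that fails. If that reading is right, and if the routine is never called conditionally, the OpenMP 4.0 `notinbranch` clause is a source-level alternative that might be worth trying. The sketch below is untested against the Intel compiler; the module name `sines_notinbranch`, the program name, and the test data are made up for illustration.

```fortran
! Sketch: same routine, declared with notinbranch. Names and data are
! hypothetical; whether this removes the warning is an assumption.
module sines_notinbranch
  implicit none
  integer, parameter :: dp = kind(1d0)
contains
  elemental subroutine Calculate_Sines(normal_x, normal_y, normal_z, &
                                       rzero_x, rzero_y, rzero_z,    &
                                       es_x, es_y, es_z, sin_theta)
    ! notinbranch: a promise that calls from SIMD loops are never made
    ! inside a conditional, so no masked variant should be needed.
    !$omp declare simd(Calculate_Sines) uniform(normal_x, normal_y, normal_z) notinbranch
    real(dp), intent(in)  :: normal_x, normal_y, normal_z
    real(dp), intent(in)  :: rzero_x, rzero_y, rzero_z
    real(dp), intent(out) :: es_x, es_y, es_z, sin_theta

    es_x = normal_z * rzero_y - normal_y * rzero_z
    es_y = normal_x * rzero_z - normal_z * rzero_x
    es_z = normal_y * rzero_x - normal_x * rzero_y
    sin_theta = sqrt(es_x**2 + es_y**2 + es_z**2)
  end subroutine Calculate_Sines
end module sines_notinbranch

program check_notinbranch
  use sines_notinbranch, only: dp, Calculate_Sines
  implicit none
  integer, parameter :: n = 4
  real(dp) :: rx(n), ry(n), rz(n), ex(n), ey(n), ez(n), s(n)
  real(dp), parameter :: expected(n) = [0.0_dp, 1.0_dp, 1.0_dp, 0.0_dp]
  integer :: i

  ! Unit vectors tested against the unit normal (0, 0, 1)
  rx = [0.0_dp, 1.0_dp, 0.0_dp,  0.0_dp]
  ry = [0.0_dp, 0.0_dp, 1.0_dp,  0.0_dp]
  rz = [1.0_dp, 0.0_dp, 0.0_dp, -1.0_dp]

  ! Unconditional call: consistent with the notinbranch promise.
  !$omp simd
  do i = 1, n
    call Calculate_Sines(0.0_dp, 0.0_dp, 1.0_dp, rx(i), ry(i), rz(i), &
                         ex(i), ey(i), ez(i), s(i))
  end do

  ! Self-check so that a wrong result does not pass silently.
  if (any(abs(s - expected) > 1.0e-12_dp)) error stop 'unexpected sin_theta'
  print '(a, 4f8.4)', 'sin_theta =', s
end program check_notinbranch
```

The trade-off is that `notinbranch` restricts how the routine may be called from SIMD loops, whereas /fp:except- changes the floating-point exception behaviour of everything compiled with it.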

Employee

Refer to this new post for an important update on how the 16.0 compiler treats the elemental subroutine from the original post in this thread.