evaluate the code here please

aketh_t_ · ‎06-19-2015

hi all

I have a question to ask here

Which of the two code versions below vectorizes better?

do i=1,n
    if (LMASK(i)) then
        a(i) = b(i)*c(i)
    endif
enddo

integer LMASKtemp = abs(LMASK)

do i=1,n
    
    a(i) = a(i)*(1-LMASKtemp) + LMASKtemp * b(i)*c(i) 
       
enddo

jimdempseyatthecove · ‎06-19-2015

Try: WHERE(LMASK) A=B*C
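
For reference, a minimal self-contained sketch of how the WHERE suggestion could be tried out. The program name, array size, and sample values are made up for illustration and are not from the original question; LMASK is assumed to be a logical array of the same shape as A, B, and C.

```fortran
program where_demo
  implicit none
  integer, parameter :: n = 8          ! hypothetical size, for illustration only
  real    :: a(n), b(n), c(n)
  logical :: lmask(n)
  integer :: i

  a = -1.0                              ! sentinel so untouched elements are easy to spot
  b = [(real(i), i = 1, n)]
  c = 2.0
  lmask = [(mod(i, 2) == 0, i = 1, n)]  ! .true. for even i

  ! Masked array assignment: a(i) is overwritten only where lmask(i) is .true.
  where (lmask) a = b * c

  print *, a
end program where_demo
```

Elements with a .false. mask keep their previous value, which matches the IF version of the loop.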

Jim Dempsey

TimP · ‎06-19-2015

First, the 2 cases aren't nearly equivalent. A loop-invariant conditional like the one in your second version doesn't need any major effort, let alone non-portable stuff, to optimize. So I'll discuss the case where the condition is not loop invariant.

As you wrote it, the first case would need a simd or vector always directive to vectorize, which gives the compiler permission to dispense with the ability to capture floating-point exceptions. Without the directive, the vectorizer would decline on the grounds of "protects exception."
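
A rough sketch of what the directive version might look like, assuming the OpenMP 4.0 `!$omp simd` form (which typically needs an option such as -qopenmp-simd or -fopenmp-simd to take effect). The program and subroutine names and the test data are hypothetical. With Intel Fortran, `!DIR$ VECTOR ALWAYS` placed before the loop would be the compiler-specific alternative.

```fortran
program simd_demo
  implicit none
  integer, parameter :: n = 8          ! hypothetical size, for illustration only
  real    :: a(n), b(n), c(n)
  logical :: lmask(n)
  integer :: i

  a = -1.0
  b = [(real(i), i = 1, n)]
  c = 2.0
  lmask = [(mod(i, 2) == 0, i = 1, n)]

  call masked_mul(n, lmask, a, b, c)
  print *, a

contains

  subroutine masked_mul(n, lmask, a, b, c)
    integer, intent(in)    :: n
    logical, intent(in)    :: lmask(n)
    real,    intent(inout) :: a(n)
    real,    intent(in)    :: b(n), c(n)
    integer :: i

    ! Ask the compiler to vectorize the masked loop.
    ! Without OpenMP SIMD support enabled, this line is just a comment.
    !$omp simd
    do i = 1, n
      if (lmask(i)) then
        a(i) = b(i) * c(i)
      end if
    end do
  end subroutine masked_mul

end program simd_demo
```

Whether the loop actually vectorizes is best confirmed from the compiler's vectorization report rather than assumed.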

Your second case is reminiscent of Fortran 77, where one might write an expression using the SIGN intrinsic to calculate your LMASKtemp, if it were not loop invariant. It should not require directives for vectorization even in the non-invariant case.

a = merge(0., a, LMASK) + merge( b, 0., LMASK) * c

might be preferable in my opinion, as it should vectorize with various known compilers, without a directive, and takes full advantage of AVX2 instructions. Of course, there are cases involving non-finite operands where it's not equivalent to your version. For example, if c had an infinity corresponding with a .false. mask, the result would be NaN. That's the price paid for writing a nearly equivalent version which avoids "protects exception."
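
The infinity caveat can be checked with a small sketch like the following. The two-element arrays and their values are invented for the demonstration, and it assumes default IEEE arithmetic (aggressive fast-math options may change the outcome).

```fortran
program merge_nan_demo
  use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_positive_inf, ieee_is_nan
  implicit none
  real    :: a(2), b(2), c(2)
  logical :: lmask(2)

  a = [1.0, 1.0]
  b = [2.0, 2.0]
  c = [3.0, 0.0]
  c(2) = ieee_value(c(2), ieee_positive_inf)  ! infinity under a .false. mask
  lmask = [.true., .false.]

  ! Branch-free form: each element evaluates both merge calls and the multiply
  a = merge(0., a, lmask) + merge(b, 0., lmask) * c

  print *, a(1)               ! masked-in element: 0 + b*c
  print *, ieee_is_nan(a(2))  ! masked-out element: a + 0*Inf, so NaN is expected
end program merge_nan_demo
```

The IF and WHERE forms would leave a(2) unchanged here, which is the difference in behavior being described.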

The methods involving lots of arithmetic may be as efficient as the others on Intel instruction sets prior to the introduction of blend.