Re: Implementation of ANY()

Dishaw__Jim · 08-20-2006

I have the following elemental function

! Compute symmetric relative difference between two values
ELEMENTAL FUNCTION SRD(x,y)
  REAL(dp), INTENT(IN) :: x
  REAL(dp), INTENT(IN) :: y
  REAL(dp) :: SRD

  IF (x == 0._dp .AND. y == 0._dp) THEN
    SRD = 0._dp
  ELSE
    SRD = ABS(x-y) / ((ABS(x) + ABS(y)) / 2._dp)
  ENDIF
END FUNCTION SRD

and I have the following code

IF (ANY(SRD(previous,current) .GT. tol)) THEN
  ...
END IF

If current and previous are arrays, it appears that the Intel compiler executes SRD over the entire array and then performs ANY. I was hoping that since SRD is elemental, the compiler would not call SRD for every element when used in conjunction with ANY. Can the Intel compiler optimize this operation such that it only evaluates SRD until a pair is greater than tol? If so, what compiler option causes that to happen?

TimP · 08-20-2006

If you know that a sequential search, taking the array elements in an order known to you, but not to the compiler, will accomplish the job faster, you should use a DO loop. The syntax of ANY() encourages the compiler to parallelize, taking values in an order which evaluates the entire array as quickly as possible.
If you are looking for efficiency, and don't have a requirement to perform division, that is the first thing you should change.
In principle, it might be possible to code in such a way that the search is done in parallel batches, stopping the loop as soon as a positive is found in one of the batches. Compiler optimizers typically aren't suited to that sort of organization.
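
The DO-loop approach suggested above might look roughly like the following. This is only a minimal sketch: the module name srd_search, the function name any_srd_exceeds, the definition of dp as double precision, and the restriction to rank-1 arrays are all assumptions, since the original post does not show them.

```fortran
MODULE srd_search
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0d0)   ! assumed definition of dp
CONTAINS

  ! Symmetric relative difference, as in the original post
  ELEMENTAL FUNCTION SRD(x, y)
    REAL(dp), INTENT(IN) :: x, y
    REAL(dp) :: SRD
    IF (x == 0._dp .AND. y == 0._dp) THEN
      SRD = 0._dp
    ELSE
      SRD = ABS(x - y) / ((ABS(x) + ABS(y)) / 2._dp)
    END IF
  END FUNCTION SRD

  ! Hypothetical helper: returns .TRUE. as soon as one pair exceeds tol,
  ! so later pairs are never evaluated.
  FUNCTION any_srd_exceeds(previous, current, tol) RESULT(found)
    REAL(dp), INTENT(IN) :: previous(:), current(:)
    REAL(dp), INTENT(IN) :: tol
    LOGICAL :: found
    INTEGER :: i

    found = .FALSE.
    DO i = 1, MIN(SIZE(previous), SIZE(current))
      IF (SRD(previous(i), current(i)) > tol) THEN
        found = .TRUE.
        EXIT                      ! stop at the first pair out of tolerance
      END IF
    END DO
  END FUNCTION any_srd_exceeds

END MODULE srd_search
```

The caller would then write IF (any_srd_exceeds(previous, current, tol)) THEN in place of the ANY(...) test. The loop visits elements in index order; if you know a better order for your data, that is the place to change it.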

grg99 · 08-21-2006

You could eliminate half the expensive divisions by factoring out the division by two, i.e.

change the function to not divide the denominator by two (it then returns half of SRD),
then change the outer IF to: if( ... .GT. TOL * 0.5 )

(I'm assuming the compiler will evaluate TOL * 0.5 just once; if not, factor that out too.)
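
Taking this one step further, the remaining division can also be removed by multiplying both sides of the comparison by the denominator. Since SRD = 2*ABS(x-y)/(ABS(x)+ABS(y)), the test SRD > tol is equivalent to ABS(x-y) > 0.5*tol*(ABS(x)+ABS(y)) whenever the denominator is positive. A possible sketch follows; the names srd_nodiv, srd_exceeds and half_tol are hypothetical, dp is assumed to be double precision, and tol is assumed to be non-negative. Results may differ from the divided form in the last bit when a pair sits exactly on the threshold.

```fortran
MODULE srd_nodiv
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0d0)   ! assumed definition of dp
CONTAINS

  ! Hypothetical division-free test: .TRUE. when the symmetric relative
  ! difference of x and y exceeds 2*half_tol.
  ! When x and y are both zero this evaluates 0 > 0, which is .FALSE.,
  ! so no separate zero check is needed (for non-negative tolerances).
  ELEMENTAL FUNCTION srd_exceeds(x, y, half_tol) RESULT(exceeds)
    REAL(dp), INTENT(IN) :: x, y, half_tol
    LOGICAL :: exceeds
    exceeds = ABS(x - y) > half_tol * (ABS(x) + ABS(y))
  END FUNCTION srd_exceeds

END MODULE srd_nodiv
```

Usage would be along the lines of IF (ANY(srd_exceeds(previous, current, 0.5_dp*tol))) THEN, with the halved tolerance computed once by the caller.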

Also, there's no need to check for the numerator being zero; if there are few zeroes, it will be faster to just skip this check.

And as others have noted, it will be a LOT faster if you just do the loop yourself and stop as needed.

Also, if there's any array trend carrying over from call to call, it might be hugely faster to first check the x,y element that met the criteria on the previous call (and/or the surrounding elements). Use every bit of info you have about your data!

jimdempseyatthecove · 08-21-2006

I agree with grg99. Create a logical function that takes in the two arrays and the tolerance. Remove the division in the loop. If the arrays are large, consider optimizing for OpenMP. Use the OpenMP FLUSH on a shared return variable to cause early termination of the search loop. Also consider grg99's suggestion about knowledge of the data. Sequential search might not be the best solution for certain known conditions.
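
One way the OpenMP idea could be sketched is shown below. It is not a definitive implementation: it swaps in ATOMIC READ and ATOMIC WRITE (OpenMP 3.1 and later) on a shared integer flag instead of explicit FLUSH directives, and the names srd_omp, any_srd_exceeds_omp and flag are hypothetical. dp is assumed to be double precision, tol is assumed non-negative, and the division-free form of the test is used. An OpenMP worksharing loop cannot be left with EXIT, so once the flag is set the remaining iterations simply CYCLE.

```fortran
MODULE srd_omp
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0d0)   ! assumed definition of dp
CONTAINS

  ! Hypothetical parallel search with cheap early termination.
  FUNCTION any_srd_exceeds_omp(previous, current, tol) RESULT(found)
    REAL(dp), INTENT(IN) :: previous(:), current(:)
    REAL(dp), INTENT(IN) :: tol
    LOGICAL :: found
    REAL(dp) :: half_tol
    INTEGER :: i, n, flag, seen

    n = MIN(SIZE(previous), SIZE(current))
    half_tol = 0.5_dp * tol
    flag = 0                               ! shared "found" marker

    !$OMP PARALLEL DO DEFAULT(NONE) SCHEDULE(STATIC) &
    !$OMP& SHARED(previous, current, half_tol, flag, n) PRIVATE(i, seen)
    DO i = 1, n
      !$OMP ATOMIC READ
      seen = flag
      IF (seen /= 0) CYCLE                 ! another iteration already found a hit
      IF (ABS(previous(i) - current(i)) > &
          half_tol * (ABS(previous(i)) + ABS(current(i)))) THEN
        !$OMP ATOMIC WRITE
        flag = 1
      END IF
    END DO
    !$OMP END PARALLEL DO

    found = (flag /= 0)
  END FUNCTION any_srd_exceeds_omp

END MODULE srd_omp
```

Compiled without OpenMP, the directives are treated as comments and the loop runs serially with the same result. Whether the parallel version pays off depends on the array size and on how early a hit typically occurs, so it is worth timing against the plain DO loop.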

Example: assume your application uses the ANY(SRD... test in a feedback situation whereby you make state corrections. If, after a cell is brought into tolerance, it tends to stay in tolerance, you might want to consider a ring buffer test scheme, i.e. you start the search after the last remembered out-of-tolerance cell. This will tend to find the next out-of-tolerance cell quicker.
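
A rough sketch of such a ring-style search is given below, under several assumptions: the names srd_ring, any_srd_exceeds_from and last_hit are hypothetical, dp is taken to be double precision, tol is non-negative, and the caller keeps last_hit between calls (initialized to 0). The search starts just after the last remembered hit and wraps around, so every element is still examined once if nothing is out of tolerance.

```fortran
MODULE srd_ring
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0d0)   ! assumed definition of dp
CONTAINS

  ! Hypothetical ring search: begins after index last_hit, wraps around,
  ! and updates last_hit when an out-of-tolerance pair is found.
  FUNCTION any_srd_exceeds_from(previous, current, tol, last_hit) RESULT(found)
    REAL(dp), INTENT(IN) :: previous(:), current(:)
    REAL(dp), INTENT(IN) :: tol
    INTEGER, INTENT(INOUT) :: last_hit      ! 0 means "no hit remembered"
    LOGICAL :: found
    REAL(dp) :: half_tol
    INTEGER :: i, k, n

    n = MIN(SIZE(previous), SIZE(current))
    half_tol = 0.5_dp * tol
    found = .FALSE.
    IF (n == 0) RETURN
    IF (last_hit < 0 .OR. last_hit > n) last_hit = 0

    DO k = 0, n - 1
      i = MODULO(last_hit + k, n) + 1       ! last_hit+1, ..., n, 1, ..., last_hit
      IF (ABS(previous(i) - current(i)) > &
          half_tol * (ABS(previous(i)) + ABS(current(i)))) THEN
        found = .TRUE.
        last_hit = i                        ! remember for the next call
        EXIT
      END IF
    END DO
  END FUNCTION any_srd_exceeds_from

END MODULE srd_ring
```

If the previously failing cell is more likely to fail again than to stay fixed, starting at last_hit itself rather than after it would be the better choice; that is a one-line change to the index formula.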

Something for you to think about.

Jim Dempsey