Intel® Fortran Compiler

Implementation of ANY()

Dishaw__Jim
Beginner
I have the following elemental function:
! Compute symmetric relative difference between two values
ELEMENTAL FUNCTION SRD(x,y)
  REAL(dp), INTENT(IN) :: x
  REAL(dp), INTENT(IN) :: y
  REAL(dp) :: SRD

  IF (x == 0._dp .AND. y == 0._dp) THEN
    SRD = 0._dp
  ELSE
    SRD = ABS(x-y) / ((ABS(x) + ABS(y)) / 2._dp)
  ENDIF
END FUNCTION SRD
and I have the following code:
IF (ANY(SRD(previous,current) .GT. tol)) THEN
...
END IF
If current and previous are arrays, it appears that the Intel compiler executes SRD over the entire array and then performs ANY. I was hoping that since SRD is elemental, the compiler would not call SRD for every element when used in conjunction with ANY. Can the Intel compiler optimize this operation such that it only evaluates SRD until a pair is greater than tol? If so, what compiler option causes that to happen?
TimP
Honored Contributor III
If you know that a sequential search, taking the array elements in an order known to you, but not to the compiler, will accomplish the job faster, you should use a DO loop. The syntax of ANY() encourages the compiler to parallelize, taking values in an order which evaluates the entire array as quickly as possible.
If you are looking for efficiency and have no requirement to actually perform the division, removing it is the first thing you should change.
In principle, it might be possible to code in such a way that the search is done in parallel batches, stopping the loop as soon as a positive is found in one of the batches. Compiler optimizers typically aren't suited to that sort of organization.
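A minimal sketch of the explicit DO loop suggested above, which stops at the first pair whose SRD exceeds tol. The module name srd_search and the function name any_srd_exceeds are made up for illustration, and dp is assumed to be double precision since the original post does not show its definition.

```fortran
module srd_search
  implicit none
  ! Assumption: dp is double precision (not shown in the original post).
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Symmetric relative difference, as posted in the question.
  elemental function srd(x, y)
    real(dp), intent(in) :: x
    real(dp), intent(in) :: y
    real(dp) :: srd
    if (x == 0._dp .and. y == 0._dp) then
      srd = 0._dp
    else
      srd = abs(x - y) / ((abs(x) + abs(y)) / 2._dp)
    end if
  end function srd

  ! Hypothetical helper: .true. as soon as one pair exceeds tol.
  ! Pairs after the first hit are never evaluated.
  logical function any_srd_exceeds(previous, current, tol) result(found)
    real(dp), intent(in) :: previous(:), current(:)
    real(dp), intent(in) :: tol
    integer :: i
    found = .false.
    do i = 1, min(size(previous), size(current))
      if (srd(previous(i), current(i)) > tol) then
        found = .true.
        exit
      end if
    end do
  end function any_srd_exceeds

end module srd_search
```

The original test would then read IF (any_srd_exceeds(previous, current, tol)) THEN. Whether this beats ANY() depends on how early a hit typically occurs; when no pair exceeds tol, the whole array is still scanned.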
grg99
Beginner
You could eliminate half of the expensive divisions by factoring out the division by two, i.e.

change the function so that it does not divide the denominator by two (it then returns half of the original SRD),
then change the outer IF to: if( ... .GT. TOL / 2.0 )

(I'm assuming the compiler will evaluate TOL / 2.0 just once; if not, factor that out too.)
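Taking the same idea one step further, the remaining division can be dropped by multiplying both sides of the comparison by the denominator: 2*ABS(x-y)/(ABS(x)+ABS(y)) > tol is equivalent to ABS(x-y) > 0.5*tol*(ABS(x)+ABS(y)) whenever the denominator is positive. The sketch below assumes tol >= 0, in which case the x = y = 0 case also comes out .false. without a separate check. The names srd_nodiv and srd_exceeds are hypothetical, and dp is again assumed to be double precision.

```fortran
module srd_nodiv
  implicit none
  ! Assumption: dp is double precision (not shown in the original post).
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Hypothetical division-free form of "SRD(x,y) > tol".
  ! Assumes tol >= 0; for x == y == 0 both sides are zero, so the
  ! result is .false., matching SRD = 0 in the original function.
  elemental logical function srd_exceeds(x, y, tol)
    real(dp), intent(in) :: x, y, tol
    srd_exceeds = abs(x - y) > 0.5_dp * tol * (abs(x) + abs(y))
  end function srd_exceeds

end module srd_nodiv
```

Because the function is elemental it can still be used as ANY(srd_exceeds(previous, current, tol)), or called pair by pair inside a loop. Results for pairs that sit right at the threshold may differ from the divided form in the last bit because the rounding is different.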

Also, there's no need to check for the numerator being zero; if there are few zeroes, it will be faster to just not do this check.

And as others have noted, it will be a LOT faster if you just do the loop yourself and stop as needed.

Also, if there's any array trend carrying over from call to call, it might be hugely faster to first check the x,y element that met the criteria on the previous call (and/or the surrounding elements). Use every bit of info you have about your data!

jimdempseyatthecove
Honored Contributor III

I agree with grg99. Create a logical function that takes in the two arrays and the tolerance. Remove the division in the loop. If the arrays are large, consider optimizing for OpenMP. Use the OpenMP FLUSH on a shared return variable to cause early termination of the search loop. Also consider grg99's suggestion about knowledge of the data. Sequential search might not be the best solution for certain known conditions.
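One way the OpenMP variant might look is sketched below. It is not the FLUSH-based code described above: it swaps in OpenMP atomic read/write on a shared integer flag, which serves the same purpose of letting every thread see that a hit has been found. The names srd_omp and any_exceeds_omp are invented for this example, dp is assumed to be double precision, and tol is assumed to be non-negative so the division-free comparison can be used.

```fortran
module srd_omp
  implicit none
  ! Assumption: dp is double precision (not shown in the original post).
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Hypothetical parallel search with a shared "hit" flag.
  logical function any_exceeds_omp(previous, current, tol) result(found)
    real(dp), intent(in) :: previous(:), current(:)
    real(dp), intent(in) :: tol
    integer :: i, n, hit, seen

    n = min(size(previous), size(current))
    hit = 0   ! shared flag: becomes 1 once any thread finds a hit

    !$omp parallel do private(seen) schedule(static)
    do i = 1, n
      ! Branching out of an OpenMP loop with EXIT is not allowed, so the
      ! remaining iterations are skipped cheaply with CYCLE instead.
      !$omp atomic read
      seen = hit
      if (seen /= 0) cycle
      if (abs(previous(i) - current(i)) > &
          0.5_dp * tol * (abs(previous(i)) + abs(current(i)))) then
        !$omp atomic write
        hit = 1
      end if
    end do
    !$omp end parallel do

    found = (hit /= 0)
  end function any_exceeds_omp

end module srd_omp
```

Built without an OpenMP option the directives are treated as comments and the loop runs serially with the same result; with Intel Fortran OpenMP is enabled by -qopenmp (/Qopenmp on Windows). Reading the flag on every iteration has a cost of its own, so checking it only every so many iterations may be worth measuring.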

Example. Assume your application uses the ANY(SRD... test in a feedback situation whereby you make state corrections. If, after a cell is brought into tolerance, it tends to stay in tolerance, you might want to consider a ring buffer test scheme, i.e. you start the search at the last remembered out-of-tolerance cell. This will tend to find the next out-of-tolerance cell more quickly.
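The ring scheme could be sketched roughly as follows. The caller keeps an integer start between calls; the search begins there, wraps around the end of the array, and on a hit stores the index it stopped at. The names srd_ring, any_exceeds_from and start are hypothetical, dp is assumed to be double precision, and tol is assumed to be non-negative.

```fortran
module srd_ring
  implicit none
  ! Assumption: dp is double precision (not shown in the original post).
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Hypothetical ring search: begins at index "start", wraps around,
  ! and on a hit stores the hit index back into "start" for the next call.
  logical function any_exceeds_from(previous, current, tol, start) result(found)
    real(dp), intent(in) :: previous(:), current(:)
    real(dp), intent(in) :: tol
    integer, intent(inout) :: start
    integer :: i, k, n

    n = min(size(previous), size(current))
    found = .false.
    if (n == 0) return
    if (start < 1 .or. start > n) start = 1   ! guard against a stale index

    do k = 0, n - 1
      i = 1 + mod(start - 1 + k, n)           ! start, start+1, ..., n, 1, ...
      if (abs(previous(i) - current(i)) > &
          0.5_dp * tol * (abs(previous(i)) + abs(current(i)))) then
        found = .true.
        start = i                             ! remember where to begin next time
        return
      end if
    end do
  end function any_exceeds_from

end module srd_ring
```

If no cell is out of tolerance the full array is still visited once and start is left unchanged.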

Something for you to think about.

Jim Dempsey