Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

maxval and maxloc in sequence

mpiuser
Beginner
1,412 Views
Hi,

I am developing a program that currently calls maxval(x) and maxloc(x) one after the other on the same large array x. This is repeated many times in a loop, and each iteration depends on the previous one.

I am wondering if either of the following is possible:

(1) The compiler automatically (mpif90 -O3 -ipo ...) recognizes that it only needs to perform the search operations to find the maximum of x once and records both its value and location.

(2) There is some other function in the MKL or elsewhere that will do both of these tasks at once.

If (1) is not true then it seems I could cut my computation time by almost half if something like (2) exists!

many thanks,

Brant
4 Replies
JVanB
Valued Contributor II
Find maxloc(x) first. The optimal method for then finding maxval(x) is left as an exercise for the reader.
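
For readers who want the exercise spelled out, here is a minimal sketch of that idea: search once with maxloc, then read the value at the returned position. The module name `max_once` and the subroutine name `max_with_loc` are hypothetical, the array is assumed to be rank-1 default real, and NaN handling is not considered.

```fortran
module max_once
  implicit none
contains
  ! Return both the maximum of x and its position using a single search.
  ! (Hypothetical helper, not an intrinsic or MKL routine.)
  subroutine max_with_loc(x, xmax, imax)
    real, intent(in)     :: x(:)
    real, intent(out)    :: xmax
    integer, intent(out) :: imax

    ! One pass over the array: maxloc with dim=1 returns a scalar position.
    imax = maxloc(x, dim=1)

    if (imax > 0) then
      ! Indexing is O(1); no second search is needed.
      xmax = x(imax)
    else
      ! Zero-sized array: mirror what maxval returns in that case.
      xmax = -huge(xmax)
    end if
  end subroutine max_with_loc
end module max_once
```

A call such as `call max_with_loc(x, xmax, imax)` would then replace the maxval/maxloc pair. Inside the subroutine the assumed-shape dummy x has lower bound 1, so the position returned by maxloc can be used directly as a subscript.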
mpiuser
Beginner
duh ..... :)

Thanks!
mecej4
Honored Contributor III
More interesting questions

(a) finding both the minimum value and the maximum value

(b) finding both the minimum absolute value and the maximum absolute value

For an array of size N, either of these can be performed in about 3N/2 comparisons rather than the roughly 2N that would be used if each value were found by a separate search.
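
The 3N/2 figure comes from processing the array in pairs: compare the two elements of a pair with each other first, then compare only the smaller one against the running minimum and only the larger one against the running maximum, which costs 3 comparisons per 2 elements. A minimal sketch of case (a) follows, assuming a rank-1 default real array without NaNs. The names `minmax_pairs` and `minmax_pairwise` are hypothetical, and the `ncmp` counter is there only to make the comparison count visible.

```fortran
module minmax_pairs
  implicit none
contains
  ! Find the minimum and maximum of x with about 3N/2 element comparisons.
  ! ncmp returns the number of element comparisons actually performed.
  subroutine minmax_pairwise(x, xmin, xmax, ncmp)
    real, intent(in)     :: x(:)
    real, intent(out)    :: xmin, xmax
    integer, intent(out) :: ncmp
    integer :: i, n, start
    real    :: lo, hi

    n = size(x)
    ncmp = 0
    if (n == 0) then
      ! Same convention as minval/maxval on zero-sized arrays.
      xmin = huge(xmin)
      xmax = -huge(xmax)
      return
    end if

    if (mod(n, 2) == 1) then
      ! Odd length: seed both results with the first element.
      xmin = x(1)
      xmax = x(1)
      start = 2
    else
      ! Even length: seed with the first pair (one comparison).
      ncmp = ncmp + 1
      if (x(1) < x(2)) then
        xmin = x(1); xmax = x(2)
      else
        xmin = x(2); xmax = x(1)
      end if
      start = 3
    end if

    ! Remaining elements are consumed two at a time.
    do i = start, n - 1, 2
      ncmp = ncmp + 3
      if (x(i) < x(i+1)) then
        lo = x(i);   hi = x(i+1)
      else
        lo = x(i+1); hi = x(i)
      end if
      if (lo < xmin) xmin = lo   ! only the smaller can lower the minimum
      if (hi > xmax) xmax = hi   ! only the larger can raise the maximum
    end do
  end subroutine minmax_pairwise
end module minmax_pairs
```

Case (b) can reuse the same routine by passing `abs(x)`, at the cost of a temporary array. Whether the reduced comparison count translates into less run time depends on branch behavior and vectorization, so it is worth timing against plain minval and maxval.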
jimdempseyatthecove
Honored Contributor III
Quoting mecej4
More interesting questions

(a) finding both the minimum value and the maximum value

(b) finding both the minimum absolute value and the maximum absolute value

For an array of size N, either of these can be performed in about 3N/2 comparisons rather than the roughly 2N that would be used if each value were found by a separate search.


Depending on how well SSE/AVX is utilized, the number of comparison instructions can be reduced by a further factor of 2, 4, or 8, since each vector comparison handles that many elements at once.

The (minimum value + index) and (maximum value + index) case is a little more interesting in SSE. The same mask that is used on the values can also be used on a separate SSE/AVX register containing the current indices of those values.

Jim Dempsey
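
The mask idea can be illustrated at the Fortran source level, without intrinsics, by keeping a small array of per-lane running maxima and a matching array of per-lane indices, and updating both with the same logical mask. This is only a sketch under stated assumptions: the lane count `nl = 8` is a hypothetical choice, the names `lane_maxloc` and `maxval_maxloc_lanes` are made up for illustration, NaNs are not handled, and nothing guarantees that a given compiler turns the `merge` calls into SSE/AVX blend instructions.

```fortran
module lane_maxloc
  implicit none
  integer, parameter :: nl = 8   ! hypothetical lane count (e.g. 8 floats in AVX)
contains
  ! Find the maximum of x and the position of its first occurrence,
  ! using per-lane running maxima and per-lane indices.
  subroutine maxval_maxloc_lanes(x, xmax, imax)
    real, intent(in)     :: x(:)
    real, intent(out)    :: xmax
    integer, intent(out) :: imax
    real    :: vmax(nl)
    integer :: vidx(nl), cur(nl)
    logical :: mask(nl)
    integer :: i, k, n, nfull, first

    n = size(x)
    xmax = -huge(xmax)
    imax = 0
    if (n == 0) return

    nfull = (n / nl) * nl          ! part of the array covered by full blocks
    if (nfull > 0) then
      vmax = x(1:nl)               ! running maximum in each lane
      vidx = [(k, k = 1, nl)]      ! index of that maximum in each lane
      cur  = vidx                  ! indices of the block being examined
      do i = nl + 1, nfull, nl
        cur  = cur + nl
        mask = x(i:i+nl-1) > vmax  ! one comparison per lane
        vmax = merge(x(i:i+nl-1), vmax, mask)
        vidx = merge(cur, vidx, mask)   ! same mask applied to the indices
      end do
      ! Horizontal reduction across lanes; ties go to the smaller index.
      xmax = vmax(1)
      imax = vidx(1)
      do k = 2, nl
        if (vmax(k) > xmax .or. (vmax(k) == xmax .and. vidx(k) < imax)) then
          xmax = vmax(k)
          imax = vidx(k)
        end if
      end do
      first = nfull + 1
    else
      xmax = x(1)
      imax = 1
      first = 2
    end if

    ! Scalar clean-up for elements that did not fill a whole block.
    do i = first, n
      if (x(i) > xmax) then
        xmax = x(i)
        imax = i
      end if
    end do
  end subroutine maxval_maxloc_lanes
end module lane_maxloc
```

The strict `>` in the lane update and the index tie-break in the reduction are chosen so that the result matches maxloc's first-occurrence behavior. The minimum case is symmetric, with `<` in place of `>`.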
