max and min reduction and other omp 4 changes in 15.0 updates

TimP · 04-28-2015

I've been looking for documentation of what has changed with this vector reduction. The one note I find in the ifort release notes is about the somewhat curious addition of these reduction clauses to the legacy !dir$ simd directive in 15.0.1.

In the basic case, the reductions, using f77 code and directive, are equivalent to f90 maxval and minval, so the latter seem preferable, and I almost wonder why so much fuss has been made about directives which 15.0.3 seems to show aren't needed for the single thread case.
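A minimal sketch of the equivalence described above, assuming a plain real array with hypothetical test data (the program name, array size, and values are illustrative only). It compares an f77-style loop carrying the omp simd reduction clauses against the f90 maxval and minval intrinsics:

```fortran
program max_reduction_demo
  implicit none
  integer, parameter :: n = 1000
  real :: a(n), xmax, xmin
  integer :: i

  ! Arbitrary test data (hypothetical values).
  do i = 1, n
    a(i) = sin(real(i))
  end do

  ! f77-style loop. The directive is treated as a comment unless
  ! OpenMP (or OpenMP SIMD) processing is enabled at compile time.
  xmax = a(1)
  xmin = a(1)
  !$omp simd reduction(max: xmax) reduction(min: xmin)
  do i = 2, n
    xmax = max(xmax, a(i))
    xmin = min(xmin, a(i))
  end do

  ! f90 intrinsics should select the same values.
  if (xmax /= maxval(a) .or. xmin /= minval(a)) error stop 'mismatch'
  print *, 'max =', xmax, ' min =', xmin
end program max_reduction_demo
```

Since max and min only select existing elements, the two forms should agree exactly here, which is why the intrinsic form looks preferable when no directive is needed.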

In 15.0.0, the omp simd reduction(max: directive seemed to be needed to optimize, even with /fp:fast set. In 15.0.3, the omp simd directive appears to make no difference under /fp:fast (the default). Under more conservative settings, e.g. /fp:source, the directive is needed to maintain vectorization (by overriding the changed /fp: setting), but even without vectorization the ifort code seems fairly efficient. I'm wondering how the vectorized result could differ; in the case of ties, a different array element might be picked, but since it compares equal, that makes no difference.

Another change which I hadn't noticed is that while 15.0.0 claimed only OpenMP 3.1 compliance, 15.0.3 claims OpenMP 4.0, according to the date value embedded in _OPENMP. This surprised me, as it used to be said that full OpenMP 4.0 compliance wasn't planned in a 15.0 update. Previously, it appeared to be necessary to test __INTEL_COMPILER >= 1500 to detect a compiler version which could use omp simd reduction(max:.
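As a rough sketch of that kind of detection, assuming the source is run through the preprocessor (for example a .F90 file): the _OPENMP date value is 201307 for OpenMP 4.0 and 201107 for 3.1. The program name and messages are hypothetical.

```fortran
program omp_version_check
  implicit none
  ! _OPENMP is only defined when OpenMP processing is enabled.
#if defined(_OPENMP) && _OPENMP >= 201307
  print *, 'compiler claims OpenMP 4.0 or later'
#elif defined(__INTEL_COMPILER) && __INTEL_COMPILER >= 1500
  ! Fallback test mentioned in the post for earlier 15.0 builds.
  print *, 'ifort 15.0 or later, OpenMP 4.0 not claimed'
#elif defined(_OPENMP)
  print *, 'OpenMP enabled, older than 4.0'
#else
  print *, 'OpenMP not enabled'
#endif
end program omp_version_check
```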

The net result is that the 15.0 updates have improved the opportunities to see vector-optimized max and min reductions.

In 15.0.3, !$omp simd still doesn't work as !dir$ simd does to suppress generation of a temporary array, so in ifort the legacy directive has to be used for that purpose, as well as for the cases involving firstprivate (which isn't accepted on omp simd in OpenMP 4). So, in order to write optimized code which is portable, e.g. between ifort and gfortran, we still need conditional compilation to invoke !dir$ simd for ifort and !$omp simd for e.g. gfortran (in the smaller number of cases where gfortran will generate an undesired temporary). I've never been able to get an explanation of why ifort couldn't interpret this the same way as other compilers.
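The conditional compilation might look roughly like the sketch below, assuming preprocessing is enabled (e.g. a .F90 file). The subroutine name and loop body are hypothetical stand-ins and do not reproduce a case where a temporary is actually generated; only the directive selection is the point.

```fortran
subroutine scale_add(n, a, b, c)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: b(n), c(n)
  real, intent(inout) :: a(n)
  integer :: i

  ! Pick the legacy Intel directive for ifort, the OpenMP 4 one elsewhere.
#ifdef __INTEL_COMPILER
!dir$ simd
#else
!$omp simd
#endif
  do i = 1, n
    a(i) = a(i) + b(i) * c(i)
  end do
end subroutine scale_add

program simd_select_demo
  implicit none
  real :: a(4), b(4), c(4)
  a = 1.0
  b = 2.0
  c = 3.0
  call scale_add(4, a, b, c)
  print *, a   ! each element is 1 + 2*3
end program simd_select_demo
```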

Steven_L_Intel1 · 04-29-2015

Hi, Tim...

I will pass your questions on to others here who might be able to explain what changed. I didn't expect the _OPENMP version to change - we don't yet support ALL of OpenMP 4.0, as you probably know - we're missing user-defined reductions (not least because these are still ill-defined for Fortran).

TimP · 04-29-2015

Thanks for the comments.

From a traditional user's point of view, I suppose it's hard to guess whether user-defined reductions could be counted on as an improvement. The possible cases I have in mind may be handled with an omp parallel reduction in the outer loop and an omp simd reduction or Fortran reduction intrinsic in the inner loop, and perhaps it's more straightforward that way.
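A minimal sketch of that combination, assuming a hypothetical 2-D array whose global maximum is wanted: the outer loop carries the omp parallel reduction and the inner reduction over each column is left to the maxval intrinsic.

```fortran
program nested_max
  implicit none
  integer, parameter :: m = 64, n = 100
  real :: a(m, n), gmax
  integer :: i, j

  ! Arbitrary test data (hypothetical values).
  do j = 1, n
    do i = 1, m
      a(i, j) = real(mod(i * j, 97))
    end do
  end do

  gmax = -huge(gmax)
  ! Outer loop: threaded max reduction over columns.
  !$omp parallel do reduction(max: gmax)
  do j = 1, n
    ! Inner reduction over a contiguous column via the intrinsic.
    gmax = max(gmax, maxval(a(:, j)))
  end do
  !$omp end parallel do

  if (gmax /= maxval(a)) error stop 'mismatch'
  print *, 'global max =', gmax
end program nested_max
```

Without OpenMP enabled the directives are just comments and the program runs serially with the same result.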

If the comment about "ill-defined for Fortran" refers to the choice between array operations and f77 code which works with omp, similar questions arise in C++ and in the announced effort to combine part of OpenMP 4 with Cilk(tm) Plus. If omp workshare is relevant in this respect, then even with the initial efforts made to implement it in current ifort, it doesn't match the performance of omp parallel do in my examples.

Steven_L_Intel1 · 04-29-2015

By ill-defined I mean that the syntax isn't well described and there are no examples.