OpenMP and IPO

briardew · ‎03-09-2012

I'm using MacOS 10.6.8 and Intel Fortran 11.1-046. I have a code that looks like

  real aa(MM,NITS)

!$omp parallel do
  do it = 1,NITS
    call mysub( aa(:,it) )
  end do
!$omp end parallel do

This works great if NITS > 1 and I compile with -openmp -ipo. It also works great with NITS = 1 and just -openmp. But it gives me nonsense for NITS = 1 with -openmp -ipo. I suspect this is because IPO is removing the do loop and OpenMP is then trying to parallelize something in mysub. Is this right? If so, how do I stop it from happening?

jimdempseyatthecove · ‎03-09-2012

Is NITS an integer parameter or an integer variable?
If it is a variable, can the compiler optimizer determine its value?

Jim Dempsey

briardew · ‎03-09-2012

It's a parameter.

jimdempseyatthecove · ‎03-10-2012

>>It's a parameter.

If you can get a simple reproducer then send it in.

This would be a problem with the optimizer.
Two workarounds come to mind:

a) When the iteration count is .lt. x (you decide what x is), bypass the OpenMP parallel loop and run the loop directly. Note that OpenMP also has a clause for this (the if clause), so you can experiment with that too.

b) Make the iteration limit a variable whose value is unknown to the compiler (including to IPO, where it snoops around other files).

Jim Dempsey
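
A minimal sketch of workaround (a) using the OpenMP if clause, for anyone who wants to try it. The names mm and min_par and the loop body are made up for illustration; they are not from the original code. Because nits is a parameter here, the compiler can evaluate the condition at compile time, so whether this actually sidesteps the IPO problem is untested and would need checking against the affected compiler versions.

```fortran
program if_clause_sketch
  implicit none

  integer, parameter :: nits    = 1    ! trip count, as in the question
  integer, parameter :: mm      = 10   ! hypothetical leading dimension
  integer, parameter :: min_par = 2    ! hypothetical threshold ("x" above)

  integer :: it
  real    :: aa(mm,nits)

  ! When the if condition is false, the region runs on a single thread,
  ! so the loop is effectively serial for small trip counts.
!$omp parallel do if(nits >= min_par)
  do it = 1,nits
    aa(:,it) = real(it)                ! stand-in for call mysub( aa(:,it) )
  end do
!$omp end parallel do

  print *, 'aa(1,:) = ', aa(1,:)
end program if_clause_sketch
```

The explicit alternative is an ordinary if/else around two copies of the loop, one with the directive and one without; the clause just avoids duplicating the loop body.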

briardew · ‎07-13-2012

Thanks for the response and sorry for the long reply time.

I managed to reduce my problem to something simpler. I can reproduce this with 11.1 and Composer 2011 SP1 11.344. Here is the code:

program omptest
  use omp_lib

  integer, parameter :: nits = 1
  integer, parameter :: ldx  = 10

  integer :: it
  real    :: xxen(ldx,nits), xx(ldx)

  xx = 1.0

!$omp parallel
!$omp single
  print *, 'Using OpenMP with ', omp_get_num_threads(), ' threads.'
  print *, '---'
!$omp end single
!$omp end parallel

!$omp parallel do
  do it = 1,nits
    print *, 'it = ',it
    xxen(:,it) = xx(:)
  end do
!$omp end parallel do

  print *, 'xxen = ',xxen
end program

If I compile with -fast and -openmp, it prints the iteration number N times, where N is the number of threads. If I take out the -ipo option that -fast implies (but keep -O3 -xHost, etc.), the line is printed once, as it should be. Also, if I compile with -fast and -openmp and nits is greater than 1, I get correct behavior. Finally, if xx is a scalar instead of an array, everything works correctly.

Thanks,
Brad
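
One possible way to try workaround (b) on this reproducer is to take the trip count from somewhere the compiler cannot see at build time. The sketch below reads it from the first command-line argument; the names nloop, arg and ios are assumptions added for illustration, and xxen is made allocatable so that its size follows the runtime value. This has not been verified against the compilers mentioned in the thread, so treat it as an experiment, not a confirmed fix.

```fortran
program runtime_limit_sketch
  implicit none

  integer, parameter :: ldx = 10

  integer            :: nloop, it, ios
  character(len=16)  :: arg
  real               :: xx(ldx)
  real, allocatable  :: xxen(:,:)

  ! Hypothetical: read the trip count from the first command-line
  ! argument, falling back to 1 if it is missing or unreadable.
  nloop = 1
  call get_command_argument(1, arg)
  if (len_trim(arg) > 0) then
    read(arg, *, iostat=ios) nloop
    if (ios /= 0 .or. nloop < 1) nloop = 1
  end if

  allocate(xxen(ldx,nloop))
  xx = 1.0

  ! Same loop as the reproducer, but the upper bound is now a
  ! runtime value instead of a compile-time parameter.
!$omp parallel do
  do it = 1,nloop
    print *, 'it = ',it
    xxen(:,it) = xx(:)
  end do
!$omp end parallel do

  print *, 'xxen = ',xxen
end program runtime_limit_sketch
```

Running the executable with no argument (or with 1) exercises the single-iteration case; if 'it = 1' is still printed once per thread, the problem is not tied to the constant trip count alone.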