Intel® Fortran Compiler

local type(c_ptr) in a multithreaded code

Alfredo
Beginner

Hello,

I'm having trouble with a type(c_ptr) variable declared inside a routine that is called by multiple threads at the same time. I managed to reproduce the problem in a small OpenMP example (the actual code is multithreaded with pthreads instead). Take these two files:

main.f90

program main

  !$omp parallel do
  do i=1, 4
     call sub
  end do
  !$omp end parallel do

  stop
end program main

and sub.f90

subroutine sub
  use iso_c_binding
  use omp_lib
  
  type(c_ptr), target :: a
  integer, target :: b

  write(*,*)omp_get_thread_num(),c_loc(a),c_loc(b)

  return
end subroutine sub

If I compile like this

ifort -c sub.f90; ifort -openmp sub.o main.f90

and execute the program with 4 threads I get this result

           2               7014976       139788471495016
           1               7014976       139788475693416
           3               7014976       139788467296616
           0               7014976       140734651561704

which shows that the address of the variable b is different for each thread (as it should be), whereas the address of a is the same for all of them. As far as I understand the OpenMP specification, this behavior is not correct. Am I right?

If I compile the sub.f90 file with the -openmp flag, then the code works as expected.

Everything works ok with gfortran (with and without the -fopenmp flag for compiling sub.f90).

Am I missing something?

Thanks,

alfredo

1 Solution
Ron_Green
Moderator

This has very little to do with the OpenMP standard. Rather, the issue is whether local variables default to AUTOMATIC or SAVE. See the documentation for these options:

-auto  and -auto-scalar

-recursive

-save

-auto-scalar applies to scalars of the intrinsic types INTEGER, REAL, COMPLEX, and LOGICAL (hence B in your sample, but NOT A, since it is not of an intrinsic type). To make variables of non-intrinsic types such as A automatic, you need to use -auto, -recursive, or -openmp, all of which set -auto.

What other compilers do with respect to auto vs. save for non-intrinsic types is irrelevant. Ours requires one of the three options listed above.

Personally, I don't like to rely on compiler options. I would have used the RECURSIVE keyword on the subroutine declaration, like this:

recursive subroutine sub
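
For illustration, here is a minimal sketch of the routine from the question with the RECURSIVE keyword applied. The module wrapper sub_mod, the two output arguments, and the use of TRANSFER to turn the C addresses into integers are assumptions added so that a caller can compare the addresses; none of them appear in the original sub.f90.

```fortran
module sub_mod
  use iso_c_binding, only: c_ptr, c_loc, c_null_ptr, c_intptr_t
  implicit none
contains

  ! RECURSIVE declares that several instances of the routine may be active
  ! at once, so locals without SAVE get a separate copy per invocation.
  recursive subroutine sub(addr_a, addr_b)
    integer(c_intptr_t), intent(out) :: addr_a, addr_b
    ! No initializers on these declarations: writing
    ! "type(c_ptr), target :: a = c_null_ptr" would implicitly give a
    ! the SAVE attribute and make it shared between invocations again.
    type(c_ptr), target :: a
    integer, target :: b

    a = c_null_ptr
    b = 0
    ! Reinterpret the C addresses as integers so the caller can compare them.
    addr_a = transfer(c_loc(a), addr_a)
    addr_b = transfer(c_loc(b), addr_b)
  end subroutine sub

end module sub_mod
```

Called from the parallel loop in main.f90, each thread should then see its own address for a as well as for b, whether or not this file is compiled with -openmp.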

ron

Alfredo
Beginner

Ronald,

thanks for your very clear and detailed response. The OpenMP standard says that each thread should get a private copy of locally declared variables, which is probably why the -openmp flag also implies -auto, right? For some reason I thought this would be the case even if the -openmp flag was not specified. This worries me, because it essentially means that any routine working on non-intrinsic types is not thread-safe, correct?

Does declaring routines recursive just to make them thread-safe have any implications other than making all the variables automatic? Will this make the code any slower or prevent the compiler from doing some optimizations?

Kind regards,

alfredo

TimP
Honored Contributor III

The effects of -auto and recursive procedure declarations should be effectively the same.  You could save .s files and compare, if you're curious.

It's conceivable (more so in the distant past) that a single-threaded application could run faster with local SAVEd arrays, but that won't work when threads need their own copies. One case I can think of, local constant arrays, might be better written as PARAMETER arrays, which could save some time on procedure entry.
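
As a sketch of the PARAMETER suggestion, the function below keeps its constant table as a named constant instead of an initialized local array. The names weights_mod, weighted_sum, and w are hypothetical and chosen only for this example.

```fortran
module weights_mod
  implicit none
contains

  pure function weighted_sum(x) result(s)
    real, intent(in) :: x(3)
    real :: s
    ! A named constant: it cannot be modified, so threads can share it safely.
    ! By contrast, "real :: w(3) = [0.25, 0.5, 0.25]" would declare an
    ! implicitly SAVEd variable shared by every invocation.
    real, parameter :: w(3) = [0.25, 0.5, 0.25]

    s = sum(w * x)
  end function weighted_sum

end module weights_mod
```

Because w is a constant rather than a variable, the question of automatic versus static storage does not arise for it, which is why the routine is safe to call from several threads regardless of the -save or -auto settings.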

Local scalars normally default to automatic when -save isn't set, which should improve ability to optimize.  For this reason, failing to specify private doesn't necessarily cause threads to step on each other.  Apparently, target may cause such unsafe code to fail.

Alfredo
Beginner

Tim,

thanks a lot for these details.

Kind regards,

alfredo