<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic local type(c_ptr) in a multithreaded code in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056479#M116377</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I'm having a few issues with a type(c_ptr) variable declared inside a routine which is called by multiple threads at the same time. I managed to reproduce the problem in a small code with OpenMP (the actual code is multithreaded with pthreads instead). Take these two files:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;main.f90&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program main

  !$omp parallel do
  do i=1, 4
     call sub
  end do
  !$omp end parallel do

  stop
end program main&lt;/PRE&gt;

&lt;P&gt;and &lt;STRONG&gt;sub.f90&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;subroutine sub
  use iso_c_binding
  use omp_lib
  
  type(c_ptr), target :: a
  integer, target :: b

  write(*,*)omp_get_thread_num(),c_loc(a),c_loc(b)

  return
end subroutine sub
&lt;/PRE&gt;

&lt;P&gt;If I compile like this&lt;/P&gt;

&lt;P&gt;ifort -c sub.f90; ifort -openmp sub.o main.f90&lt;/P&gt;

&lt;P&gt;and execute the program with 4 threads I get this result&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 7014976 &amp;nbsp; &amp;nbsp; &amp;nbsp; 139788471495016&lt;BR /&gt;
		&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 7014976 &amp;nbsp; &amp;nbsp; &amp;nbsp; 139788475693416&lt;BR /&gt;
		&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 7014976 &amp;nbsp; &amp;nbsp; &amp;nbsp; 139788467296616&lt;BR /&gt;
		&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 7014976 &amp;nbsp; &amp;nbsp; &amp;nbsp; 140734651561704&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;which shows that the address of the b variable is different for each thread (as it should be), but the address of a is the same for all of them. As far as I understand the OpenMP specification, this behavior is not correct. Am I right?&lt;/P&gt;

&lt;P&gt;If I compile the sub.f90 file with the -openmp flag, then the code works as expected.&lt;/P&gt;

&lt;P&gt;Everything works ok with gfortran (with and without the -fopenmp flag for compiling sub.f90).&lt;/P&gt;

&lt;P&gt;Am I missing something?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;/P&gt;

&lt;P&gt;alfredo&lt;/P&gt;</description>
    <pubDate>Mon, 07 Jul 2014 13:10:55 GMT</pubDate>
    <dc:creator>Alfredo</dc:creator>
    <dc:date>2014-07-07T13:10:55Z</dc:date>
    <item>
      <title>local type(c_ptr) in a multithreaded code</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056479#M116377</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I'm having a few issues with a type(c_ptr) variable declared inside a routine which is called by multiple threads at the same time. I managed to reproduce the problem in a small code with OpenMP (the actual code is multithreaded with pthreads instead). Take these two files:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;main.f90&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program main

  !$omp parallel do
  do i=1, 4
     call sub
  end do
  !$omp end parallel do

  stop
end program main&lt;/PRE&gt;

&lt;P&gt;and &lt;STRONG&gt;sub.f90&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;subroutine sub
  use iso_c_binding
  use omp_lib
  
  type(c_ptr), target :: a
  integer, target :: b

  write(*,*)omp_get_thread_num(),c_loc(a),c_loc(b)

  return
end subroutine sub
&lt;/PRE&gt;

&lt;P&gt;If I compile like this&lt;/P&gt;

&lt;P&gt;ifort -c sub.f90; ifort -openmp sub.o main.f90&lt;/P&gt;

&lt;P&gt;and execute the program with 4 threads I get this result&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 7014976 &amp;nbsp; &amp;nbsp; &amp;nbsp; 139788471495016&lt;BR /&gt;
		&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 7014976 &amp;nbsp; &amp;nbsp; &amp;nbsp; 139788475693416&lt;BR /&gt;
		&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 7014976 &amp;nbsp; &amp;nbsp; &amp;nbsp; 139788467296616&lt;BR /&gt;
		&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 7014976 &amp;nbsp; &amp;nbsp; &amp;nbsp; 140734651561704&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;which shows that the address of the b variable is different for each thread (as it should be), but the address of a is the same for all of them. As far as I understand the OpenMP specification, this behavior is not correct. Am I right?&lt;/P&gt;

&lt;P&gt;If I compile the sub.f90 file with the -openmp flag, then the code works as expected.&lt;/P&gt;

&lt;P&gt;Everything works ok with gfortran (with and without the -fopenmp flag for compiling sub.f90).&lt;/P&gt;

&lt;P&gt;Am I missing something?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;/P&gt;

&lt;P&gt;alfredo&lt;/P&gt;</description>
      <pubDate>Mon, 07 Jul 2014 13:10:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056479#M116377</guid>
      <dc:creator>Alfredo</dc:creator>
      <dc:date>2014-07-07T13:10:55Z</dc:date>
    </item>
    <item>
      <title>This has very little to do</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056480#M116378</link>
      <description>&lt;P&gt;This has very little to do with the OpenMP standard. Rather, it is a question of whether AUTOMATIC or SAVE is in effect for local variables. You can read the documentation on:&lt;/P&gt;

&lt;P&gt;-auto&amp;nbsp; and -auto-scalar&lt;/P&gt;

&lt;P&gt;-recursive&lt;/P&gt;

&lt;P&gt;-save&lt;/P&gt;

&lt;P&gt;-auto-scalar applies to scalar variables of the intrinsic types INTEGER, REAL, COMPLEX, and LOGICAL (hence B in your sample, but NOT A, since type(c_ptr) is not an intrinsic type). To set AUTO for non-intrinsic types such as A, you need to use -auto, -recursive, or -openmp, all of which set -auto.&lt;/P&gt;

&lt;P&gt;What other compilers do with respect to auto vs. save for non-intrinsic types is irrelevant. Ours requires one of the 3 options listed above.&lt;/P&gt;

&lt;P&gt;Personally, I don't like to rely on compiler options.&amp;nbsp; I'd have used the RECURSIVE keyword on the subroutine declaration thusly:&lt;/P&gt;

&lt;P&gt;recursive subroutine sub&lt;/P&gt;
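
&lt;P&gt;As a minimal sketch, this is roughly what sub.f90 from the original post would look like with that keyword. It assumes the RECURSIVE prefix makes both locals automatic, as -auto would. The implicit none and the transfer() to c_intptr_t are additions not in the original; transfer() is there only so the addresses print as plain integers.&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;! Sketch only: sub.f90 with the RECURSIVE prefix.
recursive subroutine sub
  use iso_c_binding
  use omp_lib
  implicit none

  type(c_ptr), target :: a   ! non-intrinsic type: not covered by -auto-scalar
  integer, target :: b       ! intrinsic scalar: covered by -auto-scalar

  ! Print the thread number and the address of each local as an integer.
  write(*,*) omp_get_thread_num(), transfer(c_loc(a), 0_c_intptr_t), transfer(c_loc(b), 0_c_intptr_t)
end subroutine sub&lt;/PRE&gt;

&lt;P&gt;Compiled without -openmp as in the original post, each thread would then be expected to report a different address for a as well as for b.&lt;/P&gt;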

&lt;P&gt;ron&lt;/P&gt;</description>
      <pubDate>Mon, 07 Jul 2014 18:59:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056480#M116378</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2014-07-07T18:59:24Z</dc:date>
    </item>
    <item>
      <title>Ronald,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056481#M116379</link>
      <description>&lt;P&gt;Ronald,&lt;/P&gt;

&lt;P&gt;thanks for your very clear and detailed response. The OpenMP standard says that each thread should get a private copy of locally declared variables, which is probably why the -openmp flag also implies -auto, right? For some reason I thought this would be the case even if the -openmp flag was not specified. I am rather scared by this, because it essentially means that any routine working on non-intrinsic types is not thread-safe by default, correct?&lt;/P&gt;

&lt;P&gt;Does declaring routines recursive just to make them thread-safe have implications beyond making all the variables automatic? Will it make the code any slower or prevent the compiler from doing some optimizations?&lt;/P&gt;

&lt;P&gt;Kind regards,&lt;/P&gt;

&lt;P&gt;alfredo&lt;/P&gt;</description>
      <pubDate>Mon, 07 Jul 2014 20:00:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056481#M116379</guid>
      <dc:creator>Alfredo</dc:creator>
      <dc:date>2014-07-07T20:00:20Z</dc:date>
    </item>
    <item>
      <title>The effects of -auto and</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056482#M116380</link>
      <description>&lt;P&gt;The effects of -auto and of a recursive procedure declaration should be effectively the same. If you're curious, you could save the generated .s files and compare them.&lt;/P&gt;

&lt;P&gt;It's conceivable (more so in the distant past) that a single-threaded application could run faster with local SAVEd arrays, but that won't work when threads need their own copies. One case I can think of, local constant arrays, might better be written with PARAMETER arrays, which could save some time on entry to the procedure.&lt;/P&gt;
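
&lt;P&gt;A rough sketch of the PARAMETER idea, using a made-up routine (smooth3 and the weights w are hypothetical, chosen only for illustration): the constant array is a named constant rather than a saved or automatic local, so there is no per-thread copy to set up.&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;! Hypothetical example: weighted sum of three values.
subroutine smooth3(x, y)
  implicit none
  real, intent(in)  :: x(3)
  real, intent(out) :: y
  ! Named constant instead of a SAVEd or automatic local array:
  ! nothing is shared between threads and nothing is filled in at run time.
  real, parameter :: w(3) = [0.25, 0.5, 0.25]

  y = sum(w * x)
end subroutine smooth3&lt;/PRE&gt;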

&lt;P&gt;Local scalars normally default to automatic when -save isn't set, which should improve the compiler's ability to optimize. For this reason, failing to specify private doesn't necessarily cause threads to step on each other. Apparently, the target attribute may cause such unsafe code to fail.&lt;/P&gt;</description>
      <pubDate>Mon, 07 Jul 2014 21:30:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056482#M116380</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2014-07-07T21:30:25Z</dc:date>
    </item>
    <item>
      <title>Tim,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056483#M116381</link>
      <description>&lt;P&gt;Tim,&lt;/P&gt;

&lt;P&gt;thanks a lot for these details.&lt;/P&gt;

&lt;P&gt;Kind regards,&lt;/P&gt;

&lt;P&gt;alfredo&lt;/P&gt;</description>
      <pubDate>Tue, 08 Jul 2014 09:41:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/local-type-c-ptr-in-a-multithreaded-code/m-p/1056483#M116381</guid>
      <dc:creator>Alfredo</dc:creator>
      <dc:date>2014-07-08T09:41:37Z</dc:date>
    </item>
  </channel>
</rss>

