<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SEGFAULT with OpenMP. Stack Problem? in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753258#M9038</link>
    <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
You didn't specify your compiler options. &lt;BR /&gt;&lt;BR /&gt;Try adding this before the outer loop:&lt;BR /&gt;&lt;BR /&gt;cDEC$ NOVECTOR&lt;BR /&gt;&lt;BR /&gt;to disable vectorization of the loop nest. Does -vec-report indicate that any of the loops in question are vectorized?&lt;BR /&gt;&lt;BR /&gt;I notice that the RHS and LHS of the assignment statements overlap, so some really large array temporaries may be created in the vectorized case.&lt;BR /&gt;&lt;BR /&gt;ron&lt;BR /&gt;</description>
    <pubDate>Wed, 25 Feb 2009 19:48:57 GMT</pubDate>
    <dc:creator>Ron_Green</dc:creator>
    <dc:date>2009-02-25T19:48:57Z</dc:date>
    <item>
      <title>SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753255#M9035</link>
      <description>Hello All,&lt;BR /&gt;&lt;BR /&gt;I have what appears to be a common segfault problem with OpenMP, but&lt;BR /&gt;ulimit and KMP_STACKSIZE (and the variants I've seen based&lt;BR /&gt;on these) seem to have no effect. I wonder if anyone has a suggestion about this.&lt;BR /&gt;&lt;BR /&gt;Background: System Q9650 with 8 GB running OpenSuse 11.1 x64. Compiler: ifort 11.0.&lt;BR /&gt;Code: Fortran 77, originally written to run on Cray (YMP and later), being modified&lt;BR /&gt;to run with OpenMP on Intel processors.&lt;BR /&gt;&lt;BR /&gt;Problem: segfaults when memory use increases. Remedies found via web search appear&lt;BR /&gt;ineffective.&lt;BR /&gt;&lt;BR /&gt;The code below is a minimal version that reproduces the problem. It was identified and&lt;BR /&gt;extracted from the program.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;[plain]      Common / WORK / W1(20,20,21,21),W2(20,20,21,21),W3(20,20,21,21)&lt;BR /&gt;&lt;BR /&gt;....&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;c  NOTE:   N1= 64   N2=N3=21     M2=M3=10&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;!$OMP  PARALLEL DEFAULT(SHARED)&lt;BR /&gt;!$OMP+ PRIVATE( Iw, JEL, KEL, JGLL, KGLL, W1, W2, W3 )&lt;BR /&gt;!$OMP DO&lt;BR /&gt;      Do Iw = 1, N1&lt;BR /&gt;c&lt;BR /&gt;c - Interface condition of the PRESSURE GRADIENT.&lt;BR /&gt;c&lt;BR /&gt;      Do KEL  = 2, N3&lt;BR /&gt;      Do JEL  = 1, N2&lt;BR /&gt;      Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;      W1(JGLL,1,JEL,KEL) = W1(JGLL,1,JEL,KEL) + W1(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;      W2(JGLL,1,JEL,KEL) = W2(JGLL,1,JEL,KEL) + W2(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;      W3(JGLL,1,JEL,KEL) = W3(JGLL,1,JEL,KEL) + W3(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;!$OMP END DO&lt;BR /&gt;!$OMP END PARALLEL&lt;BR /&gt;[/plain]&lt;/PRE&gt;
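For reference, the stack-related knobs mentioned above are usually set from the shell before launching the program. This is a minimal sketch; the 512M value is an arbitrary example, not a figure from this thread:

```shell
# Raise the initial thread's stack limit; ignore the failure quietly
# if the hard limit forbids raising it.
ulimit -s unlimited 2>/dev/null || true

# Per-thread stacks for the OpenMP worker threads. KMP_STACKSIZE is the
# Intel-runtime-specific name; OMP_STACKSIZE is the standard name from
# OpenMP 3.0 onward. 512M is only an example value.
export KMP_STACKSIZE=512M
export OMP_STACKSIZE=512M
```

Note that ulimit governs only the initial thread; worker threads take their stacks from KMP_STACKSIZE / OMP_STACKSIZE, so changing one without the other can leave the segfault in place.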
&lt;BR /&gt;&lt;BR /&gt;Any ideas what may be the matter?&lt;BR /&gt;&lt;BR /&gt;Thank you.&lt;BR /&gt;--&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Feb 2009 11:27:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753255#M9035</guid>
      <dc:creator>arktos</dc:creator>
      <dc:date>2009-02-25T11:27:42Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753256#M9036</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;One of the problems I see is that you are making PRIVATE copies of arrays contained in COMMON (very large arrays, too).&lt;BR /&gt;&lt;BR /&gt;Try something like this:&lt;BR /&gt;
&lt;PRE&gt;[cpp]! in module
type    TypeThreadContext
  SEQUENCE
  REAL, pointer :: W1(:,:,:,:)
  REAL, pointer :: W2(:,:,:,:)
  REAL, pointer :: W3(:,:,:,:)
end type TypeThreadContext

type(TypeThreadContext) :: ThreadContext
COMMON /CONTEXT/ ThreadContext
!$OMP THREADPRIVATE(/CONTEXT/)
-----------------------------
! in initialization code once only
!$OMP PARALLEL
  if(.not. associated(ThreadContext%W1)) allocate(ThreadContext%W1(20,20,21,21))
  if(.not. associated(ThreadContext%W2)) allocate(ThreadContext%W2(20,20,21,21))
  if(.not. associated(ThreadContext%W3)) allocate(ThreadContext%W3(20,20,21,21))
!$OMP END PARALLEL
---------------------------

! Don't forget to deallocate in your cleanup code.
! At the time this was written the compiler accepted POINTER components
! but not ALLOCATABLE ones in THREADPRIVATE data; newer compilers may
! also accept allocatables here.
[/cpp]&lt;/PRE&gt;
&lt;BR /&gt;Jim Dempsey&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Feb 2009 16:47:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753256#M9036</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2009-02-25T16:47:02Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753257#M9037</link>
      <description>&lt;BR /&gt;&lt;BR /&gt;Thanks Jim. I will try this and see how it goes.&lt;BR /&gt;--&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Feb 2009 19:12:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753257#M9037</guid>
      <dc:creator>arktos</dc:creator>
      <dc:date>2009-02-25T19:12:54Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753258#M9038</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
You didn't specify your compiler options. &lt;BR /&gt;&lt;BR /&gt;Try adding this before the outer loop:&lt;BR /&gt;&lt;BR /&gt;cDEC$ NOVECTOR&lt;BR /&gt;&lt;BR /&gt;to disable vectorization of the loop nest. Does -vec-report indicate that any of the loops in question are vectorized?&lt;BR /&gt;&lt;BR /&gt;I notice that the RHS and LHS of the assignment statements overlap, so some really large array temporaries may be created in the vectorized case.&lt;BR /&gt;&lt;BR /&gt;ron&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Feb 2009 19:48:57 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753258#M9038</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2009-02-25T19:48:57Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753259#M9039</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/415968"&gt;arktos&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;&lt;BR /&gt;Thanks Jim. I will try this and see how it goes.&lt;BR /&gt;--&lt;BR /&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;One other point to make.&lt;BR /&gt;&lt;BR /&gt;If you are using nested parallel regions you will have to initialize the W1, W2, ... allocations in those threads as well.&lt;BR /&gt;It might not hurt to make the allocation a subroutine and call it from the code that uses W1, ...&lt;BR /&gt;&lt;BR /&gt;subroutine foo(...&lt;BR /&gt; ! near top&lt;BR /&gt; if(.not. associated(W1)) call InitW1234() ! some code like this&lt;BR /&gt; do i = 1, yourBigLoop&lt;BR /&gt; ... ! code using W1, ...&lt;BR /&gt; end do&lt;BR /&gt; ! leave W1, W2, ... allocated for next time&lt;BR /&gt;end subroutine foo&lt;BR /&gt;&lt;BR /&gt;The if(.not. associated(...)) check is a lightweight test (a single integer word being tested).&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Feb 2009 21:55:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753259#M9039</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2009-02-25T21:55:07Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753260#M9040</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Also,&lt;BR /&gt;&lt;BR /&gt;Rudra (a frequent poster) sent me some code that exhibited a stack-allocation problem with OpenMP on my system (Windows XP x64). He did not report this as a problem on Linux x??, but it may have been a latent problem waiting to happen.&lt;BR /&gt;&lt;BR /&gt;This problem occurred in the startup code, i.e. the error would occur _prior_ to reaching the first statement in main. I couldn't fix it by futzing with OMP_STACKSIZE. Some sort of compiler problem. A rearrangement of code got it working.&lt;BR /&gt;&lt;BR /&gt;Is your problem occurring before you can reach the first statement in your program?&lt;BR /&gt;I may have some suggestions to work around that, if that is your problem.&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Wed, 25 Feb 2009 22:01:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753260#M9040</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2009-02-25T22:01:10Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753261#M9041</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/160574"&gt;Ronald Green (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; I notice the RHS and LHS in the assignment statements overlap, so some really large array temporaries may be being used in the vectorized case.&lt;BR /&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
I don't see how there can be an overlap, unless one of N3, N2, M2, M3 were out of bounds. Would the compilation behave better if all were compile-time constants (e.g. PARAMETER)? You certainly don't want array temporaries in a case like this. If M2 is only 10, and the compiler vectorizes while optimizing for 20 but has to treat it as a variable, vectorization probably slows things down even without temporaries, so NOVECTOR is a reasonable choice.&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Feb 2009 22:10:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753261#M9041</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-02-25T22:10:32Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753262#M9042</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;&lt;BR /&gt;&lt;BR /&gt;Just to add some answers to your questions.&lt;BR /&gt;Missed them last night (was 1 or 2am here).&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;@jim&lt;BR /&gt;&lt;BR /&gt;The program has executed quite a few subroutines before reaching&lt;BR /&gt;the subroutine with the problem we are discussing. One of&lt;BR /&gt;the preceeding subroutines has be multithreaded with no&lt;BR /&gt;problem (but it does not use w1, w2 &amp;amp; w3 ).&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;@ron&lt;BR /&gt;&lt;BR /&gt;ifort -c -O3 -fpp -openmp -parallel &lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;--------------------------------------------------------------------------&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
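Alongside those options, one ifort flag worth knowing when chasing stack segfaults is -heap-arrays, which makes the compiler allocate automatic arrays and array temporaries on the heap instead of the stack; -g -traceback gives a symbolic traceback at the fault. A sketch only: the 10 KB threshold and the source file name newu.f are placeholder assumptions, not values from this thread.

```shell
# Hypothetical compile line; -heap-arrays, -g and -traceback are standard
# ifort options, but the 10 KB threshold and the file name are placeholders.
FFLAGS="-O3 -fpp -openmp -g -traceback -heap-arrays 10"
echo "ifort -c $FFLAGS newu.f"
```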
&lt;BR /&gt;Thank you all.&lt;BR /&gt;&lt;BR /&gt;I should have mentioned that all reals are *8.&lt;BR /&gt;&lt;BR /&gt;I have tried the following with exactly the same parameters as before:&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;[plain]!$OMP  PARALLEL DEFAULT(SHARED)&lt;BR /&gt;!$OMP+ PRIVATE( Iw,JEL,KEL,JGLL,KGLL, W1, W2, W3 , T1 )&lt;BR /&gt;&lt;BR /&gt;!$OMP DO&lt;BR /&gt;      Do Iw = 1, N1&lt;BR /&gt;c&lt;BR /&gt;c - Interface condition of the PRESSURE GRADIENT.&lt;BR /&gt;c&lt;BR /&gt;      Do KEL  = 2, N3&lt;BR /&gt;      Do JEL  = 1, N2&lt;BR /&gt;      Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;      T1 =  DFLOAT( JGLL)&lt;BR /&gt;      W1(JGLL,1,JEL,KEL) = 3.0d0*T1&lt;BR /&gt;      W2(JGLL,1,JEL,KEL) = 2.6565d0*T1*T1&lt;BR /&gt;      W3(JGLL,1,JEL,KEL) = 3.876d0*T1&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;!$OMP END DO&lt;BR /&gt;!$OMP END PARALLEL&lt;BR /&gt;[/plain]&lt;/PRE&gt;
&lt;BR /&gt;This segfaults&lt;BR /&gt;&lt;BR /&gt;I have also tried:&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;[cpp]c&lt;BR /&gt;      Do KEL  = 2, N3&lt;BR /&gt;      Do JEL  = 1, N2&lt;BR /&gt;cDEC$ NOVECTOR&lt;BR /&gt;      Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;      T1 =  DFLOAT( JGLL)&lt;BR /&gt;      W1(JGLL,1,JEL,KEL) = 3.0d0*T1&lt;BR /&gt;      W2(JGLL,1,JEL,KEL) = 2.6565d0*T1*T1&lt;BR /&gt;      W3(JGLL,1,JEL,KEL) = 3.876d0*T1&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;and &lt;BR /&gt;&lt;BR /&gt;c&lt;BR /&gt;cDEC$ NOVECTOR&lt;BR /&gt;      Do KEL  = 2, N3&lt;BR /&gt;      Do JEL  = 1, N2&lt;BR /&gt;      Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;      T1 =  DFLOAT( JGLL)&lt;BR /&gt;      W1(JGLL,1,JEL,KEL) = 3.0d0*T1&lt;BR /&gt;      W2(JGLL,1,JEL,KEL) = 2.6565d0*T1*T1&lt;BR /&gt;      W3(JGLL,1,JEL,KEL) = 3.876d0*T1&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;[/cpp]&lt;/PRE&gt;
&lt;BR /&gt;Both segfault.&lt;BR /&gt;&lt;BR /&gt;I have yet to try Jim's suggestions.&lt;BR /&gt;&lt;BR /&gt;I would like to increase the size of the arrays still further (so that I can increase my simulation&lt;BR /&gt;Reynolds number). So, I am looking for something I have proper control over.&lt;BR /&gt;&lt;BR /&gt;--&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Feb 2009 23:23:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753262#M9042</guid>
      <dc:creator>arktos</dc:creator>
      <dc:date>2009-02-25T23:23:48Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753263#M9043</link>
      <description>&lt;BR /&gt;OK.&lt;BR /&gt;&lt;BR /&gt;I have used the suggestion made by Jim and the segfault seems to be gone.&lt;BR /&gt;&lt;BR /&gt;I have also increased the arrays involved and it still seems OK.&lt;BR /&gt;&lt;BR /&gt;I still need to compare the results against a standard case to see&lt;BR /&gt;whether the numbers at the end of each step are the same.&lt;BR /&gt;&lt;BR /&gt;Thanks again Jim.&lt;BR /&gt;--&lt;BR /&gt;</description>
      <pubDate>Thu, 26 Feb 2009 19:59:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753263#M9043</guid>
      <dc:creator>arktos</dc:creator>
      <dc:date>2009-02-26T19:59:03Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753264#M9044</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/415968"&gt;arktos&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;Hello All,&lt;BR /&gt;&lt;BR /&gt;I have what it appears to be a common segfault problem with openmp, but the&lt;BR /&gt;it seems that the ulimit and KMP_STACKSIZE (and the variants i've seen based &lt;BR /&gt;on these) do not have an effect. I wonder if anyone has a suggestion about this.&lt;BR /&gt;&lt;BR /&gt;Background: System Q9650 with 8GB running OpenSuse 11.1 64. Compiler: ifort 11.0.&lt;BR /&gt;Code: Fortran 77 written originally to run on cray (ymp and later) and being modified&lt;BR /&gt;to run with openmp on intel procs.&lt;BR /&gt;&lt;BR /&gt;Problem: Segfaults when the memory use increases. Web search remedies appear&lt;BR /&gt;ineffective.&lt;BR /&gt;&lt;BR /&gt;The code below is a minimal version that produces the problem. It was identified and&lt;BR /&gt;extracted from the program.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;
&lt;/EM&gt;&lt;PRE&gt;&lt;EM&gt;[plain]      Common / WORK / W1(20,20,21,21),W2(20,20,21,21),W3(20,20,21,21)&lt;BR /&gt;&lt;BR /&gt;....&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;c  NOTE:   N1= 64   N2=N3=21     M2=M3=10&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;!$OMP  PARALLEL DEFAULT(SHARED)&lt;BR /&gt;!$OMP+ PRIVATE( Iw, JEL, KEL, JGLL, KGLL, W1, W2, W3 )&lt;BR /&gt;!$OMP DO&lt;BR /&gt;      Do Iw = 1, N1&lt;BR /&gt;c&lt;BR /&gt;c - Interface condition of the PRESSURE GRADIENT.&lt;BR /&gt;c&lt;BR /&gt;      Do KEL  = 2, N3&lt;BR /&gt;      Do JEL  = 1, N2&lt;BR /&gt;      Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;      W1(JGLL,1,JEL,KEL) = W1(JGLL,1,JEL,KEL) + W1(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;      W2(JGLL,1,JEL,KEL) = W2(JGLL,1,JEL,KEL) + W2(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;      W3(JGLL,1,JEL,KEL) = W3(JGLL,1,JEL,KEL) + W3(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;!$OMP END DO&lt;BR /&gt;!$OMP END PARALLEL&lt;BR /&gt;[/plain]&lt;/EM&gt;&lt;/PRE&gt;
&lt;BR /&gt;&lt;BR /&gt;Any ideas what may be the matter?&lt;BR /&gt;&lt;BR /&gt;Thank you.&lt;BR /&gt;--&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Arktos,&lt;BR /&gt;&lt;BR /&gt;Set aside the SEGFAULT issue for a moment and look at the code sample you presented.&lt;BR /&gt;&lt;BR /&gt;Ask yourself: what do you expect the PRIVATE clause to do for W1, W2 and W3?&lt;BR /&gt;&lt;BR /&gt;Answer: create separate arrays for use by the additional threads beyond the current thread. These additional arrays will be uninitialized.&lt;BR /&gt;&lt;BR /&gt;Ask yourself: why are you using uninitialized arrays to compute your pressure gradient?&lt;BR /&gt;&lt;BR /&gt;And why is JGLL iterating over half the cells?&lt;BR /&gt;&lt;BR /&gt;To address the first concern: you code as if the W1, W2, W3 data are to be preserved per thread, from parallel region to parallel region. If so, then these data must reside in a thread-private area (or be obtained from arrays indexed by thread number; be careful with nested OpenMP levels). I addressed the issue of thread-private data earlier.&lt;BR /&gt;&lt;BR /&gt;But wait.&lt;BR /&gt;&lt;BR /&gt;You then issue an !$OMP DO inside the parallel region. This means a portion of the DO Iw iteration space will be run by each thread, which implies that each thread works independently on a thread-specific portion of its uninitialized copy of W1, W2, W3. So you are producing parts of your result from undefined data.&lt;BR /&gt;&lt;BR /&gt;I think you do not want W1, W2, W3 as PRIVATE, but without seeing all your code it is hard to give good advice.&lt;BR /&gt;&lt;BR /&gt;Once you resolve the PRIVATE issue, then if the SEGFAULT remains, I suggest you simplify the parallel region's interaction with the main code by making the triple loop into a subroutine, placed at the bottom of the file and called from where the loop came. Note that you can pass in the arguments M2, N2, N3, W1, W2, W3.&lt;BR /&gt;</description>
      <pubDate>Fri, 27 Feb 2009 02:13:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753264#M9044</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2009-02-27T02:13:15Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753265#M9045</link>
      <description>&lt;DIV style="margin: 0px; height: auto;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;&lt;BR /&gt;Jim,&lt;BR /&gt;&lt;BR /&gt;What you say is OK, but as I mentioned in the original post, the section of code&lt;BR /&gt;that I posted here was minimal code that reproduced the problem. It does&lt;BR /&gt;no useful work as shown here.&lt;BR /&gt;&lt;BR /&gt;Originally I did not want to post the complete subroutine, so as not to place too&lt;BR /&gt;much strain on people who wanted to go through it.&lt;BR /&gt;&lt;BR /&gt;Yes, W1, W2 and W3 are temporary storage arrays and are initialised and used as shown&lt;BR /&gt;in the complete subroutine below (THREADPRIVATE may be overkill here).&lt;BR /&gt;&lt;BR /&gt;I located the problem area by selectively commenting out each loop in turn.&lt;BR /&gt;&lt;BR /&gt;I should mention that as of now I have some numerical differences in the solution and I am&lt;BR /&gt;looking into this.&lt;BR /&gt;&lt;BR /&gt;Here is the original parallel section of the subroutine:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;[plain]      Subroutine Newu(Nx,Ny,Nz,NLy,NLZ,U,V,W,P)&lt;BR /&gt;c     -----------------------------------------&lt;BR /&gt;c&lt;BR /&gt;c&lt;BR /&gt;      IMPLICIT REAL*8 (A-H,O-Z)&lt;BR /&gt;      include 'blkio.h'&lt;BR /&gt;      include 'blkdata.h'&lt;BR /&gt;      include 'blkwork.h'&lt;BR /&gt;      include 'blktable1.h'&lt;BR /&gt;      include 'blktable2.h'&lt;BR /&gt;      include 'blktable3.h'&lt;BR /&gt;c&lt;BR /&gt;c&lt;BR /&gt;      Real*8 U(NX,NY,NZ,NLY,NLZ), V(NX,NY,NZ,NLY,NLZ),&lt;BR /&gt;     .       W(NX,NY,NZ,NLY,NLZ), P(NY*NZ*NLY*NLZ,NX)&lt;BR /&gt;&lt;BR /&gt;      include 'omp_lib.h'&lt;BR /&gt;c&lt;BR /&gt;      Write(ndout,'(a)')' * Newu : IN '&lt;BR /&gt;c&lt;BR /&gt;!$OMP  PARALLEL DEFAULT(SHARED)&lt;BR /&gt;!$OMP+ PRIVATE(I,Iw,IIw,IID,JEL,KEL,JGLL,KGLL,NBLOCK,LBLK,DYH,DZH,RYZ,&lt;BR /&gt;!$OMP+         Wv,DPDX,DPDY,DPDZ,PMN,PMM,W1,W2,W3 )&lt;BR /&gt;&lt;BR /&gt;c     KEL = omp_get_max_threads()&lt;BR /&gt;c     write(6,*)' -- MAX THREADS =', KEL&lt;BR /&gt;&lt;BR /&gt;c - Go through each streamwise wavenumber.&lt;BR /&gt;&lt;BR /&gt;!$OMP DO&lt;BR /&gt;      Do Iw = 1, N1&lt;BR /&gt;c&lt;BR /&gt;      IID = IIDA( Iw )&lt;BR /&gt;c&lt;BR /&gt;      IIw = Iw&lt;BR /&gt;      If( Iw .eq. 
1 ) IIw = 2&lt;BR /&gt;c&lt;BR /&gt;      NBLOCK = 0&lt;BR /&gt;      Do KEL = 1, N3&lt;BR /&gt;      Do JEL = 1, N2&lt;BR /&gt;c&lt;BR /&gt;      NBLOCK = NBLOCK + 1&lt;BR /&gt;      LBLK   = (NBLOCK-1)*NPEL&lt;BR /&gt;c&lt;BR /&gt;      DYH = 0.5D0*DYELM(JEL,KEL)&lt;BR /&gt;      DZH = 0.5D0*DZELM(JEL,KEL)&lt;BR /&gt;c&lt;BR /&gt;c&lt;BR /&gt;c - Calculate the PRESSURE GRADIENTS.&lt;BR /&gt;c&lt;BR /&gt;      Do KGLL = 1, M3&lt;BR /&gt;      Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;      DPDX = 0.0D0&lt;BR /&gt;      DPDY = 0.0D0&lt;BR /&gt;      DPDZ = 0.0D0&lt;BR /&gt;c&lt;BR /&gt;      Do I = 1, NPEL&lt;BR /&gt;c&lt;BR /&gt;         PMN = P(LBLK+I,Iw)&lt;BR /&gt;         PMM = P(LBLK+I,IIw+IID)&lt;BR /&gt;c&lt;BR /&gt;         DPDX = DPDX + PMM*DPDXM(I,JGLL,KGLL)&lt;BR /&gt;         DPDY = DPDY + PMN*DPDYM(I,JGLL,KGLL)&lt;BR /&gt;         DPDZ = DPDZ + PMN*DPDZM(I,JGLL,KGLL)&lt;BR /&gt;c&lt;BR /&gt;      END DO&lt;BR /&gt;c&lt;BR /&gt;c - Note : no negative sign for the x derivative since both the&lt;BR /&gt;c - multiplication by the complex unity and the use of the sign&lt;BR /&gt;c - flag IID produces the correct overall sign.&lt;BR /&gt;c&lt;BR /&gt;      W1(JGLL,KGLL,JEL,KEL) = IID * DT * DYH * DZH * Wv * DPDX&lt;BR /&gt;      W2(JGLL,KGLL,JEL,KEL) =       DT * DZH * DPDY&lt;BR /&gt;      W3(JGLL,KGLL,JEL,KEL) =       DT * DYH * DPDZ&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;c&lt;BR /&gt;c - Interface condition of the PRESSURE GRADIENT.&lt;BR /&gt;c&lt;BR /&gt;      Do KEL  = 2, N3&lt;BR /&gt;      Do JEL  = 1, N2&lt;BR /&gt;      Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;      W1(JGLL,1,JEL,KEL) = W1(JGLL,1,JEL,KEL) + W1(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;      W2(JGLL,1,JEL,KEL) = W2(JGLL,1,JEL,KEL) + W2(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;      W3(JGLL,1,JEL,KEL) = W3(JGLL,1,JEL,KEL) + W3(JGLL,M3,JEL,KEL-1)&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End 
Do&lt;BR /&gt;&lt;BR /&gt;c&lt;BR /&gt;      Do KEL  = 1, N3-1&lt;BR /&gt;         Do JEL  = 1, N2&lt;BR /&gt;            Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;c              W1(JGLL,M3,JEL,KEL) = W1(JGLL,1,JEL,KEL+1)&lt;BR /&gt;c              W2(JGLL,M3,JEL,KEL) = W2(JGLL,1,JEL,KEL+1)&lt;BR /&gt;c              W3(JGLL,M3,JEL,KEL) = W3(JGLL,1,JEL,KEL+1)&lt;BR /&gt;c&lt;BR /&gt;            End Do&lt;BR /&gt;         End Do&lt;BR /&gt;      End Do&lt;BR /&gt;c&lt;BR /&gt;&lt;BR /&gt;c&lt;BR /&gt;c     Do JEL  = 2, N2&lt;BR /&gt;c     Do KEL  = 1, N3&lt;BR /&gt;c     Do KGLL = 1, M3&lt;BR /&gt;c&lt;BR /&gt;c     W1(1,KGLL,JEL,KEL) = W1(1,KGLL,JEL,KEL) + W1(M2,KGLL,JEL-1,KEL)&lt;BR /&gt;c     W2(1,KGLL,JEL,KEL) = W2(1,KGLL,JEL,KEL) + W2(M2,KGLL,JEL-1,KEL)&lt;BR /&gt;c     W3(1,KGLL,JEL,KEL) = W3(1,KGLL,JEL,KEL) + W3(M2,KGLL,JEL-1,KEL)&lt;BR /&gt;c&lt;BR /&gt;c     End Do&lt;BR /&gt;c     End Do&lt;BR /&gt;c     End Do&lt;BR /&gt;c&lt;BR /&gt;c&lt;BR /&gt;      Do JEL  = 1, N2-1&lt;BR /&gt;         Do KEL  = 1, N3&lt;BR /&gt;            Do KGLL = 1, M3&lt;BR /&gt;c              W1(M2,KGLL,JEL,KEL) = W1(1,KGLL,JEL+1,KEL)&lt;BR /&gt;c              W2(M2,KGLL,JEL,KEL) = W2(1,KGLL,JEL+1,KEL)&lt;BR /&gt;c              W3(M2,KGLL,JEL,KEL) = W3(1,KGLL,JEL+1,KEL)&lt;BR /&gt;            End Do&lt;BR /&gt;         End Do&lt;BR /&gt;      End Do&lt;BR /&gt;c&lt;BR /&gt;c&lt;BR /&gt;c - Find the NEW VELOCITIES.&lt;BR /&gt;c&lt;BR /&gt;      Do KEL = 1, N3&lt;BR /&gt;      Do JEL = 1, N2&lt;BR /&gt;c&lt;BR /&gt;      Do KGLL = 1, M3&lt;BR /&gt;      Do JGLL = 1, M2&lt;BR /&gt;c&lt;BR /&gt;c     RYZ = RBI(JGLL,KGLL,JEL,KEL)&lt;BR /&gt;c     DPDX = W1(JGLL,KGLL,JEL,KEL)&lt;BR /&gt;c     DPDY = W2(JGLL,KGLL,JEL,KEL)&lt;BR /&gt;c     DPDZ = W3(JGLL,KGLL,JEL,KEL)&lt;BR /&gt;c&lt;BR /&gt;c     U(Iw,JGLL,KGLL,JEL,KEL) = RYZ*( U(Iw,JGLL,KGLL,JEL,KEL) + DPDX )&lt;BR /&gt;c     V(Iw,JGLL,KGLL,JEL,KEL) = RYZ*(  V(Iw,JGLL,KGLL,JEL,KEL) + DPDY )&lt;BR /&gt;c     W(Iw,JGLL,KGLL,JEL,KEL) = RYZ*( 
W(Iw,JGLL,KGLL,JEL,KEL) + DPDZ )&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;      End Do&lt;BR /&gt;c&lt;BR /&gt;c&lt;BR /&gt;      End Do&lt;BR /&gt;!$OMP END DO&lt;BR /&gt;!$OMP END PARALLEL&lt;BR /&gt;c - End of Iw loop.&lt;BR /&gt;[/plain]&lt;/PRE&gt;
&lt;BR /&gt;....&lt;BR /&gt;&lt;BR /&gt; Return&lt;BR /&gt; End&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 27 Feb 2009 08:39:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753265#M9045</guid>
      <dc:creator>arktos</dc:creator>
      <dc:date>2009-02-27T08:39:29Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753266#M9046</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Arktos,&lt;BR /&gt;&lt;BR /&gt;Thread-private (persistent) copies of W1, W2, W3 would be best when Newu is called many times. This would reduce the possibility of memory fragmentation at runtime.&lt;BR /&gt;&lt;BR /&gt;Try this:&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;[cpp]Subroutine Newu(Nx,Ny,Nz,NLy,NLZ,U,V,W,P)
...
! *** remove COMMON declaration of W1, W2, W3
! add at subroutine scope
! *** but do not allocate

real*8, allocatable :: W1(:,:,:,:), W2(:,:,:,:), W3(:,:,:,:)
...

!$OMP  PARALLEL DEFAULT(SHARED)
!$OMP+ PRIVATE(I,Iw,IIw,IID,JEL,KEL,JGLL,KGLL,NBLOCK,LBLK,DYH,DZH,RYZ,
!$OMP+         Wv,DPDX,DPDY,DPDZ,PMN,PMM,W1,W2,W3 )

! *** add allocation here
	allocate(W1(M2,M3,N2,N3), STAT = I)
	if(I .ne. 0) call YourFatalMemoryAllocationRoutine()

	allocate(W2(M2,M3,N2,N3), STAT = I)
	if(I .ne. 0) call YourFatalMemoryAllocationRoutine()

	allocate(W3(M2,M3,N2,N3), STAT = I)
	if(I .ne. 0) call YourFatalMemoryAllocationRoutine()

c     KEL = omp_get_max_threads()
c     write(6,*)' -- MAX THREADS =', KEL
c - Go through each streamwise wavenumber.
!$OMP DO
      Do Iw = 1, N1
	...&lt;BR /&gt;!$OMP END DO&lt;BR /&gt;&lt;BR /&gt;   deallocate(W3)&lt;BR /&gt;   deallocate(W2)&lt;BR /&gt;   deallocate(W1)
!$OMP END PARALLEL&lt;BR /&gt;[/cpp]&lt;/PRE&gt;
&lt;BR /&gt;Jim Dempsey&lt;BR /&gt;</description>
      <pubDate>Fri, 27 Feb 2009 13:45:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753266#M9046</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2009-02-27T13:45:51Z</dc:date>
    </item>
    <item>
      <title>Re: SEGFAULT with OpenMP. Stack Problem?</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753267#M9047</link>
      <description>&lt;DIV style="margin: 0px; height: auto;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Jim,&lt;BR /&gt;&lt;BR /&gt;Your latest suggestion removes the segfault too. Also, the numerical values seem&lt;BR /&gt;in agreement with a standard case I have for checking the results.&lt;BR /&gt;&lt;BR /&gt;I will have to run a few more tests and then I will flag the thread as having&lt;BR /&gt;resolved the problem.&lt;BR /&gt;&lt;BR /&gt;This will be the second of the four computationally intensive subroutines&lt;BR /&gt;using multithreading in this code. The remaining two are more complicated,&lt;BR /&gt;and it is quite likely that something interesting will show up. And then I have 4 more&lt;BR /&gt;simulation codes to multithread...&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Many thanks to everyone for their contribution.&lt;BR /&gt;--&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 03 Mar 2009 01:04:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/SEGFAULT-with-OpenMP-Stack-Problem/m-p/753267#M9047</guid>
      <dc:creator>arktos</dc:creator>
      <dc:date>2009-03-03T01:04:43Z</dc:date>
    </item>
  </channel>
</rss>

