<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic I am doubtful that it is that in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952431#M92545</link>
    <description>&lt;P&gt;I am doubtful that it is that optimizations assume multiple threads. There can be many possible explanations for this behavior. Can you provide us with a test program that demonstrates the problem?&lt;/P&gt;</description>
    <pubDate>Wed, 08 Jan 2014 16:41:14 GMT</pubDate>
    <dc:creator>Steven_L_Intel1</dc:creator>
    <dc:date>2014-01-08T16:41:14Z</dc:date>
    <item>
      <title>Optimizations dependent on availability of OpenMP threads</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952430#M92544</link>
      <description>&lt;P&gt;I have a working environment where $OMP_NUM_THREADS=1 is enforced (login node), but the system has many more available threads. It seems that when -O2 and -O3 optimizations are included in my compile command, the optimizations hard code instructions assuming OpenMP thread availability based on the host system, or at least the optimizations prevent the graceful handling of $OMP_NUM_THREADS. On execution, I get a segfault on entering __kmp_enter_single().&lt;/P&gt;

&lt;P&gt;[bash]home$ ifort diagonalize.f90 -g -debug -assume buffered_io -ipo -fpic -openmp &lt;STRONG&gt;-O2&lt;/STRONG&gt; -I$MKLROOT/include/intel64/lp64 -I$MKLROOT/include $MKLROOT/lib/intel64/libmkl_blas95_lp64.a $MKLROOT/lib/intel64/libmkl_lapack95_lp64.a -Wl,--start-group $MKLROOT/lib/intel64/libmkl_intel_lp64.a $MKLROOT/lib/intel64/libmkl_core.a $MKLROOT/lib/intel64/libmkl_intel_thread.a -Wl,--end-group -lpthread -lm -o diagonalize&lt;BR /&gt;
	ipo: remark #11000: performing multi-file optimizations&lt;BR /&gt;
	ipo: remark #11006: generating object file /tmp/ipo_ifortuRFRzr.o&lt;BR /&gt;
	home$ diagonalize&lt;BR /&gt;
	forrtl: severe (174): SIGSEGV, segmentation fault occurred&lt;BR /&gt;
	Image&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; PC&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Routine&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Line&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Source&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	libiomp5.so&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 00002B52D2AFB71A&amp;nbsp; Unknown&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Unknown&amp;nbsp; Unknown&lt;BR /&gt;
	libiomp5.so&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 00002B52D2ADEE16&amp;nbsp; Unknown&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Unknown&amp;nbsp; Unknown&lt;BR /&gt;
	diagonalize_16&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0000000000411514&amp;nbsp; Unknown&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Unknown&amp;nbsp; Unknown&lt;BR /&gt;
	libiomp5.so&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 00002B52D2B20FE3&amp;nbsp; Unknown&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Unknown&amp;nbsp; Unknown&lt;BR /&gt;
	home$ idbc diagonalize&lt;BR /&gt;
	Intel(R) Debugger for applications running on Intel(R) 64, Version 13.0, Build [79.936.23]&lt;BR /&gt;
	------------------&lt;BR /&gt;
	object file name: diagonalize&lt;BR /&gt;
	Reading symbols from /home/me/diagonalize...done.&lt;BR /&gt;
	(idb) run&lt;BR /&gt;
	Starting program: /home/me/diagonalize&lt;BR /&gt;
	[New Thread 26293 (LWP 26293)]&lt;BR /&gt;
	Program received signal SIGSEGV&lt;BR /&gt;
	__kmp_enter_single () in /opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so&lt;BR /&gt;
	(idb) where&lt;BR /&gt;
	#0&amp;nbsp; 0x00002b92e998671a in __kmp_enter_single () in /opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so&lt;BR /&gt;
	#1&amp;nbsp; 0x00002b92e9969e16 in __kmpc_single () in /opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so&lt;BR /&gt;
	#2&amp;nbsp; 0x0000000000411514 in diagonalize () at /home/me/diagonalize.f90:403&lt;BR /&gt;
	#3&amp;nbsp; 0x000000000040f6ed in diagonalize () at /home/me/diagonalize.f90:333&lt;BR /&gt;
	#4&amp;nbsp; 0x000000000040dfec in main () in /home/me/diagonalize&lt;BR /&gt;
	#5&amp;nbsp; 0x00000038cc81ecdd in __libc_start_main () in /lib64/libc-2.12.so&lt;BR /&gt;
	(idb) set $cmdset='dbx'&lt;BR /&gt;
	(idb) file diagonalize.f90&lt;BR /&gt;
	(idb) list 403,406&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; 403&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; !$omp workshare&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; 404&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ! zero result array&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; 405&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; resArray(ptr(2):ptr(2)+dim_matrix-1) = 0_dbl_real&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; 406&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; !$omp end workshare&lt;BR /&gt;
	[/bash]&lt;/P&gt;

&lt;P&gt;If I only use -O1 level optimizations, the program runs fine. However, this is an HPC environment and for real data sets I will need at least -O2 functioning.&lt;/P&gt;

&lt;P&gt;Also, the segfault is usually present and rarely absent from this executable. I'm fairly certain it's due to varying load on the host system and thus, thread availability.&lt;/P&gt;

&lt;P&gt;As a side note, the source works fine with gfortran and the GOMP library with -O2 and -O3 (main point is that there's nothing odd with this code).&lt;/P&gt;

&lt;P&gt;Any suggestions?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Jonathan&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jan 2014 05:11:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952430#M92544</guid>
      <dc:creator>Jonathan_B_</dc:creator>
      <dc:date>2014-01-08T05:11:27Z</dc:date>
    </item>
    <item>
      <title>I am doubtful that it is that</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952431#M92545</link>
      <description>&lt;P&gt;I am doubtful that it is that optimizations assume multiple threads. There can be many possible explanations for this behavior. Can you provide us with a test program that demonstrates the problem?&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jan 2014 16:41:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952431#M92545</guid>
      <dc:creator>Steven_L_Intel1</dc:creator>
      <dc:date>2014-01-08T16:41:14Z</dc:date>
    </item>
    <item>
      <title>From your error traceback it</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952432#M92546</link>
      <description>&lt;P&gt;From your error traceback it seems as if the call to __kmp_enter_single caused a stack overflow.&lt;/P&gt;

&lt;P&gt;Possible causes are:&lt;/P&gt;

&lt;P&gt;a) the stack is too small&lt;BR /&gt;
	b) "resArray" is a pointer/reference with a stride other than 1, and the assignment resArray(...) = ... is attempting to create a stack temporary so that __intel_fast_memset can be called.&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jan 2014 19:00:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952432#M92546</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2014-01-08T19:00:10Z</dc:date>
    </item>
    <item>
      <title>Thank you both for weighing</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952433#M92547</link>
      <description>&lt;P&gt;Thank you both for weighing in - your comments prompted me to step through the instructions to track down where the program fails.&lt;/P&gt;

&lt;P&gt;[bash]stopped at [__kmp_enter_single0x00002ad8dc13b709]&amp;nbsp;&amp;nbsp; &amp;nbsp;movq&amp;nbsp; 0x284cf8(%rip), %rax&lt;BR /&gt;
	(idb) stepi&lt;BR /&gt;
	stopped at [__kmp_enter_single0x00002ad8dc13b710]&amp;nbsp;&amp;nbsp; &amp;nbsp;movsxd %r12d, %rbx&lt;BR /&gt;
	(idb) stepi&lt;BR /&gt;
	stopped at [__kmp_enter_single0x00002ad8dc13b713]&amp;nbsp;&amp;nbsp; &amp;nbsp;movq&amp;nbsp; (%rax), %rdx&lt;BR /&gt;
	(idb) stepi&lt;BR /&gt;
	stopped at [__kmp_enter_single0x00002ad8dc13b716]&amp;nbsp;&amp;nbsp; &amp;nbsp;movq&amp;nbsp; (%rdx,%rbx,8), %rdx&lt;BR /&gt;
	(idb) stepi&lt;BR /&gt;
	stopped at [__kmp_enter_single0x00002ad8dc13b71a]&amp;nbsp;&amp;nbsp; &amp;nbsp;movq&amp;nbsp; 0x40(%rdx), %rax&lt;BR /&gt;
	(idb)stepi&lt;BR /&gt;
	Thread received signal SEGV&lt;BR /&gt;
	stopped at [__kmp_enter_single0x00002ad8dc13b71a]&amp;nbsp;&amp;nbsp; &amp;nbsp;movq&amp;nbsp; 0x40(%rdx), %rax[/bash]&lt;/P&gt;

&lt;P&gt;So it's failing to move some data from memory to a register.&lt;/P&gt;

&lt;P&gt;Unfortunately, I haven't had success in creating a simple test program that exhibits the behavior - I suspect I would need the heuristics to match the main program. I can provide the complete source and single dependency (with makefile), and a small input data file, but should I upload to the forum or submit by PM? All told, this amounts to 1 MB (compressed).&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Jonathan&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2014 08:15:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952433#M92547</guid>
      <dc:creator>Jonathan_B_</dc:creator>
      <dc:date>2014-01-09T08:15:06Z</dc:date>
    </item>
    <item>
      <title>From the code:</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952434#M92548</link>
      <description>&lt;P&gt;From the code:&lt;/P&gt;

&lt;P&gt;movq&amp;nbsp; 0x284cf8(%rip), %rax // get address of global pointer to pointer&lt;BR /&gt;
	movsxd %r12d, %rbx // sign-extend the 32-bit (dword) index to a qword&lt;BR /&gt;
	movq&amp;nbsp; (%rax), %rdx // get address of pointer (indirect first reference)&lt;BR /&gt;
	movq&amp;nbsp; (%rdx,%rbx,8), %rdx // get array[index] (containing pointer to object)&lt;BR /&gt;
	movq&amp;nbsp; 0x40(%rdx), %rax // reference member variable at 0x40 offset from object (** error due to invalid address in rdx)&lt;/P&gt;

&lt;P&gt;From the looks of what the code is doing I would say it is referencing internal tables of OpenMP in order to perform the enter of a Single section. My best guess is that some code earlier than the !$omp workshare corrupted the internal tables of OpenMP.&lt;/P&gt;

&lt;P&gt;Look for an earlier piece of code that:&lt;/P&gt;

&lt;P&gt;a) indexes an array out of bounds on the left side of =&lt;BR /&gt;
	b) uses an uninitialized pointer on the left side of =&lt;BR /&gt;
	c) uses a value as a reference on the left side of = (perhaps the return value from a call to a C/C++ function)&lt;BR /&gt;
	d) assumes a returned pointer is valid when the call took an error return (same as c)&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2014 19:20:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952434#M92548</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2014-01-09T19:20:00Z</dc:date>
    </item>
    <item>
      <title>On a hunch, try:</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952435#M92549</link>
      <description>&lt;P&gt;On a hunch, try:&lt;/P&gt;

&lt;P&gt;if (OMP_IN_PARALLEL()) then&lt;BR /&gt;
	!$omp workshare&lt;BR /&gt;
	resArray(ptr(2):ptr(2)+dim_matrix-1) = 0_dbl_real&lt;BR /&gt;
	!$omp end workshare&lt;BR /&gt;
	else&lt;BR /&gt;
	resArray(ptr(2):ptr(2)+dim_matrix-1) = 0_dbl_real&lt;BR /&gt;
	endif&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2014 19:26:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952435#M92549</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2014-01-09T19:26:50Z</dc:date>
    </item>
    <item>
      <title>I would suggest using Intel</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952436#M92550</link>
      <description>&lt;P&gt;I would suggest using Intel Premier Support to report the problem and attach sources, if Jim's suggestions don't help. But I suspect he is on the right track.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2014 20:26:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952436#M92550</guid>
      <dc:creator>Steven_L_Intel1</dc:creator>
      <dc:date>2014-01-09T20:26:53Z</dc:date>
    </item>
    <item>
      <title>I tried the hunch you</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952437#M92551</link>
      <description>&lt;P&gt;I tried the hunch you suggested Jim, but it seems that the parallel environment is established prior to the workshare section. Since the initialization is being converted to _intel_fast_memset, I tried changing the workshare directive to a single directive. No luck - segmentation fault in __kmp_enter_single(). I may try substituting _intel_fast_memset, but since it will have to be in an omp single environment I'm not convinced I would have different results.&lt;/P&gt;

&lt;P&gt;I added -check bounds, and when that reported no error, I confirmed that all accesses to arrays were within bounds using idb.&lt;/P&gt;

&lt;P&gt;I started to export all instructions from the initialization of the parallel environment onward, but stopped when I realized it exceeded 3000 lines. I can pick up assembly quickly enough, but that's too much material to work with right off the bat.&lt;/P&gt;

&lt;P&gt;I'm going to do some more analysis and when I have more data I'll start a new thread. Thank you both for your input.&lt;/P&gt;

&lt;P&gt;Jonathan&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jan 2014 21:32:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952437#M92551</guid>
      <dc:creator>Jonathan_B_</dc:creator>
      <dc:date>2014-01-13T21:32:56Z</dc:date>
    </item>
    <item>
      <title>When an application is built</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952438#M92552</link>
      <description>&lt;P&gt;When an application is built with -openmp, the OpenMP run-time library may create an additional&amp;nbsp;monitor thread. So even if OMP_NUM_THREADS is set to 1, there may be more than one thread associated with the process. Could that be an issue on your login node?&lt;/P&gt;

&lt;P&gt;Earlier, Jim mentioned stack size limits. Whilst that doesn't sound like the issue, -openmp does cause local arrays to be stored on the stack instead of on the heap. You could try -auto, which does the same thing without OpenMP, to see if this triggers an error, or you could increase the stack limit, e.g. with ulimit -s unlimited (bash &amp;amp; similar shells). There&amp;nbsp;are some suggestions for&amp;nbsp;debugging Fortran OpenMP applications at &lt;A href="http://software.intel.com/en-us/articles/threading-fortran-applications-for-parallel-performance-on-multi-core-systems/"&gt;http://software.intel.com/en-us/articles/threading-fortran-applications-for-parallel-performance-on-multi-core-systems/&lt;/A&gt;&amp;nbsp;.&lt;/P&gt;

&lt;P&gt;Finally, note that in the current Intel compiler, WORKSHARE is implemented with a single thread, i.e., there is no real sharing of work between threads, even when OMP_NUM_THREADS &amp;gt; 1. That's why you see a call to __kmp_enter_single. So for the current Intel compiler, there's little point in coding an OpenMP PARALLEL WORKSHARE construct, unless you want that for other platforms or compilers. Removing it might allow you to work around the immediate problem, for no loss of performance, especially if the problem really is related just to the WORKSHARE implementation.&lt;/P&gt;
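
&lt;P&gt;As a minimal sketch of that workaround (not taken from the original program; the program name, the dbl_real kind definition and the array size are assumptions for illustration), the array assignment can be written as an explicit loop under an OpenMP DO construct, whose iterations are divided among the threads of the team and which ends with the same implied barrier as WORKSHARE:&lt;/P&gt;

&lt;P&gt;[fortran]program zero_slice_sketch&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;use omp_lib&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;implicit none&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;integer, parameter :: dbl_real = kind(1.0d0)&amp;nbsp;&amp;nbsp;! assumed kind definition&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;integer, parameter :: dim_matrix = 1000&amp;nbsp;&amp;nbsp;! hypothetical problem size&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;real(kind=dbl_real), allocatable :: resArray(:)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;integer :: ptr(2), i&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;allocate(resArray(2*dim_matrix))&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;resArray = 1.0_dbl_real&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;ptr(2) = dim_matrix + 1&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;!$omp parallel shared(resArray, ptr) private(i)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;! zero result array: each thread clears its own share of the slice&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;!$omp do&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;do i = ptr(2), ptr(2) + dim_matrix - 1&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;resArray(i) = 0.0_dbl_real&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;end do&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;!$omp end do&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;!$omp end parallel&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;print *, 'sum after zeroing the second half: ', sum(resArray)&lt;BR /&gt;
	end program zero_slice_sketch[/fortran]&lt;/P&gt;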

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;A multi-threaded version of WORKSHARE may be implemented in a future compiler.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jan 2014 21:40:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952438#M92552</guid>
      <dc:creator>Martyn_C_Intel</dc:creator>
      <dc:date>2014-01-13T21:40:09Z</dc:date>
    </item>
    <item>
      <title>Hi Martyn, thanks for your</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952439#M92553</link>
      <description>&lt;P&gt;Hi Martyn, thanks for your input. I do not know the extent of measures used to restrict threads available to users on the login nodes of this system. Since I've successfully run other OpenMP parallel test programs previously, though, I don't think the monitor thread is to blame. I just noticed issues in a previous forum thread where&lt;BR /&gt;
	&amp;lt;code&amp;gt;do i=1,omp_get_num_threads()&lt;BR /&gt;
	...&lt;BR /&gt;
	end do&amp;lt;/code&amp;gt;&lt;BR /&gt;
	did not check the loop bounds in a parallel construct, but&lt;BR /&gt;
	&amp;lt;code&amp;gt;do i=omp_get_thread_num()+1, omp_get_num_threads(), omp_get_num_threads()&lt;BR /&gt;
	...&lt;BR /&gt;
	end do&amp;lt;/code&amp;gt;&lt;BR /&gt;
	functioned fine (also in this program, solution suggested by Jim Dempsey). This behavior led me to believe that instructions for the number of available threads on the system were written into the assembly, but perhaps conditionally executed (and with optimizations enabled, the conditionals were not functioning properly). Regardless, this was the original impetus for my concerns regarding thread number.&lt;/P&gt;

&lt;P&gt;In my program, all arrays are dynamically allocated (all large), so it's not possible for them to be on the stack to the best of my knowledge. Using -auto rather than -openmp produces a fully functioning binary, so the stack size limit is not the issue. I'm using the source code in two environments - the compute server is Xeon based and I'm using the Intel Fortran compiler there; my workstation has an AMD chip, so I'm using gfortran there. I tried changing the workshare directive to a single directive and had the same results (segmentation fault). Since the array is shared, the initialization instruction(s) have to be in either a workshare or single construct. It's not feasible to exit and reenter the parallel environment because this is running in a loop until an exit flag is triggered (and there are some large thread private allocated arrays that would have to be reallocated each time). I will look at the link you provided in depth.&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Jonathan&lt;/P&gt;</description>
      <pubDate>Tue, 14 Jan 2014 16:28:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952439#M92553</guid>
      <dc:creator>Jonathan_B_</dc:creator>
      <dc:date>2014-01-14T16:28:45Z</dc:date>
    </item>
    <item>
      <title>My only other thought is that</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952440#M92554</link>
      <description>&lt;P&gt;My only other thought is that you are linking to the threaded version of MKL. Perhaps, in this context of login nodes, you should either link to the serial version, or else set an MKL environment variable to limit MKL to a single thread, though I don't see why that would impact this particular workshare or single construct. Otherwise, if you don't spot any other issues, we'd probably need to see an example, as Steve suggested.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Of the suggestions in that link, trying Intel Inspector XE might be the most hopeful.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Jan 2014 22:07:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952440#M92554</guid>
      <dc:creator>Martyn_C_Intel</dc:creator>
      <dc:date>2014-01-14T22:07:35Z</dc:date>
    </item>
    <item>
      <title>Jonathan,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952441#M92555</link>
      <description>&lt;P&gt;Jonathan,&lt;/P&gt;

&lt;P&gt;For diagnostic purposes create a subroutine along these lines:&lt;/P&gt;

&lt;P&gt;recursive subroutine BoinkTest&lt;BR /&gt;
	use omp_lib&lt;BR /&gt;
	! aFooArray stands for any shared array that is visible here&lt;BR /&gt;
	if(.NOT. omp_in_parallel()) then&lt;BR /&gt;
	!$omp parallel&lt;BR /&gt;
	call BoinkTest&lt;BR /&gt;
	!$omp end parallel&lt;BR /&gt;
	else&lt;BR /&gt;
	!$omp workshare&lt;BR /&gt;
	aFooArray = 0.0&lt;BR /&gt;
	!$omp end workshare&lt;BR /&gt;
	endif&lt;BR /&gt;
	endsubroutine&lt;/P&gt;

&lt;P&gt;Then insert calls to this subroutine throughout your program (reverse binary search if run to crash is fast).&lt;/P&gt;

&lt;P&gt;The object of the procedure is to narrow the search of the section of code causing the problem (presumed corruption of OpenMP data).&lt;/P&gt;
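
&lt;P&gt;One caveat when running with OMP_NUM_THREADS=1: omp_in_parallel() reports true only inside an active parallel region, that is, one executed by a team of more than one thread. With a single thread the region opened by such a test routine is inactive, so the test can stay false and the routine may keep calling itself until the stack runs out. A minimal sketch of a variant that tests omp_get_level() instead (OpenMP 3.0 and later; the routine name LevelTest and its arguments are made up for illustration):&lt;/P&gt;

&lt;P&gt;[fortran]recursive subroutine LevelTest(a, n)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;use omp_lib&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;implicit none&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;integer, intent(in) :: n&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;real, intent(inout) :: a(n)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;! omp_get_level() counts enclosing parallel regions, active or not&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;if (omp_get_level() == 0) then&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;!$omp parallel shared(a, n)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;call LevelTest(a, n)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;!$omp end parallel&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;else&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;! level is at least 1 here, even when the team has one thread&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;!$omp workshare&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;a = 0.0&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;!$omp end workshare&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;end if&lt;BR /&gt;
	end subroutine LevelTest[/fortran]&lt;/P&gt;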

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Tue, 14 Jan 2014 23:43:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952441#M92555</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2014-01-14T23:43:12Z</dc:date>
    </item>
    <item>
      <title>I cannot be sure what caused</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952442#M92556</link>
      <description>&lt;P&gt;I cannot be sure what caused the change in behavior, but I made a few modifications and the resulting binary seems to be free of the segmentation fault. Following Martyn's recommendation, I added '-diag-enable sc-parallel3' to the compile line, and saw a slew of errors and warnings - some valid and some not. Since I have a large parallel construct with multiple single constructs within and a main parallel do construct, it seems that the error checking does not follow the state of variables through the entire scope of a parallel construct, otherwise the warnings about deallocating a private array before leaving the parallel construct would not be printed. Still, the following interested me:&lt;/P&gt;

&lt;P&gt;&amp;lt;plain&amp;gt;warning #12278: there is a case where this worksharing construct is not enclosed dynamically within a parallel region in order to execute in parallel&amp;lt;/plain&amp;gt;&lt;/P&gt;

&lt;P&gt;Considering that, I thought I would try changing all error catching code that terminates immediately to use an exit flag and jump to the end of the parallel construct, halting after the dynamic parallel region. That seems to have solved the problem. Am I mistaken that issuing 'stop' is acceptable in a parallel region?&lt;/P&gt;
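
&lt;P&gt;A minimal sketch of that exit-flag pattern, assuming a made-up error condition and names that are not from the actual program: the error is recorded in a shared flag inside the parallel region, and the program halts only after the region has ended.&lt;/P&gt;

&lt;P&gt;[fortran]program exit_flag_sketch&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;use omp_lib&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;implicit none&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;logical :: exit_flag&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;integer :: i&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;exit_flag = .false.&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;!$omp parallel shared(exit_flag) private(i)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;!$omp do&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;do i = 1, 100&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (i == 42) then&amp;nbsp;&amp;nbsp;! hypothetical error condition&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;! record the error instead of issuing stop here&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;!$omp critical&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;exit_flag = .true.&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;!$omp end critical&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;end if&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;end do&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;!$omp end do&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;!$omp end parallel&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;! halt only after every thread has left the parallel region&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;if (exit_flag) stop 'error detected inside the parallel region'&lt;BR /&gt;
	end program exit_flag_sketch[/fortran]&lt;/P&gt;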

&lt;P&gt;I'm going to try Jim's suggestion on the older version to check if the instructions associated with the error checking code cause the problem. If this is an API violation I was unaware of, then I'll refrain from doing it in the future. Otherwise, this may help me build a minimal code to demonstrate the problem.&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Jonathan&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jan 2014 08:00:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952442#M92556</guid>
      <dc:creator>Jonathan_B_</dc:creator>
      <dc:date>2014-01-15T08:00:38Z</dc:date>
    </item>
    <item>
      <title>Jonathan,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952443#M92557</link>
      <description>&lt;P&gt;Jonathan,&lt;/P&gt;

&lt;P&gt;If the warning #12278 was correct in that the workshare statement could be entered outside a&amp;nbsp;parallel region then my suggestion may work, however, it also might not suppress the warning message.&lt;/P&gt;

&lt;P&gt;Have you considered changing workshare to sections?&lt;/P&gt;

&lt;P&gt;(this will not fix the execution from outside a parallel region, but it will permit some degree of parallelization)&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jan 2014 18:23:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952443#M92557</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2014-01-15T18:23:14Z</dc:date>
    </item>
    <item>
      <title>Jim, the warning message was</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952444#M92558</link>
      <description>&lt;P&gt;Jim, the warning message was only thrown with '-diag-enable sc-parallel3' present, and since that is greedy about throwing errors and warnings, I removed it from the compile line. It was very informative at pointing out this issue, but I've double checked the OpenMP specs and I'm pretty sure that my usage was not an API violation (section 1.2.2, line 20 "STOP statements are allowed in a structured block"). Still, this pointed out at least one issue - the implementation of the workshare directive.&lt;/P&gt;

&lt;P&gt;In this round of testing, I included the following subroutine:&lt;BR /&gt;
	[fortran]recursive Subroutine WorkshareTest(testvector,extent,begin,end)&lt;BR /&gt;
	use Hamiltonian&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ! dbl_real kind number defined here&lt;BR /&gt;
	use omp_lib&lt;BR /&gt;
	integer, intent(in) :: extent,begin,end&lt;BR /&gt;
	real(kind=dbl_real), dimension(extent), intent(inout) :: testvector&lt;BR /&gt;
	if (.not. omp_in_parallel()) then&lt;BR /&gt;
	&amp;nbsp; !$omp parallel shared(testvector,extent,begin,end)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; call WorkshareTest(testvector,extent,begin,end)&lt;BR /&gt;
	&amp;nbsp; !$omp end parallel&lt;BR /&gt;
	else&lt;BR /&gt;
	&amp;nbsp; !$omp workshare&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; testvector(begin:end) = 0.0&lt;BR /&gt;
	&amp;nbsp; !$omp end workshare&lt;BR /&gt;
	end if&lt;BR /&gt;
	end Subroutine WorkshareTest[/fortran]&lt;/P&gt;

&lt;P&gt;The call to this subroutine is:&lt;BR /&gt;
	[fortran]call WorkshareTest(workd,size(workd),ipntr(2),ipntr(2)+dim_hamiltonian-1)[/fortran]&lt;BR /&gt;
	where:&lt;BR /&gt;
	&amp;nbsp;o&amp;nbsp; workd is the array in original workshare (space for vectors v and y used in matrix-vector product y = A*v)&lt;BR /&gt;
	&amp;nbsp;o&amp;nbsp; ipntr is an array with various runtime values; element 2 holds the index of the first element in workd that is vector y&lt;BR /&gt;
	&amp;nbsp;o&amp;nbsp; dim_hamiltonian is the dimension of my matrix, also the dimension of arrays v and y&lt;/P&gt;

&lt;P&gt;I inserted calls in between all lines after workd was allocated all the way through the parallel section. There were no errors reported by ifort. I inserted a break in idb at the first call to WorkshareTest. The following steps through the code from that point on, and includes a backtrace when the segmentation fault happens.&lt;/P&gt;

&lt;P&gt;[plain][1] stopped at [diagonalize:311, 0x000000000040ed6e] call WorkshareTest(workd,size(workd),ipntr(2),ipntr(2)+dim_hamiltonian-1)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [worksharetest:833, 0x000000000040ed92] recursive Subroutine WorkshareTest(testvector,extent,begin,end)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	[1] stopped at [diagonalize:311, 0x000000000040ed9b] call WorkshareTest(workd,size(workd),ipntr(2),ipntr(2)+dim_hamiltonian-1)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [worksharetest:833, 0x000000000040eda6] recursive Subroutine WorkshareTest(testvector,extent,begin,end)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [diagonalize:311, 0x000000000040edad] call WorkshareTest(workd,size(workd),ipntr(2),ipntr(2)+dim_hamiltonian-1)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [worksharetest:833, 0x000000000040edb4] recursive Subroutine WorkshareTest(testvector,extent,begin,end)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [diagonalize:309, 0x000000000040edc5] ipntr(2)=dim_hamiltonian+1&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	[1] stopped at [diagonalize:311, 0x000000000040edcf] call WorkshareTest(workd,size(workd),ipntr(2),ipntr(2)+dim_hamiltonian-1)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [worksharetest:833, 0x000000000040edd9] recursive Subroutine WorkshareTest(testvector,extent,begin,end)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [worksharetest:838, 0x000000000040ee26] if (.not. omp_in_parallel()) then&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [worksharetest:838, 0x000000000040ee46] if (.not. omp_in_parallel()) then&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	stopped at [worksharetest:839, 0x000000000040ee5e]&amp;nbsp;&amp;nbsp; !$omp parallel shared(testvector,extent,begin,end)&lt;BR /&gt;
	(idb) step&lt;BR /&gt;
	Thread received signal SEGV&lt;BR /&gt;
	The "step" command was not completed.&lt;BR /&gt;
	stopped at [__intel_new_memset0x00002b74a881ffb4]&lt;BR /&gt;
	(idb) where&lt;BR /&gt;
	&amp;gt;0&amp;nbsp; __intel_new_memset(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a881ffb4]&lt;BR /&gt;
	&amp;nbsp;1&amp;nbsp; _intel_fast_memset.J(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a8816cb6]&lt;BR /&gt;
	&amp;nbsp;2&amp;nbsp; ___kmp_allocate(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87a8fe6]&lt;BR /&gt;
	&amp;nbsp;3&amp;nbsp; __kmpc_serialized_parallel(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbfc0]&lt;BR /&gt;
	&amp;nbsp;4&amp;nbsp; __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dba2e]&lt;BR /&gt;
	&amp;nbsp;5&amp;nbsp; worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;6&amp;nbsp; worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;7&amp;nbsp; worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;8&amp;nbsp; worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;9&amp;nbsp; __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3]&lt;BR /&gt;
	&amp;nbsp;10 __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dbadb]&lt;BR /&gt;
	&amp;nbsp;11 __kmpc_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbbf8]&lt;BR /&gt;
	&amp;nbsp;12 worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;13 worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;14 __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3]&lt;BR /&gt;
	&amp;nbsp;15 __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dbadb]&lt;BR /&gt;
	&amp;nbsp;16 __kmpc_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbbf8]&lt;BR /&gt;
	&amp;nbsp;17 worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;18 worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;19 __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3]&lt;BR /&gt;
	&amp;nbsp;20 __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dbadb]&lt;BR /&gt;
	&amp;nbsp;21 __kmpc_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbbf8]&lt;BR /&gt;
	&amp;nbsp;22 worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;23 worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;24 __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3]&lt;BR /&gt;
	&amp;nbsp;25 __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dbadb]&lt;BR /&gt;
	&amp;nbsp;26 __kmpc_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbbf8]&lt;BR /&gt;
	&amp;nbsp;27 worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;28 worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;29 __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3]&lt;BR /&gt;
	&amp;nbsp;30 __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dbadb]&lt;BR /&gt;
	&amp;nbsp;31 __kmpc_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbbf8]&lt;BR /&gt;
	&amp;nbsp;32 worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;33 worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;34 __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3]&lt;BR /&gt;
	&amp;nbsp;35 __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dbadb]&lt;BR /&gt;
	&amp;nbsp;36 __kmpc_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbbf8]&lt;BR /&gt;
	&amp;nbsp;37 worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;38 worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;39 __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3]&lt;BR /&gt;
	&amp;nbsp;40 __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dbadb]&lt;BR /&gt;
	&amp;nbsp;41 __kmpc_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbbf8]&lt;BR /&gt;
	&amp;nbsp;42 worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;43 worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;44 __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3]&lt;BR /&gt;
	&amp;nbsp;45 __kmp_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87dbadb]&lt;BR /&gt;
	&amp;nbsp;46 __kmpc_fork_call(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87bbbf8]&lt;BR /&gt;
	&amp;nbsp;47 worksharetest_(...) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":839, 0x00000000004155e1]&lt;BR /&gt;
	&amp;nbsp;48 worksharetest($01=, $02=, $03=, $04=) ["/home1/02146/jblair42/src/diagonalize_parallel_csr_sym.f90":840, 0x00000000004156bb]&lt;BR /&gt;
	&amp;nbsp;49 __kmp_invoke_microtask(...) ["/opt/apps/intel/13/composer_xe_2013.2.146/compiler/lib/intel64/libiomp5.so": 0x00002b74a87fefe3][/plain]&lt;/P&gt;

&lt;P&gt;I have not yet determined why my modification, which uses an exit flag instead of issuing 'stop' in the error-checking code, worked. However, I want to double-check that the subroutine is standards-compliant before using it in a minimal test case.&lt;/P&gt;

&lt;P&gt;Thank you all for helping with this. Whether this turns out to be one or two bugs (or one bug and something else), I'm finally getting some resolution on a four-month-old issue.&lt;/P&gt;

&lt;P&gt;Best,&lt;BR /&gt;
	Jonathan&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jan 2014 22:49:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952444#M92558</guid>
      <dc:creator>Jonathan_B_</dc:creator>
      <dc:date>2014-01-15T22:49:06Z</dc:date>
    </item>
    <item>
      <title>Yes, STOP is allowed within a</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952445#M92559</link>
      <description>&lt;P&gt;Yes, STOP is allowed within a structured block in a parallel region. I also tried a little test case with this, and it seemed to work fine.&lt;/P&gt;

&lt;P&gt;I hadn't realized you were talking about a recursive subroutine, for which the initial instance is (presumably) not in a parallel region, but subsequent ones are. That adds a whole extra layer of complication. Perhaps you should be building with the -recursive switch, though I suspect that compiling and linking with -openmp may be sufficient. Do you see segfaults if the subroutine is not recursive? Your traceback seems to show more recursions than I would expect from your code. I'm not sure I've thought this through fully, but isn't the subroutine WorkshareTest going to be called recursively by each thread, each of which then tries to zero the array in a workshare construct? With a "single" implementation, does that mean that only one of the recursive calls will do any zeroing? I'm not clear how this will work.&lt;/P&gt;

&lt;P&gt;I would start by printing out the result of omp_in_parallel(), and if true, also the result of omp_get_thread_num and omp_get_num_threads, for each call, both outside and inside the workshare construct.&lt;/P&gt;
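
&lt;P&gt;As a rough sketch of the kind of printout I mean (the program name, the helper report_omp_state and its label argument are made up for illustration and are not from your code), something like this could be called on entry to the subroutine and again inside the parallel region. It needs to be compiled with -openmp so that omp_lib is available:&lt;/P&gt;

&lt;P&gt;[plain]program omp_state_demo&lt;BR /&gt;
  use omp_lib&lt;BR /&gt;
  implicit none&lt;BR /&gt;
  ! Report once from serial code, then once per thread in a parallel region.&lt;BR /&gt;
  call report_omp_state('serial part')&lt;BR /&gt;
  !$omp parallel&lt;BR /&gt;
  call report_omp_state('parallel region')&lt;BR /&gt;
  !$omp end parallel&lt;BR /&gt;
contains&lt;BR /&gt;
  ! Hypothetical helper: prints the OpenMP state seen by the calling thread.&lt;BR /&gt;
  subroutine report_omp_state(label)&lt;BR /&gt;
    character(len=*), intent(in) :: label&lt;BR /&gt;
    ! The named critical section keeps output from different threads apart.&lt;BR /&gt;
    !$omp critical (report_io)&lt;BR /&gt;
    if (omp_in_parallel()) then&lt;BR /&gt;
      print '(a,a,i0)', label, ': in parallel, thread ', omp_get_thread_num()&lt;BR /&gt;
      print '(a,a,i0)', label, ': team size ', omp_get_num_threads()&lt;BR /&gt;
    else&lt;BR /&gt;
      print '(a,a)', label, ': not in parallel'&lt;BR /&gt;
    end if&lt;BR /&gt;
    !$omp end critical (report_io)&lt;BR /&gt;
  end subroutine report_omp_state&lt;BR /&gt;
end program omp_state_demo[/plain]&lt;/P&gt;

&lt;P&gt;Bear in mind that omp_in_parallel() returns true only inside an active parallel region, i.e. one executed by a team of more than one thread, so with OMP_NUM_THREADS=1 I would expect the call inside the parallel region to report "not in parallel" as well.&lt;/P&gt;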

</description>
      <pubDate>Thu, 16 Jan 2014 01:43:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952445#M92559</guid>
      <dc:creator>Martyn_C_Intel</dc:creator>
      <dc:date>2014-01-16T01:43:59Z</dc:date>
    </item>
    <item>
      <title>Martyn,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952446#M92560</link>
      <description>&lt;P&gt;Martyn,&lt;/P&gt;

&lt;P&gt;-openmp implicitly sets -recursive.&lt;/P&gt;

&lt;P&gt;Also, if the outermost level of WorkshareTest is called from within a parallel region, presumably the programmer (Jonathan) would provide different, non-overlapping begin and end values for each thread.&lt;/P&gt;

&lt;P&gt;Jonathan,&lt;/P&gt;

&lt;P&gt;The SEGV occurred at the start of a parallel region (on the !$omp parallel...).&lt;/P&gt;

&lt;P&gt;For this to occur there, one of the following must be true: a) memory was corrupted as described in #5, b) the stack is insufficient, or c) your application is out of virtual memory.&lt;/P&gt;

&lt;P&gt;Insufficient stack for a thread can, surprisingly, be caused by specifying "unlimited" (only one thread can rightfully claim "unlimited").&lt;BR /&gt;
	Insufficient stack can also be caused by not setting a large enough stack size for each thread (while not consuming all of virtual memory in combination with code, heap and static data).&lt;BR /&gt;
	Insufficient stack, or rather insufficient virtual memory, can be caused by too small a page file (swap file).&lt;BR /&gt;
	Insufficient stack can be caused by using stack when heap should be used instead.&lt;/P&gt;

&lt;P&gt;An interesting diagnostic may be to declare and define a local variable in WorkshareTest and print out the LOC(localVariable). This will track the stack pointer (of the context of the thread making the call).&lt;/P&gt;
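
&lt;P&gt;A minimal sketch of that diagnostic, using a stand-in recursive routine rather than your WorkshareTest (the names stack_probe_demo, probe, depth and max_depth are made up for illustration; LOC is a compiler extension, not standard Fortran):&lt;/P&gt;

&lt;P&gt;[plain]program stack_probe_demo&lt;BR /&gt;
  implicit none&lt;BR /&gt;
  ! Recurse a few levels and print where each level's local variable lives.&lt;BR /&gt;
  call probe(1, 3)&lt;BR /&gt;
contains&lt;BR /&gt;
  recursive subroutine probe(depth, max_depth)&lt;BR /&gt;
    integer, intent(in) :: depth, max_depth&lt;BR /&gt;
    integer :: localVariable&lt;BR /&gt;
    localVariable = depth&lt;BR /&gt;
    ! The address of a local gives a rough reading of the caller's stack pointer.&lt;BR /&gt;
    print '(a,i0,a,i0)', 'depth ', depth, ' LOC = ', loc(localVariable)&lt;BR /&gt;
    if (depth .lt. max_depth) call probe(depth + 1, max_depth)&lt;BR /&gt;
  end subroutine probe&lt;BR /&gt;
end program stack_probe_demo[/plain]&lt;/P&gt;

&lt;P&gt;If the printed addresses keep moving in one direction by a similar amount on every call, that would suggest the stack of that thread is being consumed by ever-deeper recursion.&lt;/P&gt;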

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Thu, 16 Jan 2014 15:48:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Optimizations-dependent-on-availability-of-OpenMP-threads/m-p/952446#M92560</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2014-01-16T15:48:47Z</dc:date>
    </item>
  </channel>
</rss>

