Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

OpenMP and stack size

fivos
Beginner
2,979 Views
Hi to everyone,

I have a problem with an algorithm that is parallelized with OpenMP directives. It has been tested with the Intel Fortran Compiler 11.0.083 for Linux; on my PC with Ubuntu 8.10 64-bit it gives a segmentation fault unless the command ulimit -s unlimited is typed first. After that the algorithm runs as it should, giving the expected results.
(The algorithm has also been tested on a cluster with Red Hat Linux, where it didn't produce any segmentation faults.)

Now I want to build a Windows application from the same source code using the Intel Fortran Compiler 11.0.061 for Windows. As on Ubuntu Linux, when I try to run the executable I get a "Program exception: stack overflow" error, obviously because of the stack size limit.

So:
1st) I don't know if there is an equivalent of the ulimit -s unlimited command for Windows.
2nd) I have tried to set the stack size of the application by setting the environment variable OMP_STACKSIZE according to the Intel Fortran User Guide, which gives an example syntax:

export OMP_STACKSIZE=value

However, when I use the above syntax the problem persists, not to mention that the suffix (M, K, etc.) is not recognised.

How can I set the stack size, either from Windows or from the application itself?
I use Windows XP SP3.

Thanks in advance
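
A minimal sketch, assuming the variable is being set from a Windows command prompt: export is Unix shell syntax, and in cmd.exe an environment variable is assigned with set instead. The small Fortran program below prints the value of OMP_STACKSIZE as the process actually sees it, which can help confirm whether the setting reached the application at all. The program name and the 16M value mentioned in its comments are illustrative and do not come from the thread.

```fortran
program show_omp_stacksize
  ! Prints OMP_STACKSIZE as seen by this process.
  ! Assumption: in cmd.exe the variable was set beforehand with, e.g.
  !     set OMP_STACKSIZE=16M
  ! (16M is an illustrative value only, not a recommendation).
  implicit none
  character(len=64) :: val
  integer :: length, stat

  ! Standard Fortran 2003 intrinsic; stat reports how the lookup went.
  call get_environment_variable('OMP_STACKSIZE', val, length, stat)

  select case (stat)
  case (0)
     write(*,*) 'OMP_STACKSIZE = ', val(1:length)
  case (1)
     write(*,*) 'OMP_STACKSIZE is not set in this environment'
  case (-1)
     write(*,*) 'OMP_STACKSIZE is set, but longer than the buffer'
  case default
     write(*,*) 'Could not query OMP_STACKSIZE, status = ', stat
  end select
end program show_omp_stacksize
```

If the program reports that the variable is not set, the OpenMP runtime never saw the requested size, which would be consistent with the problem persisting.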
12 Replies
François-Xavier
Beginner
Dear fivos,

Have you tried specifying in the Link options of your program the stack size using the /STACK flag?

Sincerely yours,

F-Xavier
fivos
Beginner
Dear Francois-Xavier,

First of all I would like to thank you for your quick response. Now, could you be a bit more specific about the /STACK flag?

I have to say that I am using the Intel Fortran compiler integrated in Microsoft Visual Studio 2008, so I am accessing the compiler options through the project properties.
If I go to Project Properties -> Linker -> System, there are two options for altering the stack: Stack Reserve Size and Stack Commit Size (the explanation shows the /STACK option you told me about). But after I enter a value of 100000 in both fields (I assume that the units are kilobytes, but I have tried even larger numbers), the same stack overflow problem occurs.

Any ideas on that?
François-Xavier
Beginner
Dear fivos,

You did it right as far as I can tell, but be careful: the units are bytes, not kilobytes, if my memory serves me right.

Sincerely yours,

F-Xavier
fivos
Beginner

You must be right: the stack size is in bytes, not in KB. I increased the stack size from 100000 to 250000000 (~250 MB). Now the application starts, but it soon crashes with these two errors:

OMP : Error #144 : Cannot create thread
OMP : System error #8: not enough storage is available to process this command.

I can't understand why I am unable to run the application, since:
- I have enough memory on the PC I am using (6 GB, but Windows is 32-bit, so it detects ~4 GB)
- On Linux the application runs normally
- Apart from the Fortran version, I have coded the same algorithm in C++ using the same loops, if-statements and private variables for the OpenMP parallelization, and that application runs without even altering the stack size.

Anyway, Francois, thank you for your help.

TimP
Honored Contributor III
If you didn't set the boot.ini switch /3GB, the address range available to the program in 32-bit XP would be far less than even in 32-bit Linux. A 250 MB stack would use up a large part of that. Which OpenMP library did you use with Windows C++?
fivos
Beginner
Dear Tim,

The OpenMP library I used for the C++ application was the one provided with Microsoft Developer Studio 2008 (OpenMP 2.0 standard). To get support for the OpenMP directives I included the omp.h file and enabled OpenMP support through the Language submenu of the C++ compiler. When I launch the application I can see that it runs in parallel, because it runs much faster and consumes much more CPU than when I do not enable the OpenMP directives. (Of course I have also done some simple tests where each thread prints its thread number.)

How can I see the boot.ini setting you mentioned?
jimdempseyatthecove
Honored Contributor III

Try reducing your stack requirement by using

/heap-arrays[:size]

Where size is a threshold in KB

See -heap-arrays compiler option

Remember to reduce the stack size.

Jim Dempsey
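
To illustrate what the heap-arrays option targets, here is a minimal sketch contrasting an automatic array with an explicitly allocated one. It assumes the usual Intel Fortran default, as described in the heap-arrays documentation, that automatic arrays and compiler temporaries are created on the stack unless heap arrays are requested. The module, function names and the array size are hypothetical and not taken from the poster's code.

```fortran
module work_arrays
  implicit none
contains

  ! Automatic array: its size comes from a dummy argument, so it is
  ! created on entry and, by default, typically lives on the stack of
  ! the calling thread. This is the kind of array heap-arrays affects.
  function sum_automatic(n) result(s)
    integer, intent(in) :: n
    double precision :: s
    double precision :: tmp(n)
    integer :: i
    do i = 1, n
       tmp(i) = dble(i)
    end do
    s = sum(tmp)
  end function sum_automatic

  ! Allocatable array: storage is requested explicitly with allocate,
  ! so it comes from the heap regardless of compiler options.
  function sum_allocatable(n) result(s)
    integer, intent(in) :: n
    double precision :: s
    double precision, allocatable :: tmp(:)
    integer :: i
    allocate(tmp(n))
    do i = 1, n
       tmp(i) = dble(i)
    end do
    s = sum(tmp)
    deallocate(tmp)
  end function sum_allocatable

end module work_arrays

program heap_vs_stack
  use work_arrays
  implicit none
  integer, parameter :: n = 1000   ! small illustrative size
  write(*,*) sum_automatic(n), sum_allocatable(n)
end program heap_vs_stack
```

Both functions compute the same sum; only the storage of tmp differs. If a routine called inside a parallel region uses large automatic arrays, each thread needs that space on its own stack, which is one situation where converting to allocatable arrays or using heap-arrays may help.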
fivos
Beginner
Dear Jim Dempsey,

I have just tried the /heap-arrays option, but it didn't work. I added /heap-arrays[:size] to the command line for the Fortran compiler with several sizes, such as 10, 100 and 1000 KB (I even removed the size, which makes all automatic/temporary arrays go on the heap). Still, if I keep the stack size small I get forrtl (170): stack overflow, and if I increase the stack size I get OMP error #144 (cannot create thread) together with system error #8 (not enough storage).
jimdempseyatthecove
Honored Contributor III

Therefore something has to account for the excessive stack consumption.

If I were to guess, you have a recursive algorithm that is progressing ever deeper.

Or, you have a mixed language app and incorrect calling conventions (i.e. one assumes other is responsible for cleaning up stack when not the case).

You can insert some diagnostics (conditioned to compile in Debug configuration) to check for number of recursion levels. A thread private nest level counter in suspected subroutine would do.

Jim Dempsey
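
A minimal sketch of such a thread-private nest level counter, under stated assumptions: the module name nest_diag, the routine names and the threshold max_depth are all hypothetical, and the code needs to be compiled with OpenMP enabled for the threadprivate directive to take effect.

```fortran
module nest_diag
  implicit none
  integer, parameter :: max_depth = 100   ! hypothetical alarm threshold
  integer, save :: nest_level = 0         ! one copy per thread (see below)
  !$omp threadprivate(nest_level)
contains

  ! Call on entry to the suspected routine.
  subroutine enter_routine(name)
    character(len=*), intent(in) :: name
    nest_level = nest_level + 1
    if (nest_level > max_depth) then
       write(*,*) 'Suspicious nesting in ', name, ', depth = ', nest_level
       stop 1
    end if
  end subroutine enter_routine

  ! Call on every exit path of the suspected routine.
  subroutine leave_routine()
    nest_level = nest_level - 1
  end subroutine leave_routine

end module nest_diag

program nest_demo
  use nest_diag
  implicit none
  integer :: i

  !$omp parallel do
  do i = 1, 8
     call suspected(i)
  end do
  !$omp end parallel do

  write(*,*) 'finished, master thread nest_level = ', nest_level

contains

  ! Stand-in for the routine under suspicion.
  subroutine suspected(k)
    integer, intent(in) :: k
    call enter_routine('suspected')
    if (k < 0) write(*,*) 'unexpected index ', k   ! placeholder for real work
    call leave_routine()
  end subroutine suspected

end program nest_demo
```

Because the counter is thread-private, each thread tracks its own call depth, so a runaway chain of calls on one thread is reported without interference from the others. The calls could be wrapped in preprocessor conditionals so that they are compiled only in the Debug configuration.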
TimP
Honored Contributor III
Quoting - fivos


How can I see the boot.ini setting you mentioned?
http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx

There is another MSDN article about boot.ini /3GB which says it doesn't apply to XP Pro, but as far as I can see it does. I suppose they are deleting references to XP Pro on account of withdrawing support. By a similar token, Intel tools get full testing only on currently supported Windows Server versions, due in part to XP Pro not supporting a test harness.
If you wished to check whether your problem is associated with library or compiler, you could run your MSVC++ built .exe with the libiomp5.
fivos
Beginner
Well, I agree that the stack consumption is indeed excessive; however, I am not using any recursion at all in my algorithm. Let me explain the procedure of the algorithm:

First there is an initialization part, assigning initial values to arrays and variables. Next there is a big loop (the time loop), which contains the whole computation and makes up the rest of the algorithm. Inside this time loop there are three major loops, which are the most computationally intensive areas and are therefore parallelized using parallel do (or parallel for) constructs. I have declared the intermediate results, which are stored in temporary variables, as private variables in these loops. To be more specific, there are 25 private variables in the first loop, 29 in the second and 39 in the last (they are double precision and integer scalars, not arrays). I am not mixing languages, but I have built the same algorithm in both C++ and Fortran, and the C++ version didn't give any issues with the stack size when parallelized.

What types of diagnostics could I use to find what is going wrong?
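
For readers following along, the structure described above can be sketched roughly as follows. This is only an illustration under assumptions: the arrays u and unew, the scalars left and right, the sizes and the averaging formula are all invented stand-ins, since the poster's actual code is not shown in the thread.

```fortran
program time_loop_sketch
  implicit none
  integer, parameter :: n = 1000, nsteps = 10   ! illustrative sizes
  double precision, allocatable :: u(:), unew(:)
  double precision :: left, right   ! scalar temporaries, private per thread
  integer :: i, step

  ! Initialization part
  allocate(u(n), unew(n))
  u = 0.0d0
  u(n/2) = 1.0d0

  ! Serial time loop containing a parallelized inner loop
  do step = 1, nsteps
     !$omp parallel do private(left, right)
     do i = 2, n - 1
        left  = u(i-1)
        right = u(i+1)
        unew(i) = 0.5d0 * (left + right)
     end do
     !$omp end parallel do
     unew(1) = u(1)
     unew(n) = u(n)
     u = unew
  end do

  write(*,*) 'sum(u) = ', sum(u)
end program time_loop_sketch
```

Private scalars of this kind cost only a few bytes each per thread (39 double precision values are 312 bytes), so on their own they would not be expected to account for a stack overflow. That points toward something else in the loops, such as arrays or temporaries created inside called routines.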



jimdempseyatthecove
Honored Contributor III

I would suggest inserting a diagnostic such as

write(*,*) "subroutine name here", omp_get_thread_num()

If you enable the preprocessor you can use __FILE__ and __LINE__

This will trace what happens and may identify the source of the problem

Once you have identified when and where the failure occurs, you can set one or more breakpoints and examine more closely what is causing it.

A typographical error could get you going into a recursive rabbit hole with no exit.

Jim Dempsey
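
A minimal sketch of such a trace, assuming the source is run through the preprocessor (for Intel Fortran the /fpp option on Windows or -fpp on Linux) and compiled with OpenMP enabled so that omp_lib is available. The program and routine names are hypothetical.

```fortran
program trace_demo
  use omp_lib
  implicit none
  integer :: i

  !$omp parallel do
  do i = 1, 4
     call step(i)
  end do
  !$omp end parallel do

contains

  ! Stand-in for a routine in the real algorithm.
  subroutine step(k)
    integer, intent(in) :: k
    ! __FILE__ and __LINE__ are expanded by the preprocessor to the
    ! source file name and the current line number.
    write(*,*) __FILE__, __LINE__, &
         ' step ', k, ' thread ', omp_get_thread_num()
  end subroutine step

end program trace_demo
```

The last trace line printed before the crash gives a rough location and the thread involved. Output from different threads can interleave, so the order of lines is not guaranteed to match the order of execution.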

