I'm trying to get an ocean model to run on an SGI Altix 3700 BX2 system running SUSE Linux. I have installed versions 10 and 11 of ifort as well as gfortran 4.3.3. The program crashes with the following backtrace when I compile it with either version of ifort:
(idb) bt
#0 0xa000000000010621 in __kernel_syscall_via_break () in [vdso]
#1 0x20000000005901c0 in raise () in /lib/libc-2.4.so
#2 0x2000000000592b60 in abort () in /lib/libc-2.4.so
#3 0x2000000000481300 in __kmp_do_abort () in /opt/intel/Compiler/11.0/083/lib/ia64/libiomp5.so
#4 0x2000000000473930 in __kmp_wait_sleep () in /opt/intel/Compiler/11.0/083/lib/ia64/libiomp5.so
#5 0x2000000000472f70 in __kmp_barrier () in /opt/intel/Compiler/11.0/083/lib/ia64/libiomp5.so
#6 0x200000000044ac30 in __kmpc_barrier () in /opt/intel/Compiler/11.0/083/lib/ia64/libiomp5.so
#7 0x4000000000008a30 in step_thread () at /home/cousins/PWS-altix/temp/step.f:369
#8 0x4000000000008560 in step () at /home/cousins/PWS-altix/temp/step.f:3
I am currently compiling this with:
ifort -c -O3 -w90 -w95 -cm -72 -fno-alias -i4 -r8 -Vaxlib -heap-arrays -auto -noalign -openmp -nothreads
Everything from -heap-arrays on is just based on Googling. I have also done:
export KMP_STACKSIZE=209715200
ulimit -s unlimited
export OMP_NUM_THREADS=16
before running the program. I have also tried -O0 and different numbers of threads from 4 to 64. The code does run in serial mode without OpenMP.
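For scale, the KMP_STACKSIZE value above works out to 200 MiB per thread, assuming the runtime reads an unsuffixed value as a byte count. A quick sketch of the arithmetic follows. The helper name stack_budget_mib is made up for illustration, and the total is only an upper bound on reserved address space if every thread gets the full amount, not memory actually touched:

```shell
# stack_budget_mib BYTES THREADS
# Prints BYTES converted to MiB, multiplied by THREADS.
# Hypothetical helper; assumes BYTES is a plain byte count with no unit suffix.
stack_budget_mib() {
    echo $(( $1 / 1024 / 1024 * $2 ))
}

# Values from the settings above: 209715200 bytes per thread, 16 threads.
stack_budget_mib 209715200 16    # prints 3200
```

So the request is roughly 3200 MiB of stack across 16 threads, which is worth comparing against what the machine and its limits allow.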
The code runs fine with the gfortran compiler and it also runs fine with both ifort 10 and 11 on Xeon and Opteron machines running Fedora.
Is there a known problem with ifort and OpenMP on the Itanium2? Something in the Suse environment? Some added switch that I should be using for compilation?
Thanks,
Steve
9 Replies
It's been said that -heap-arrays may be incompatible with -openmp. You could try "-heap-arrays n", if you can avoid any allocations less than n in the parallel regions. -nothreads and -noalign don't make sense to me, particularly if you have no reason to use them other than some Googled recommendation of unknown heritage.
You have several options there which ought to be redundant; I'm a little more comfortable in leaving them off.
-auto might be OK as it is implied by -openmp, and would assure you of similar treatment of local arrays when not setting -openmp.
Hi Tim,
The original compile line was:
ifort -c -O3 -w90 -w95 -cm -72 -fno-alias -i4 -r8 -Vaxlib
I probably should have just used that in the email since that is what works on the other machines.
Quoting - tim18
Even the obsolete options there should be OK, provided that you use -r8 consistently, or turn on interface checking so as to ensure consistency.
Just a correction to what I said before about what switches are being used. -openmp is also being used (of course).
As for -r8 being used consistently, it is. The program is created with a script that prepares everything and then runs "make". The compile flags are supplied in only two places (one for modules and one for other files) and these are consistent. As indicated before, this program compiles and runs fine on other architectures (Xeon and Opteron) with the ifort compiler and it even runs on the Itanium2 with the gfortran compiler.
What I am wondering about is if there are any known problems (and fixes hopefully) with the ifort compiler on Itanium2 systems when using OpenMP.
Thanks,
Steve
Quoting - eliosh
From my (very limited) experience OpenMP programs crash because they try to allocate arrays on stack that is not large enough. This situation is easily identified: the subroutine crashes the program even before the first executable statement is run. There are two solutions. First, set a bigger stack. Second (seems much better to me) use allocatable arrays.
Thanks for the suggestion. I'm specifying:
export KMP_STACKSIZE=209715200
ulimit -s unlimited
Do you think there is something else that I should specify? Incidentally, on the Opteron and Xeon systems (both Fedora) the ulimit stack size is
stack size (kbytes, -s) 10240
The default on the SUSE Altix machine is 8192, but I thought setting it to "unlimited" would make it a non-issue. I just tried setting it to 10240 like the others.
Using allocatable arrays isn't out of the question but this code has been around for quite a while with a number of users and I don't think it would be good to make major changes. The code itself doesn't seem to be a problem. It runs on other systems including other Altix systems, and even on this Altix system with the default stack size when using the gfortran compiler. There is something about ifort on this system with OpenMP that is not right.
Steve
Quoting - tim18
Quoting - Steve Cousins
The default on the Suse Altix machine is 8192 but I thought setting it to "unlimited" would make it a non-issue. I just gave it a try just specifying it to 10240 like the others.
Yes. It did raise it to 10240. I couldn't raise it above this number but it did get set to that value:
cousins@Altix-10G:~> ulimit -s
8192
cousins@Altix-10G:~> ulimit -s 10240
cousins@Altix-10G:~> ulimit -s
10240
Steve
Note: after I sent this I did more checking because I remember being able to set it to unlimited previously. It turns out that in a given instance of a bash shell you can set it once and then after that you can only reduce it. Starting a new shell allows me to set it to any value. I tried again at 20480 with the same problem.
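A likely explanation for the set-once behaviour in the note above: in bash, "ulimit -s N" with neither -S nor -H sets both the soft and the hard limit, and an unprivileged user cannot raise a hard limit again in that shell. A minimal sketch that changes only the soft limit, so it can be raised again later, might look like this. The helper names set_soft_stack and show_stack_limits are made up here, and the demonstration runs in a subshell so the calling shell is left alone:

```shell
# Set only the soft stack limit (kbytes or "unlimited"); the hard limit is untouched.
set_soft_stack() {
    ulimit -S -s "$1"
}

# Print both limits. The soft limit is the one a launched program inherits as its
# effective stack limit.
show_stack_limits() {
    echo "soft=$(ulimit -S -s) hard=$(ulimit -H -s)"
}

# Demonstration in a subshell: lower the soft limit, then raise it back.
(
    start=$(ulimit -S -s)
    if [ "$start" = "unlimited" ]; then low=4096; else low=$((start / 2)); fi
    show_stack_limits
    set_soft_stack "$low"      # lower the soft limit only
    show_stack_limits
    set_soft_stack "$start"    # raising it back works: the hard limit never moved
    show_stack_limits
)
```

Checking "ulimit -H -s" before launching shows the ceiling that the current shell can still reach.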
Well, I noticed Steve Lionel's post about 11.1 being out so I gave it a try. It works now. That is fast work! ;^)
Further note. It works with -O2 but not with -O3. It crashes with NaN's when using -O3. I'll take it.
