The following test program (cut down from a large global climate model) fails with a SIGSEGV error and no traceback information when built with -parallel -O3 (or -O2) but runs correctly with -parallel -O1 (or -O0) on Linux using the Intel Fortran Compiler 11.1.038. I cannot get idb to step into the source file to get any further debug information.
Any help to get idb to step into assel.f90 would be appreciated. -O0 is not useful since this also results in the program running correctly.
Any help to identify whether this is a compiler problem or not would be greatly appreciated.
There is a dependence in the first loop (after commenting out the print* statement) and the compiler is parallelising the k loop.
A CDEC$ NOPARALLEL directive before this loop also fixes the problem, as does putting OpenMP directives around this loop (adding -openmp, of course, and private(k,mg)).
test1.f90:
**********************************************
program test1
  implicit none
  integer, parameter :: lat = 49, lat2 = 2*lat
  integer, parameter :: lon = 192, ln2 = 2*lon
  real pgd(lon,lat2,2)
  common /uvpgd/ pgd
  call random_number(pgd)
  call assel()
end program test1
assel.f90:
*******************************************************
subroutine assel
  implicit none
  integer, parameter :: lat = 48, lat2 = 2*lat
  integer, parameter :: lon = 192, ln2 = 2*lon
  integer, parameter :: nl = 18
  integer :: k, lgns, ns, lg, mg
  real pgd(lon,lat2,2)
  common /uvpgd/ pgd
  real :: muf_mufm(ln2, nl, lat)
  real :: asf
  asf = 0.5
  print*, "MMRR 100 lat2, lat ", lat2, lat
  print*, "MMRR 101 nl, lon, ln2 ", nl, lon, ln2
  do lgns = 1, lat2
    ns = 1 - (lgns-1) / lat
    lg = ns*lgns + (lat2+1-lgns)*(1-ns)
    print*, "MMRR 200", lgns, ns, lg
    ns = ns * lon
    do k = 1, nl
      do mg = 1, lon
        muf_mufm(mg+ns,k,lg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
      enddo
    enddo
  enddo ! lgns=1,lat2
  print*, "MMRR 500", muf_mufm(1,1,1)
  return
end subroutine assel
Makefile:
*******************************************************
FFLAGS = -debug extended -traceback
PFLAGS = $(FFLAGS) -parallel -par-report3

all: test1 test2

test1: test1.f90 Makefile assel.f90
	ifort -O0 $(PFLAGS) -c test1.f90
	ifort -O2 $(PFLAGS) -c assel.f90
	ifort $(PFLAGS) -o test1 test1.o assel.o

test2: test1.f90 Makefile assel.f90
	ifort -O3 $(FFLAGS) -o test2 test1.f90 assel.f90
test2 runs correctly and test1 fails giving the following output:
MMRR 200 49 0 48
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
libpthread.so.0 00002B19AA9E992B Unknown Unknown Unknown
libiomp5.so 00002B19AA8B42CC Unknown Unknown Unknown
regards
Mike
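For readers following along, the two workarounds mentioned in the post above might look roughly like the sketch below. It is a standalone program, not a patch to the model: the program name `workaround_sketch`, the result arrays `a` and `b`, the allocatable storage and the `+ 1.0` offset on `pgd` are all assumptions made for illustration. In free-form source the directive prefix is `!DEC$` rather than `CDEC$`, and compilers that do not recognise either directive treat both as comments.

```fortran
program workaround_sketch
  implicit none
  integer, parameter :: lat = 48, lat2 = 2*lat
  integer, parameter :: lon = 192, ln2 = 2*lon
  integer, parameter :: nl = 18
  real, parameter :: asf = 0.5
  integer :: k, lgns, ns, lg, mg
  real, allocatable :: pgd(:,:,:), a(:,:,:), b(:,:,:)

  allocate(pgd(lon,lat2,2), a(ln2,nl,lat), b(ln2,nl,lat))
  call random_number(pgd)
  pgd = pgd + 1.0   ! assumption: keeps the divisor away from zero

  ! Variant 1: ask the auto-parallelizer to leave the k loop alone.
  do lgns = 1, lat2
    ns = 1 - (lgns-1) / lat
    lg = ns*lgns + (lat2+1-lgns)*(1-ns)
    ns = ns * lon
    !DEC$ NOPARALLEL
    do k = 1, nl
      do mg = 1, lon
        a(mg+ns,k,lg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
      end do
    end do
  end do

  ! Variant 2: explicit OpenMP on the k loop (active only with -openmp).
  do lgns = 1, lat2
    ns = 1 - (lgns-1) / lat
    lg = ns*lgns + (lat2+1-lgns)*(1-ns)
    ns = ns * lon
    !$omp parallel do private(k,mg)
    do k = 1, nl
      do mg = 1, lon
        b(mg+ns,k,lg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
      end do
    end do
    !$omp end parallel do
  end do

  ! Both variants compute the same values, so they should agree.
  if (maxval(abs(a - b)) > 1.0e-5) then
    print *, "variants disagree"
    error stop 1
  end if
  print *, "variants agree, sample value:", a(1,1,1)
end program workaround_sketch
```

Only one of the two variants would be used in practice; they are shown side by side here so the results can be compared.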
I have some further information from valgrind that may be of help:
MMRR 200 49 0 48
==16785==
==16785== Invalid write of size 8
==16785== at 0x403CC8: assel_ (in /home/mrezny/tests/mk3/test1)
==16785== by 0x40ACD02: __kmp_invoke_microtask (in /opt/intel/Compiler/11.1/038/lib/intel64/libiomp5.so)
==16785== by 0x408F314: __kmpc_invoke_task_func (in /opt/intel/Compiler/11.1/038/lib/intel64/libiomp5.so)
==16785== by 0x4091A29: __kmp_fork_call (in /opt/intel/Compiler/11.1/038/lib/intel64/libiomp5.so)
==16785== by 0x407C0B8: __kmpc_fork_call (in /opt/intel/Compiler/11.1/038/lib/intel64/libiomp5.so)
==16785== by 0x4039C8: assel_ (in /home/mrezny/tests/mk3/test1)
==16785== by 0x40365E: MAIN__ (in /home/mrezny/tests/mk3/test1)
==16785== by 0x40355B: main (in /home/mrezny/tests/mk3/test1)
==16785== Address 0x9310c0 is not stack'd, malloc'd or (recently) free'd
==16785==
regards
Mike
This does look like a bug in the 11.x versions. I don't see the error with the 10.1.022 compiler.
I need to do a little more triage and get a bug report started. Thanks for cutting this down to a small reproducing testcase; it makes it much easier to work with.
More on this shortly.
ron
Again, this bug does not appear in 10.1.022. So a workaround could be to compile assel.f90 with that compiler and everything else with 11.1. Here is a link for getting older versions: http://software.intel.com/en-us/articles/older-version-product/
Could you tell me what code this affects? WRF or another weather code?
ron
Have you tried adding -openmp?
Although your code is not using OpenMP specifically, components of OpenMP are being used by the auto-parallelization (-parallel). The idea is to add the OpenMP dependencies.
Jim Dempsey
If I read the code correctly, the result being calculated is invariant across the loop on k but is being broadcast across a non-unity stride array. -parallel may be part way implementing an optimization based on that. Sometimes, it's safer to write out an optimization explicitly rather than risk the compiler doing it part way.
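Writing that optimization out by hand might look like the sketch below: the quotient is computed once per `lgns` into a temporary row and then copied into each `k` slab. This is only an illustration under stated assumptions; the names `hoist_sketch`, `row`, `ref` and `hoisted`, and the `+ 1.0` offset on `pgd`, are not from the original code.

```fortran
program hoist_sketch
  implicit none
  integer, parameter :: lat = 48, lat2 = 2*lat
  integer, parameter :: lon = 192, ln2 = 2*lon
  integer, parameter :: nl = 18
  real, parameter :: asf = 0.5
  integer :: k, lgns, ns, lg, mg
  real :: row(lon)
  real, allocatable :: pgd(:,:,:), ref(:,:,:), hoisted(:,:,:)

  allocate(pgd(lon,lat2,2), ref(ln2,nl,lat), hoisted(ln2,nl,lat))
  call random_number(pgd)
  pgd = pgd + 1.0   ! assumption: keeps the divisor away from zero

  do lgns = 1, lat2
    ns = 1 - (lgns-1) / lat
    lg = ns*lgns + (lat2+1-lgns)*(1-ns)
    ns = ns * lon

    ! Reference: the loop nest as posted, recomputing the quotient for every k.
    do k = 1, nl
      do mg = 1, lon
        ref(mg+ns,k,lg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
      end do
    end do

    ! Explicit version: compute the k-invariant row once ...
    do mg = 1, lon
      row(mg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
    end do
    ! ... then broadcast it into each k slab (contiguous in the first index).
    do k = 1, nl
      hoisted(ns+1:ns+lon,k,lg) = row
    end do
  end do

  if (maxval(abs(ref - hoisted)) > 1.0e-5) then
    print *, "hoisted version disagrees with reference"
    error stop 1
  end if
  print *, "hoisted version agrees, sample value:", hoisted(1,1,1)
end program hoist_sketch
```

Whether this form avoids the failure with 11.1 has not been checked here; it simply removes the redundant work from the k loop so the compiler has less to transform.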
Hi Ron,
From what I can remember, 11.0.081 works and the problem started from 11.0.083.
I haven't downloaded and installed the latest release after 11.1.038, although it is now available on our benchmarking machines in the US.
The code is MK3.5, a coupled ocean-climate model designed at CSIRO in Australia.
I have a workaround: I have added OpenMP directives around all 7 loop nests in the original code
and compile with -openmp.
regards
Mike
Hi Jim,
I tried your suggestion but it made no difference.
My solution has been to put explicit OpenMP directives around all the loop nests and compile with -openmp.
regards
Mike
Hi Tim,
Yes, I suspect that there is some interaction between some HLO and the fact that the parallelizer had to reject the first loop due to dependencies and chose to parallelize the second loop.
My mission, should I choose to accept it, will be to understand what this loop nest is really trying to achieve and restructure the code to remove the dependency.
regards
Mike
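One possible restructuring, offered only as a sketch and not as what was eventually done in the model: for lgns = 1..lat the posted code gives ns = 1 and lg = lgns (columns lon+1..ln2), and for lgns = lat+1..lat2 it gives ns = 0 and lg = lat2+1-lgns (columns 1..lon). Splitting the outer loop into those two halves makes the disjoint write regions visible in the source. The names `split_sketch`, `ref` and `split` and the `+ 1.0` offset are assumptions, and whether a given compiler then proves independence is not verified here.

```fortran
program split_sketch
  implicit none
  integer, parameter :: lat = 48, lat2 = 2*lat
  integer, parameter :: lon = 192, ln2 = 2*lon
  integer, parameter :: nl = 18
  real, parameter :: asf = 0.5
  integer :: k, lgns, ns, lg, mg
  real, allocatable :: pgd(:,:,:), ref(:,:,:), split(:,:,:)

  allocate(pgd(lon,lat2,2), ref(ln2,nl,lat), split(ln2,nl,lat))
  call random_number(pgd)
  pgd = pgd + 1.0   ! assumption: keeps the divisor away from zero

  ! Reference: the loop nest as posted.
  do lgns = 1, lat2
    ns = 1 - (lgns-1) / lat
    lg = ns*lgns + (lat2+1-lgns)*(1-ns)
    ns = ns * lon
    do k = 1, nl
      do mg = 1, lon
        ref(mg+ns,k,lg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
      end do
    end do
  end do

  ! Restructured: each lg is written exactly once per half,
  ! and the two halves touch different column ranges.
  do lg = 1, lat
    lgns = lat2 + 1 - lg              ! row index used by the second half
    do k = 1, nl
      do mg = 1, lon
        ! first half (lgns = lg): columns lon+1..ln2
        split(lon+mg,k,lg) = asf*pgd(mg,lg,2)/pgd(mg,lg,1)
        ! second half (lgns = lat2+1-lg): columns 1..lon
        split(mg,k,lg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
      end do
    end do
  end do

  if (maxval(abs(ref - split)) > 1.0e-5) then
    print *, "restructured version disagrees with reference"
    error stop 1
  end if
  print *, "restructured version agrees, sample value:", split(1,1,1)
end program split_sketch
```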
Mike,
Another thing you might experiment with (assuming you have the time and inclination).
When compiling for OpenMP, subroutines and functions "inherit" RECURSIVE. Try explicitly adding RECURSIVE to the declaration of your subroutine. If that fails, then revert to the OpenMP thing.
Good luck,
Jim
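A minimal sketch of that experiment, assuming the only change is the RECURSIVE prefix on the subroutine statement. The usual effect (an assumption to confirm against the compiler documentation) is that unsaved locals such as `muf_mufm` go on the stack instead of in static storage, so a larger stack limit may be needed. The program name `recursive_sketch` and the `+ 1.0` offset are illustrative, and the diagnostic prints from the posted code are left out.

```fortran
program recursive_sketch
  implicit none
  integer, parameter :: lat = 48, lat2 = 2*lat
  integer, parameter :: lon = 192
  real pgd(lon,lat2,2)
  common /uvpgd/ pgd
  call random_number(pgd)
  pgd = pgd + 1.0   ! assumption: keeps the divisor away from zero
  call assel()
end program recursive_sketch

! Same body as the posted assel.f90, with RECURSIVE added.
recursive subroutine assel
  implicit none
  integer, parameter :: lat = 48, lat2 = 2*lat
  integer, parameter :: lon = 192, ln2 = 2*lon
  integer, parameter :: nl = 18
  integer :: k, lgns, ns, lg, mg
  real pgd(lon,lat2,2)
  common /uvpgd/ pgd
  real :: muf_mufm(ln2, nl, lat)   ! about 1.3 MB of local storage
  real :: asf
  asf = 0.5
  do lgns = 1, lat2
    ns = 1 - (lgns-1) / lat
    lg = ns*lgns + (lat2+1-lgns)*(1-ns)
    ns = ns * lon
    do k = 1, nl
      do mg = 1, lon
        muf_mufm(mg+ns,k,lg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
      end do
    end do
  end do
  print *, "MMRR 500", muf_mufm(1,1,1)
end subroutine assel
```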
Sometimes, when I've had similar problems with earlier versions of IVF, simplifying the loop keeps the optimizer from goofing up.
Try copying the code of the two inner loops (5 lines) into a subroutine, then calling the subroutine. The run length of the two inner loops is sufficiently larger than the call overhead, so I do not think the overhead would be too significant.
You might find this straightens out the error, and then the optimization will inline the subroutine and eliminate the call overhead.
Jim
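A sketch of what that could look like, with the two inner loops moved into a small module procedure. The names `slab_mod`, `fill_slab`, `outline_sketch`, `ref` and `out`, the assumed-shape dummy arguments and the `+ 1.0` offset are all hypothetical; the original code uses a common block and a local array instead.

```fortran
module slab_mod
  implicit none
contains
  ! The two inner loops from assel, operating on one lg slab.
  subroutine fill_slab(dst, num, den, asf, off)
    real, intent(inout) :: dst(:,:)       ! (ln2, nl) slab for one lg
    real, intent(in)    :: num(:), den(:) ! pgd(:,lgns,2) and pgd(:,lgns,1)
    real, intent(in)    :: asf
    integer, intent(in) :: off            ! column offset, 0 or lon
    integer :: k, mg
    do k = 1, size(dst, 2)
      do mg = 1, size(num)
        dst(mg+off,k) = asf*num(mg)/den(mg)
      end do
    end do
  end subroutine fill_slab
end module slab_mod

program outline_sketch
  use slab_mod
  implicit none
  integer, parameter :: lat = 48, lat2 = 2*lat
  integer, parameter :: lon = 192, ln2 = 2*lon
  integer, parameter :: nl = 18
  real, parameter :: asf = 0.5
  integer :: k, lgns, ns, lg, mg
  real, allocatable :: pgd(:,:,:), ref(:,:,:), out(:,:,:)

  allocate(pgd(lon,lat2,2), ref(ln2,nl,lat), out(ln2,nl,lat))
  call random_number(pgd)
  pgd = pgd + 1.0   ! assumption: keeps the divisor away from zero

  do lgns = 1, lat2
    ns = 1 - (lgns-1) / lat
    lg = ns*lgns + (lat2+1-lgns)*(1-ns)
    ns = ns * lon
    ! Reference: inner loops written inline, as posted.
    do k = 1, nl
      do mg = 1, lon
        ref(mg+ns,k,lg) = asf*pgd(mg,lgns,2)/pgd(mg,lgns,1)
      end do
    end do
    ! Outlined version: same work behind a call.
    call fill_slab(out(:,:,lg), pgd(:,lgns,2), pgd(:,lgns,1), asf, ns)
  end do

  if (maxval(abs(ref - out)) > 1.0e-5) then
    print *, "outlined version disagrees with reference"
    error stop 1
  end if
  print *, "outlined version agrees, sample value:", out(1,1,1)
end program outline_sketch
```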
