Hi,
I am trying to debug a Fortran code. The executable was compiled with the Intel compiler v17 on openSUSE, on an Intel Xeon E5-2690. The code hits a segfault generated inside a function (getgb2). There are 11 .f90 files, but I am sharing only the compilation flags for mod_grib2io.f90, followed by the link step.
...
ifort -free -O3 -msse2 -convert big_endian -DLINUX -fp-model precise -assume byterecl -I../../libs/src/ofs_mods -I../../../hwrf-utilities/libs/mods/g2 -c mod_grib2io.f90
...
ftn -Wl,-noinhibit-exec -o ../../exec/hwrf_gfs2ofs2 flush.o constants.o horiz_interp.o mod_hytime.o mod_flags.o mod_hycomio1.o mod_dump.o mod_grib2io.o mod_geom.o intp.o cd.o -L../../libs -lofs_mods -L../../../hwrf-utilities/libs/ -lg2 -lw3nco_i4r4 -lw3_i4r4 -lbacio -L/usr/lib64 -ljasper -lpng -lz
On running hwrf_gfs2ofs2 I got:
--- Changing MRF mask
2 2 exhycom2d size
2 2 eyhycom2d size
ismus: mask for the ismus correction is ismus_msk1440x720.dat
ismus: MRF mask is corrected for i,j= 2 1
---------- output from horiz_interp ----------
input: min= 0.000000000 max= 1.000000000 avg= 0.287844061851501
number of missing points = 0
output: min= 0.000000000 max= 1.000000000 avg= 0.477386802434921
number of missing points = 0
+++++ # of interations for land/sea mask extrapolation is nextrap= 2
MRF fluxes: i,min,max= 6 -26.95802 24.18198
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line     Source
hwrf_gfs2ofs2      00000000004720C4  Unknown            Unknown  Unknown
libpthread-2.22.s  00002B99B1F77B10  Unknown            Unknown  Unknown
libc-2.22.so       00002B99B21FF9B4  cfree              Unknown  Unknown
hwrf_gfs2ofs2      00000000004B38A8  Unknown            Unknown  Unknown
hwrf_gfs2ofs2      000000000045732A  Unknown            Unknown  Unknown
hwrf_gfs2ofs2      0000000000456DB7  Unknown            Unknown  Unknown
hwrf_gfs2ofs2      0000000000414755  mod_grib2io_mp_rd  102      mod_grib2io.f90
hwrf_gfs2ofs2      00000000004337A2  MAIN__             464      intp.f90
hwrf_gfs2ofs2      000000000040391E  Unknown            Unknown  Unknown
libc-2.22.so       00002B99B21A46E5  __libc_start_main  Unknown  Unknown
hwrf_gfs2ofs2      0000000000403829  Unknown            Unknown  Unknown
I tried gdb, but felt I would need to dig into the getgb2 function (available in libg2.a, which was compiled without the -g flag). Before digging in, I tried printing the values of some of the variables being passed to getgb2. To my surprise, adding the print statements made the segfault go away. This fix seems very patchy. Here is the "new" stdout, from the point where the original code was crashing:
......... 331 241 240 0.16934 0.10809
--- Changing MRF mask
2 2 exhycom2d size
2 2 eyhycom2d size
COMMENT|mod_grib2io.f90:before getgb2###############
COMMENT|lugb: 82 ,lugi: 83 ,jskp: 0 ,jdisc: 2
COMMENT|mod_grib2io.f90:after getgb2###############
COMMENT|lugb: 82 ,lugi: 83 ,jskp: 21 ,jdisc: 2
ismus: mask for the ismus correction is ismus_msk1440x720.dat
ismus: MRF mask is corrected for i,j= 2 1
---------- output from horiz_interp ----------
input: min= 0.000000000 max= 1.000000000 avg= 0.287844061851501
number of missing points = 0
output: min= 0.000000000 max= 1.000000000 avg= 0.477386802434921
number of missing points = 0
+++++ # of interations for land/sea mask extrapolation is nextrap= 2
COMMENT|mod_grib2io.f90:before getgb2###############
COMMENT|lugb: 82 ,lugi: 83 ,jskp: 0 ,jdisc: 0
COMMENT|mod_grib2io.f90:after getgb2###############
COMMENT|lugb: 82 ,lugi: 83 ,jskp: 7 ,jdisc: 0
MRF fluxes: i,min,max= 6 -26.95802 24.18198
COMMENT|mod_grib2io.f90:before getgb2###############
COMMENT|lugb: 82 ,lugi: 83 ,jskp: 0 ,jdisc: 0
COMMENT|mod_grib2io.f90:after getgb2###############
COMMENT|lugb: 82 ,lugi: 83 ,jskp: 8 ,jdisc: ...
Here is the code section with the modifications:
print *,'COMMENT|mod_grib2io.f90:before getgb2###############'
print *,'COMMENT|lugb:',lugb,',lugi:',lugi,',jskp:',jskp,',jdisc:',jdisc
! print *,',jids:',jids,',jpdtn:',jpdtn,',jpdt:',jpdt,',jgdtn:',jgdtn
! print *,',jgdt:',jgdt,',jskp:',jskp !,',gfld:',gfld
! print *,',iret:',iret
! print *,'mod_grib2io.f90:############################'
call getgb2(lugb,lugi,jskp,jdisc,jids,jpdtn,jpdt,jgdtn,jgdt, &
            unpack,jskp,gfld,iret)
print *,'COMMENT|mod_grib2io.f90:after getgb2###############'
print *,'COMMENT|lugb:',lugb,',lugi:',lugi,',jskp:',jskp,',jdisc:',jdisc
! print *,',jids:',jids,',jpdtn:',jpdtn,',jpdt:',jpdt,',jgdtn:',jgdtn
! print *,',jgdt:',jgdt,',jskp',jskp !,',gfld',gfld
! print *,',iret:',iret
! print *,'mod_grib2io.f90:############################'
There is not enough information here to say anything definite about the problem. This looks like quite old code that has been only partially adapted to the Fortran 90/95 standard: it is a minimal module wrapper around something that looks very much like fixed-form F77 code. With one exception, the dummy arguments of the subroutines have no intents.
Reasonable advice would be to switch on all run-time checks (bounds checking, checking for uninitialised variables, and so on). Also try switching off the optimisation flags such as -O3 and -msse2, compile your program with debug flags (-g -O0), and see if the problem goes away.
What I can see is that you are opening files, inquiring on units, etc. So the segmentation fault could be related to an illegal operation on one of those units, particularly given that you are apparently checking the "open" status with a global logical array.
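As a rough illustration of that advice, the compile line from the original post could be changed along the following lines. This is only a sketch: it assumes the Intel compiler options -g, -O0, -traceback, -check all and -fpe0, the variable names DEBUG_FLAGS and OTHER_FLAGS are made up for the example, and all other flags are copied from the original post. The command is echoed rather than run; remove "echo" to actually compile.

```shell
# Debug flags that replace "-O3 -msse2" from the original compile line.
#   -g          debug symbols
#   -O0         no optimisation
#   -traceback  routine names and source lines in the runtime trace
#   -check all  run-time checks (array bounds, uninitialised variables, ...)
#   -fpe0       trap floating-point exceptions
DEBUG_FLAGS="-g -O0 -traceback -check all -fpe0"

# Remaining flags, copied unchanged from the original post.
OTHER_FLAGS="-free -convert big_endian -DLINUX -fp-model precise -assume byterecl -I../../libs/src/ofs_mods -I../../../hwrf-utilities/libs/mods/g2"

# Print the command for inspection; drop "echo" to run the compiler.
echo ifort $OTHER_FLAGS $DEBUG_FLAGS -c mod_grib2io.f90
```

The same flags would have to be applied to the other source files. Since getgb2 lives in libg2.a, which was built without -g, that library would also need rebuilding with these flags before the checks and line numbers reach inside getgb2.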
Sorry, the addition of a PRINT statement is as effective a remedy for segfaults as a placebo is a remedy for headaches.
That the addition of a benign PRINT statement seemed to fix the segfault is an indication that the code has a bug that is likely to hide and could be quite hard to locate and fix. It may happen that the bug will emerge from hiding and give you a bad bite when you simply rerun the program with slightly different data or on a different PC.