Debugging: forrtl: severe (174): SIGSEGV, segmentation fault occurred

 

Hi,

I'm trying to debug a segmentation fault error using information from this forum.

When I compile my ocean model (using ifort (IFORT) 16.0.1 20151021) with the following options

-u -O2 -fltconsistency -shared-intel -mcmodel=medium -heap-arrays

and even if I set ulimit -s unlimited, I get the following error

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
libintlc.so.5      00002B508BED39B5  Unknown               Unknown  Unknown
libintlc.so.5      00002B508BED1777  Unknown               Unknown  Unknown
libifcore.so.5     00002B508A873872  Unknown               Unknown  Unknown
libifcore.so.5     00002B508A8736C6  Unknown               Unknown  Unknown
libifcore.so.5     00002B508A7CC795  Unknown               Unknown  Unknown
libifcore.so.5     00002B508A7DE5DD  Unknown               Unknown  Unknown
libpthread.so.0    0000003A3220F4A0  Unknown               Unknown  Unknown
pe_PB_sar25in_tid  000000000043EED8  Unknown               Unknown  Unknown
pe_PB_sar25in_tid  00000000004047D1  Unknown               Unknown  Unknown
pe_PB_sar25in_tid  0000000000403070  Unknown               Unknown  Unknown
pe_PB_sar25in_tid  000000000040212E  Unknown               Unknown  Unknown
libc.so.6          0000003A31A1ECDD  Unknown               Unknown  Unknown
pe_PB_sar25in_tid  0000000000402039  Unknown               Unknown  Unknown

However, if I add debugging and traceback options to isolate the error

-u -O2 -fltconsistency -shared-intel -mcmodel=medium -heap-arrays -g -traceback -check all -fp-stack-check

then the code runs without a segmentation fault.

What else can I try to isolate this segmentation fault?

Thanks

3 Replies
Try adding just -traceback.
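As a rough sketch of what that looks like, the snippet below takes the flags from the original post and appends only -traceback, leaving out -g, -check all and -fp-stack-check so the generated code stays as close as possible to the failing build. The variable names are just for illustration, and diag.f stands in for whichever source file is being compiled.

```shell
# Flags from the original post, unchanged.
FFLAGS="-u -O2 -fltconsistency -shared-intel -mcmodel=medium -heap-arrays"

# Add only -traceback: no -g, no -check all, no -fp-stack-check.
FFLAGS_TB="$FFLAGS -traceback"

# Print the compile line for one source file (drop the echo to actually build).
echo "ifort $FFLAGS_TB -c diag.f"
```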

Retired 12/31/2016

Hi Steve,

That helped, giving me a line number:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
libintlc.so.5      00002AD0140BE9B5  Unknown               Unknown  Unknown
libintlc.so.5      00002AD0140BC777  Unknown               Unknown  Unknown
libifcore.so.5     00002AD012A5E872  Unknown               Unknown  Unknown
libifcore.so.5     00002AD012A5E6C6  Unknown               Unknown  Unknown
libifcore.so.5     00002AD0129B7795  Unknown               Unknown  Unknown
libifcore.so.5     00002AD0129C95DD  Unknown               Unknown  Unknown
libpthread.so.0    000000326580F4A0  Unknown               Unknown  Unknown
pe_PB_sar25in_tid  000000000043F158  diag_                    1921  diag.f
pe_PB_sar25in_tid  00000000004047DF  step_                     596  step.f
pe_PB_sar25in_tid  0000000000403070  MAIN__                    837  ocean.f
pe_PB_sar25in_tid  000000000040212E  Unknown               Unknown  Unknown
libc.so.6          000000326501ECDD  Unknown               Unknown  Unknown
pe_PB_sar25in_tid  0000000000402039  Unknown               Unknown  Unknown

Two strange things remain. First, the line number points to the top line of a do loop

          do 110 k=0,km

which is not where I would normally expect a segmentation fault. All these variables exist and are declared. I then added some write statements to make sure the values made sense:

        do 120 ll=1,mterms
          engext(ll)=c0
          do 100 i=1,imt
            zuseng(i,ll)=c0
            zvseng(i,ll)=c0
 100      continue
          write (6,*) 'll=',ll,'  km=',km
          call flush (6)
          do 110 k=0,km
            write (6,*) 'k=',k
            call flush (6)
            engint(k,ll)=c0
            termbm(k,ll,1)=c0
            termbm(k,ll,2)=c0
 110      continue
 120    continue


However, adding these write statements also removes the segmentation fault. I'm not sure what to do next. Any thoughts?

 

With symptoms like this I often find it's due to memory corruption - writing outside the declared space for a variable. Unfortunately, this might have occurred much earlier in the program. Anything that disturbs the compiler's choice of memory layout can make such errors appear or disappear. Also, adding write statements can disable some optimizations that could change the behavior.

Probably the first thing I would do is build with "-warn interface" to see if you have any errors in routine calls. See if removing options such as -fltconsistency (that's a very old option, superseded by -fp-model) or -heap-arrays changes the behavior. Try dropping the optimization level to 1 or 0 and see what it does.
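One way to work through those suggestions systematically is to list the flag variants and build each one in turn. The sketch below only prints the flag sets. The `variants` function name is made up for illustration, and each variant keeps -traceback so a crash still reports a line number. Intel's documentation spells the keyword `interfaces`, so that spelling is used here.

```shell
# Flags from the original post.
BASE="-u -O2 -fltconsistency -shared-intel -mcmodel=medium -heap-arrays"

# Print one candidate flag set per line, each differing from BASE in one respect.
variants() {
  echo "$BASE -traceback -warn interfaces"                # check routine call interfaces
  echo "$BASE -traceback" | sed 's/ -fltconsistency//'    # drop -fltconsistency
  echo "$BASE -traceback" | sed 's/ -heap-arrays//'       # drop -heap-arrays
  echo "$BASE -traceback" | sed 's/-O2/-O1/'              # lower optimization to 1
  echo "$BASE -traceback" | sed 's/-O2/-O0/'              # lower optimization to 0
}

variants
```

Rebuilding and rerunning once per printed line, and noting which variants still segfault, narrows down which option the failure depends on.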

Next I would run the program under gdb and determine which instruction was getting the segfault. Then it's a bit of a slog to figure out where its input addresses came from - being able to read assembly code helps.
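A minimal sketch of that gdb step, assuming gdb is installed: the snippet writes a small command file that runs the program, prints a backtrace at the crash, disassembles the faulting instruction and dumps the registers. The file name segv.gdb and the executable name ./model are placeholders, not names from the original post.

```shell
# Write a small gdb command file; the name segv.gdb is arbitrary.
cat > segv.gdb <<'EOF'
run
bt
x/i $pc
info registers
EOF

# ./model is a placeholder for the real executable.
# Print the command to run (drop the echo to actually start gdb).
echo "gdb -batch -x segv.gdb ./model"
```

The `x/i $pc` line shows the instruction that faulted, and `info registers` shows the register values that fed its address calculation, which is the starting point for tracing where a bad address came from.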

You could also try seeing if compiling all the other sources with -O0 but just this one with -O2 preserves the error. See if you can find the (hopefully small) combination of sources that need -O2 to still show the error. Sometimes this approach can help you identify the real culprit in another source file.
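That per-file experiment could be scripted roughly as follows. This is only a sketch: the source list contains just the three files named in the traceback (the real model presumably has more), and the `plan` function name is invented for illustration. It prints compile commands with exactly one file at -O2 and every other file at -O0.

```shell
# Only the files named in the traceback; extend with the model's full source list.
SOURCES="ocean.f step.f diag.f"
COMMON="-u -fltconsistency -shared-intel -mcmodel=medium -heap-arrays -traceback"

# Print compile commands with the file given as $1 at -O2 and all others at -O0.
plan() {
  hot=$1
  for src in $SOURCES; do
    if [ "$src" = "$hot" ]; then opt=-O2; else opt=-O0; fi
    echo "ifort $COMMON $opt -c $src"
  done
}

plan diag.f
```

If the fault survives with only diag.f at -O2, that file is the first suspect. If it does not, move more files back to -O2 a few at a time until the error returns.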

Retired 12/31/2016