
odd Seg fault location in Large program

deadpickle
Beginner
First of all, I apologize for this post. The code here is part of an 800+ line program, and I am not going to post the whole thing. Usually I would try to isolate the error, but for some reason it does not happen in the earlier runs. In other words, the error is within a DO loop, and the section of code that seems to produce it has been repeated 15+ times before the error shows up.

There are print statements in the code that I used to figure out how far the program got before the error happened. The prints and the seg fault were reported as follows:
HERE4
HERE2
HERE7
HERE8
HERE9
HERE10
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image       PC        Routine            Line     Source
thor1       080643B7  track_module_mp_t  738      track0.1.f90
thor1       08050248  MAIN__             185      thor0.52.f90
thor1       0804ECD1  Unknown            Unknown  Unknown
libc.so.6   00978E8C  Unknown            Unknown  Unknown
thor1       0804EBC1  Unknown            Unknown  Unknown

The line pointed to by the traceback is labeled below. The odd thing is that the prints make it to 10, which is far away from and outside the statement where the seg fault occurs. That statement is contained within a DO loop that ends and is not returned to until much later. Even then, it would print a number lower than 10. I am not sure how to find out what is going on or how to fix it. I hope this made sense. In summary, the seg fault occurred inside a DO loop while the program seemed to be outside that DO loop.
Any ideas? How can I figure out what is wrong with the program? Would a debugger be helpful?

[Main Program]

DO q=1,tracksubset
   print *, "HERE1"
   i = 0
   r = 0
   DO
      i = i + 1
      r = r + 1
      print *, "HERE2"
      IF (r > maxslices .OR. i > maxcellsintrack) EXIT
      IF (temptrack(q,i,r,1) /= 0.0) THEN   ! <--- Seg Fault Error Here
         stormmotion(q) = temptrack(q,1,r,2)/temptrack(q,1,r,3)
         print *, "HERE3"
      ELSE
         print *, "HERE4"
         i = i - 1
      END IF
   END DO

   IF (q > 1) THEN
      print *, "HERE5"

      IF (stormmotion(q) <= stormmotion(tracksubset+1)) THEN
         stormmotion(tracksubset+1) = stormmotion(q)
         stormmotion(tracksubset+2) = q
         print *, "HERE6"
      END IF
   ELSE
      stormmotion(tracksubset+1) = stormmotion(q)
      stormmotion(tracksubset+2) = q
      print *, "HERE7"
   END IF

   print *, "HERE8"
END DO

print *, "HERE9"

IF (stormmotion(tracksubset+1) /= 0.0) THEN
   WRITE(30,*) "Track",track
   WRITE(30,*) "MeanError",stormmotion(tracksubset+1)
   t = 0
   s = 0
   DO
      t = t + 1
      s = s + 1
      IF (s > maxslices .OR. t > maxcellsintrack) EXIT
      IF (temptrack(INT(stormmotion(tracksubset+2)),t,s,1) /= 0.0) THEN
         cellinfo(s,temptrack(INT(stormmotion(tracksubset+2)),t,s,1),9) = 1
         CALL epoch_to_time (INT(cellinfo(s,temptrack(INT(stormmotion(tracksubset+2)),t,s,1),1)),slicetime)
         WRITE(30,*) slicetime(1:13),INT(temptrack(INT(stormmotion(tracksubset+2)),t,s,1)), &
            cellinfo(s,temptrack(INT(stormmotion(tracksubset+2)),t,s,1),2), &
            cellinfo(s,temptrack(INT(stormmotion(tracksubset+2)),t,s,1),3), &
            INT(cellinfo(s,temptrack(INT(stormmotion(tracksubset+2)),t,s,1),4)), &
            cellinfo(s,temptrack(INT(stormmotion(tracksubset+2)),t,s,1),5), &
            cellinfo(s,temptrack(INT(stormmotion(tracksubset+2)),t,s,1),6)

      ELSE
         t = t - 1
      END IF
   END DO

ELSE
   print *, "HERE10"
   track = track - 1
END IF

print *, "HERE11"

if (allocated(stormmotion)) Deallocate(stormmotion)
if (allocated(temptrack)) Deallocate(temptrack)
close(40)

[Closing Arguments]

Kevin_D_Intel
Employee

This possibly relates to stack temporaries. Have a look at Ron's advice in this Knowledge Base article (here). Using -heap-arrays or raising the shell stack limit may help.
deadpickle
Beginner
Thanks for the advice!

I went ahead and tried:
  • -check bounds
  • I already use -traceback
  • -heap-arrays
  • unlimit stacksize
  • -check arg_temp_created
None of these have shown anything new, still getting the same seg fault with the same location. I am reluctant to send this program in since its size is very large. To compile the program you need 10 modules and a series of input data. So it seems unreasonable for me to send it in as the last tip hints at. Would using a debugger shine any light on this problem or have I gotten all I can from this? I would send it if I had to but it would take a little while.
Kevin_D_Intel
Employee

Ok. Next steps then. We need specifics about the OS (Linux or Mac OS X?), your architecture, and compiler version (-V).

I didn't catch this earlier. The Segv occurs inside the Fortran RTLs, which trapped the error. The earlier article contains remedies for faults within user code.

We need to see the code since it is likely we'll need our Fortran RTL developers' help.

This is a public forum, so if you are not comfortable uploading the code/input data to this post, then open a new Intel Premier issue (here) and request it be assigned to me. I'll gladly dig into this.

If you are not already using our latest 11.1 release, there's perhaps a chance a newer compiler may offer a remedy too.
deadpickle
Beginner
The program is being run on CentOS with
[thor@updraft ~]$ ifort -V
Intel Fortran Compiler Professional for applications running on IA-32, Version 11.1 Build 20090630 Package ID: l_cprof_p_11.1.046
Copyright (C) 1985-2009 Intel Corporation. All rights reserved.
FOR NON-COMMERCIAL USE ONLY

Because of the nature of this program, I will have to send it to you through Premier support. However, I am having problems logging into Premier support. Is the login provided for this forum the one I should use in the Premier section? As soon as I can get in, I will send it to you along with some instructions.
TimP
Honored Contributor III
The premier login ID is the same as the one for registrationcenter.intel.com. If you didn't register your software and thus create an account, you can do it at registrationcenter.
Mike_Rezny
Novice
Hi,
What compiler options are you using?

Here is how I would tackle such a problem:

If it fails with -O0 (you get this by default with -g) then trust your print statements.
Once you have narrowed the problem down to a specific loop, put more print statements within the loop.

Yes, that will create a huge amount of output, but don't worry, just redirect the output to a file and then
look at it with an editor.
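
The print-statement approach above can be sketched as a small helper that writes a label together with the current loop indices and then flushes the unit, so the last marker written before a crash is not left sitting in a buffer when output is redirected. This is only an illustration: the routine name `trace` and the loop bounds are hypothetical and not taken from the original program.

```fortran
program trace_demo
  implicit none
  integer :: q, i

  ! Hypothetical loops standing in for the real DO loops.
  do q = 1, 2
     do i = 1, 3
        call trace("inner loop", q, i)
     end do
  end do

contains

  ! Write a labelled marker with the current indices to standard
  ! output, then flush so the line is emitted immediately.
  subroutine trace(label, a, b)
    use, intrinsic :: iso_fortran_env, only: output_unit
    character(len=*), intent(in) :: label
    integer, intent(in) :: a, b

    write(output_unit, '(a,2(1x,a,i0))') label, "q=", a, "i=", b
    flush(output_unit)
  end subroutine trace

end program trace_demo
```

Printing the indices along with the marker shows which iteration was active, not just which statement was reached.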

I have sometimes had success using valgrind with these sorts of problems.

One suggestion is to compile with -g and run the program under idb (or idbc if you prefer the command line version). It should fall over and then give you the opportunity to look at the loop index and array variables around the problem area.

I read that you had tried -check bounds, but did you try -check all?
These problems can also be caused by inconsistent passing of array parameters.
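
As a complement to the compiler's run-time checks, a manual guard can be placed in front of a suspect array reference. The following is a minimal sketch, assuming a four-dimensional allocatable array like temptrack; the dimensions, the test values, and the function name `in_bounds` are hypothetical and chosen only for illustration.

```fortran
program guarded_access
  implicit none
  ! Hypothetical extents standing in for the real program's arrays.
  integer, parameter :: ntracks = 3, maxcells = 4, maxslices = 5, nfields = 3
  real, allocatable :: temptrack(:,:,:,:)
  integer :: q, i, r

  allocate(temptrack(ntracks, maxcells, maxslices, nfields))
  temptrack = 0.0

  q = 2; i = 4; r = 5
  print *, "in range:     ", in_bounds(temptrack, q, i, r, 1)

  r = 6    ! one past the third extent
  print *, "out of range: ", in_bounds(temptrack, q, i, r, 1)

  deallocate(temptrack)

contains

  ! Return .true. only when the array is allocated and every
  ! subscript lies within the array's actual bounds.
  logical function in_bounds(a, i1, i2, i3, i4)
    real, allocatable, intent(in) :: a(:,:,:,:)
    integer, intent(in) :: i1, i2, i3, i4
    integer :: idx(4)

    in_bounds = .false.
    if (.not. allocated(a)) return
    idx = (/ i1, i2, i3, i4 /)
    in_bounds = all(idx >= lbound(a)) .and. all(idx <= ubound(a))
  end function in_bounds

end program guarded_access
```

Calling such a guard just before the failing reference, and printing the indices when it returns false, would show whether the array is unallocated or a subscript is out of range at that point.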

happy hunting

regards
Mike
Kevin_D_Intel
Employee
Send me an email (kevin.d.davis@intel.com) and I will forward alternative instructions to provide the code w/o using Premier.