Intel® Fortran Compiler

Access violation reading location 0x00000000

Rob1
Beginner
7,862 Views

The full error I'm getting is: Unhandled exception at 0x0162EF4C in MAIN.exe: 0xC0000005: Access violation reading location 0x00000000

I think this is just a symptom of an underlying problem.  I've been struggling with these types of issues off and on for many years and have never figured out for sure why they happen.

I get this error only with a release build with the following settings:

/nologo /debug:full /O3 /Qipo /fpp /I"C:\Program Files (x86)\Intel\Composer XE\mkl\include\ia32" /I"C:\Program Files (x86)\MATLAB\R2010b\extern\include" /warn:all /debug-parameters:all /fp:fast=2 /Qfp-stack-check /module:"Release\INTERMEDIATE\\" /object:"Release\INTERMEDIATE\\" /Fd"Release\INTERMEDIATE\vc110.pdb" /check:uninit /libs:static /threads /Qmkl:sequential /c

When i run the debug build the program seems to run correctly.

If i throw a simple write statement:

write(*,*) 'hello'

before the line that this error occurred on, the program will run without throwing the error.  If i change the build settings to /O2 I can get it to run but produce incorrect results.

I ran the program in Inspector, and I get a critical item "Unhandled application exception" and an "Invalid memory access" on the same line, but I don't see anything obvious that points to a cause for this error.

In my experience with these issues, I've seen lots of strange behavior. For example, a simple assignment statement where, after execution, the left-hand side doesn't equal the right-hand side, but if the same statement is repeated the value will "stick".

I have no C functions being called (I had thought these were the problem previously).

Has anyone else run into this before?

 

rob

0 Kudos
25 Replies
Steven_L_Intel1
Employee
7,257 Views

I suggest you start by reading Don't Touch Me There - What error 157 (Access Violation) is trying to tell you

That the behavior changes by adding I/O statements suggests that you have data corruption going on. This can often be difficult to diagnose - impossible without a test case! Sometimes Intel Inspector XE's memory analysis can detect problems, and you should turn on all run-time checking, not just uninit, which is not as effective as one would want.

0 Kudos
Rob1
Beginner
7,257 Views

Yes i have read that thread, i actually read it again last night before posting this.

I've tried turning on various run time checks, the problem is some of them when turned on cause "warning #10182: disabling optimization; runtime debug checks enabled", which causes the program to run without errors.  I only seem to ever experience these issues with optimization on.

0 Kudos
Steven_L_Intel1
Employee
7,257 Views

/check:stack is the one that triggers that message. You could try doing without it.  What does /standard-semantics do for you here? You mentioned allocatables - without that, assignments to allocatables (other than deferred-length character) don't get automatically (re)allocated.

0 Kudos
Rob1
Beginner
7,255 Views

I get the same error on the same line with /standard-semantics.

Sorry I don't follow you about allocatables... where did i mention them?  Are you saying that i have an allocatable (array) issue?

0 Kudos
jimdempseyatthecove
Honored Contributor III
7,258 Views

>>Unhandled exception at 0x0162EF4C in MAIN.exe

Is this a mixed language program with a C/C++ main? (or WinMain?)

If so, from IVF documentation:

When the main program is written in Fortran, the Fortran compiler automatically creates any code needed to initialize the Fortran Run-time Library (RTL). The RTL provides the Fortran environment for input/output and exception handling. When the main program is written in C/C++, the C main program needs to call for_rtl_init_ to initialize the Fortran RTL and for_rtl_finish_ at the end of the C main program to shut down the Fortran RTL gracefully. With the Fortran RTL initialized, Fortran I/O and error handling will work correctly even when C/C++ routines are called.

Jim Dempsey

0 Kudos
Rob1
Beginner
7,257 Views

No, it is a Fortran main.

There are a few C routines, but none of them are called.  My first reaction to this error was to compile/link without the C objs and comment out the calls to them even though they aren't used in this case but it made no difference.

0 Kudos
Steven_L_Intel1
Employee
7,257 Views

There is nothing more we can do here without a test case. You can submit it to Intel Premier Support if you prefer, but I see you may be using MATLAB, which complicates things.

I'll make the general comment that there is no single cause for access violations. You have to approach each case individually. As I mentioned earlier, given the changeable behavior, it is very likely that memory is getting corrupted or perhaps your program is relying on uninitialized memory. I generally start at the point of error, see where the bad address comes from (analyzing the instruction stream and registers), and work backwards from there. It is usually difficult.

0 Kudos
Rob1
Beginner
7,257 Views

yeah that's about what i expected.

How do i go about "analyzing the instruction stream and registers"? is there any good information on how to do this online somewhere?

thanks Steve

0 Kudos
mecej4
Honored Contributor III
7,257 Views

Rob wrote:
How do i go about "analyzing the instruction stream and registers"? is there any good information on how to do this online somewhere?
Your asking this question is an indication that it is probably not a viable option for you at this time, unless you are familiar with the CPU instruction set and the Fortran/C/MATLAB ABIs for your current platform.

As Steve recommended, provide a test case. Include the Matlab code, the C and Fortran code, along with information about versions of the software packages used, and instructions to reproduce the problem.

0 Kudos
jimdempseyatthecove
Honored Contributor III
7,258 Views

When a program does not produce this symptom in debug build, but does in release build, it is often an indication of use of uninitialized data. A typical example of this is the assumption of a once only flag being initialized to 0 (when no initialization is expressly made).
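As a minimal sketch of the once-only flag pattern (the names `step`, `first_call` and `ncalls` are made up for illustration, not taken from the original program), the explicit initializers below are what make the first test well defined:

```fortran
module counter_mod
  implicit none
contains
  subroutine step(n)
    integer, intent(out) :: n
    ! Explicit initialization gives both variables defined starting values
    ! (and, implicitly, the SAVE attribute). Without "= .true." the first
    ! IF below would read whatever happened to be in memory.
    logical, save :: first_call = .true.
    integer, save :: ncalls = 0

    if (first_call) then
      ncalls = 0               ! once-only setup
      first_call = .false.
    end if
    ncalls = ncalls + 1
    n = ncalls
  end subroutine step
end module counter_mod

program demo_flag
  use counter_mod
  implicit none
  integer :: n
  call step(n)
  call step(n)
  write(*,*) 'calls so far:', n
end program demo_flag
```

If the initializer is dropped, a build that happens to find zeroed memory can appear to work while another build does not, which is the kind of debug/release difference described above.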

Jim Dempsey

0 Kudos
Rob1
Beginner
7,257 Views

It crashes on this line:

012505EC  vmovsd      xmm4,qword ptr [edi+edx*8-8]

where:

EDI = 00000000

EDX = 00000001

So it looks to me that EDI should not be zero and/or EDX should not be one.  I briefly pondered tracing the memory/registers back, but i think it would take me the rest of my life to do.

 

Thanks for the tip Jim, i tried compiling with Fortran>Data>Initialize Saved Variables to Signaling NaN = Yes (/Qinit:snan) and it threw the following error on a line where an array of zeros was divided by another array of zeros:

	ratioarray = array1/array2

forrtl: error (65): floating invalid (Multiple floating point traps)

To clarify, there aren't any uninitialized variables, but for some reason changing the compiler setting above causes the above error to be thrown even though the calculation arguments are unchanged (0/0).  In this case this ratioarray of NaNs isn't even used at all.  If I comment out this line, it runs without throwing the "access violation reading location..." error.  So it would seem that these calculations are influencing the memory somehow, at least with /O3.

...which leads me to the question: can these 0/0 calculations corrupt other memory?  I was aware that this calculation was happening and i don't use the elements that are NaN regardless, at most i use a portion of this array (not the NaNs) where almost always some unused elements are NaN.  Also there are a couple other places where i'm intentionally generating NaNs by calculating 0/0, is this bad?

thanks for the help

rob

0 Kudos
andrew_4619
Honored Contributor III
7,257 Views

If there is stuff that is not used more than likely optimisation removes that code entirely. The layout of your code is thus entirely different so any memory corruption will influence different things and have different symptoms (or none that may be evident).

If your code has divide-by-zeros which are in fact bugs, what other problems do you have? Do you have /check:uninit and /check:stack enabled at run time, and /warn:interfaces and /warn:declarations?
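If the 0/0 elements are never needed, one possible way to avoid generating them is a masked assignment. This is only a sketch with made-up data; the array names follow the line quoted earlier in the thread, and the choice of 0 as the fill value is an assumption:

```fortran
program demo_guard
  implicit none
  real(8) :: array1(4), array2(4), ratioarray(4)

  ! Hypothetical data with zeros in matching positions
  array1 = [1.0d0, 0.0d0, 3.0d0, 0.0d0]
  array2 = [2.0d0, 0.0d0, 4.0d0, 0.0d0]

  ratioarray = 0.0d0      ! defined value for the elements that are skipped
  ! The division is evaluated only where the mask is true, so no 0/0 occurs
  where (array2 /= 0.0d0) ratioarray = array1/array2

  write(*,*) ratioarray
end program demo_guard
```

With the invalid operations gone, a floating-point trap that still fires points at a real problem rather than at the intentional NaNs.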

0 Kudos
jimdempseyatthecove
Honored Contributor III
7,257 Views

>>012505EC  vmovsd      xmm4,qword ptr [edi+edx*8-8]

Fortran arrays are typically 1-based. What the above is likely performing is

edi is holding the base of the array
edx is holding the index to the array
the *8 indicates the element size is 8 (real(8))
the -8 removes the 1 for the 1-based array indexing.

The problem is that the base of the array (edi) is 0. IOW your array argument is invalid.
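Plugging the register values reported at the crash (EDI = 0, EDX = 1) into that addressing expression gives the address from the error message. A small sketch of the arithmetic, with variable names chosen only for illustration:

```fortran
program demo_address
  implicit none
  integer :: base, idx, addr

  base = 0      ! EDI at the crash: base address of the array
  idx  = 1      ! EDX at the crash: 1-based index
  ! Same computation as [edi+edx*8-8], i.e. base + 8*(idx-1)
  addr = base + idx*8 - 8

  write(*,'(a,z8.8)') 'effective address: 0x', addr
end program demo_address
```

This prints `effective address: 0x00000000`, so the index of 1 is perfectly reasonable and the null base is the only thing wrong.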

Other than for uninitialized variables, causes for seeing 0 or junk in an array address are:

1) A prior array reference (store) with index out of bounds corrupts an array descriptor.
2) A prior array reference (store) with index out of bounds corrupts the stack (this may alter either an array descriptor or an address of an array descriptor)
3) Inconsistent interfaces
4) Calling a subroutine or function that returns a pointer to an array where the array existed on the stack of the called subroutine or function.
5)... others...

For your current error, tracing back to where edi is loaded might yield information as to what is getting corrupted.

I suggest making a test run with array subscript bounds runtime check enabled. This may catch 1) and 2).
Also make a test run with /gen-interfaces and /warn:interfaces. This may catch 3), but it won't catch problems in interoperability calls.

Problem 5) is a little harder to catch automatically. After addressing 1) to 4), visually inspect all of your functions and subroutines and locate the ones that return an array or array subsection. Be mindful that in Debug build, locally declared arrays have SAVE attribute. If the Release build builds for OpenMP or is the latest Fortran standard, then locally declared arrays are AUTOMATIC (stack based). If the function or subroutine is returning an array or section of an array that is local to the function or subroutine, the returned descriptors will work up until the point where that portion of the stack gets reused. The reuse (corruption) will then occur by something totally unrelated to the code experiencing the corruption.
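A sketch of the pattern to look for, and one safer alternative. The routine names are hypothetical; the unsafe version is left as a comment so the example stays well defined when run:

```fortran
module vec_mod
  implicit none
contains
  ! Unsafe pattern (hypothetical): the result points at a local array.
  ! If work lives on the stack, the pointer dangles after the return and
  ! only fails once something else reuses that stack space.
  !   function bad_vec() result(p)
  !     real(8), pointer :: p(:)
  !     real(8), target  :: work(3)
  !     work = 1.0d0
  !     p => work
  !   end function bad_vec

  ! Safer pattern: return the array by value, so the caller owns the data.
  function good_vec() result(v)
    real(8) :: v(3)
    v = 1.0d0
  end function good_vec
end module vec_mod

program demo_return
  use vec_mod
  implicit none
  real(8) :: x(3)
  x = good_vec()
  write(*,*) x
end program demo_return
```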

Jim Dempsey

0 Kudos
mecej4
Honored Contributor III
7,256 Views

The number of different ways in which a null pointer can be dereferenced in Fortran is probably quite large. Jim Dempsey gave you a few examples. Here is another, with a  small blunder: an allocation statement is "forgotten".

subroutine sub()
integer, pointer, dimension(:) :: iptr => NULL()
integer :: i,j
!
! allocate(iptr(2))       ! OOPS, commented out

i=2*iptr(1)
j=3*iptr(2)

write(*,*) i+j

return
end subroutine

The assembly of this subroutine contains the following lines:

mov         eax,dword ptr ds:[0]
lea         ecx,[esp+20h]
mov         dword ptr [ecx-20h],0

lea         edx,[eax+eax*4]
mov         dword ptr [ecx],edx

The very first line of assembly that I showed references address zero.

We can go on and give many more examples, but they may not help you. That is why we asked you to give us your code that exhibits this problem.

0 Kudos
Rob1
Beginner
7,256 Views

I apologize, mecej4, but I cannot give the code or data out publicly.  If I could, I would.

Jim, i have always had Fortran>Run-time>Check Array and String Bounds>Yes(/check:bounds) selected.  Also i've always had Fortran>Diagnostics>Compile Time Diagnostics>Show All (/warn:all) which includes (/warn:interfaces).  I'm unable to find gen-interfaces but I read elsewhere that having /warn:interfaces on will also turn on gen-interfaces... is this true?  If so these options don't seem to pinpoint the problem.

I made the changes that i mentioned in my last post which prevented the  "access violation reading location..." error for that particular data set.  However today i ran across the same error on the same line with a different data set.

The fortran and assembly code:

FUNCTION cross3(A,B)
! calculates cross product of A and B
IMPLICIT NONE
!-------------------- begin function parameters --------------------
REAL(8), DIMENSION(3) :: cross3,A,B		! 3 element vectors
!--------------------  end function parameters  --------------------

cross3(1) = A(2)*B(3) - A(3)*B(2)
cross3(2) = A(3)*B(1) - A(1)*B(3)
cross3(3) = A(1)*B(2) - A(2)*B(1)

ENDFUNCTION cross3

REAL(8), DIMENSION(3) :: PW_RFP
REAL(8), DIMENSION(3) :: B0V_RFP
REAL(8), DIMENSION(3) :: PA_RFP
...
PW_RFP = cross3(B0V_RFP,PA_RFP)/(NRM2(B0V_RFP)**2)     ! <<<CRASH ON THIS LINE

014C4794  mov         edi,dword ptr [esp+14h]
014C4798  sub         ecx,eax  
014C479A  mov         eax,edx  
014C479C  vmovsd      xmm4,qword ptr [edi+edx*8-8]     ! <<<CRASH ON THIS LINE


I have had this problem several times off and on for years, and this isn't the first time that it has occurred on a line which contains NRM2.  I wonder if it's just a coincidence, or perhaps the BLAS routine is a lightning rod for these types of problems that originate elsewhere.

Anyway, i'm learning assembly by reading Art of Assembly in my spare time.  I suppose i need to follow the trail starting at dword ptr [esp+14h].

0 Kudos
nvaneck
New Contributor I
7,258 Views
I've gotten this kind of error when a call to a subroutine a few steps back left out the last argument when I didn't have interface checking active.
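One way to have the compiler catch this without relying on a switch is to put the routine in a module, which gives every caller an explicit interface. A minimal sketch with a made-up routine `scale_vec`:

```fortran
module scale_mod
  implicit none
contains
  subroutine scale_vec(v, factor, n)
    integer, intent(in)    :: n
    real(8), intent(inout) :: v(n)
    real(8), intent(in)    :: factor
    v = v*factor
  end subroutine scale_vec
end module scale_mod

program demo_interface
  use scale_mod
  implicit none
  real(8) :: v(3)

  v = [1.0d0, 2.0d0, 3.0d0]
  ! call scale_vec(v, 2.0d0)   ! missing last argument: rejected at compile time
  call scale_vec(v, 2.0d0, 3)
  write(*,*) v
end program demo_interface
```

With an external routine and no interface checking, the commented-out call would compile, and the routine would read its third argument from whatever happened to be in that slot.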
0 Kudos
andrew_4619
Honored Contributor III
7,258 Views

It doesn't fix your problem (which looks like one that will be a hard slog), but F2008 has an intrinsic norm2 which can replace the BLAS nrm2 and thus remove some external clutter.
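A sketch of what the crashing line might look like with the intrinsic. The cross3 body is copied from the earlier post but placed in a module here, and the input values are made up:

```fortran
module cross_mod
  implicit none
contains
  function cross3(a, b) result(c)
    ! Cross product of two 3-element vectors
    real(8), intent(in) :: a(3), b(3)
    real(8) :: c(3)
    c(1) = a(2)*b(3) - a(3)*b(2)
    c(2) = a(3)*b(1) - a(1)*b(3)
    c(3) = a(1)*b(2) - a(2)*b(1)
  end function cross3
end module cross_mod

program demo_norm2
  use cross_mod
  implicit none
  real(8) :: b0v(3), pa(3), pw(3)

  b0v = [0.0d0, 0.0d0, 2.0d0]   ! hypothetical inputs
  pa  = [1.0d0, 0.0d0, 0.0d0]

  ! norm2 is a Fortran 2008 intrinsic, so no external BLAS interface is needed.
  ! dot_product(b0v, b0v) would give the squared norm directly.
  pw = cross3(b0v, pa)/norm2(b0v)**2

  write(*,*) pw
end program demo_norm2
```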

0 Kudos
andrew_4619
Honored Contributor III
7,258 Views

nvaneck wrote:

I've gotten this kind of error when a call to a subroutine a few steps back left out the last argument when I didn't have interface checking active.

Or indeed malformed interfaces to C or API routines that the compiler cannot check. The error can be some distance from the point of crash...

0 Kudos
jimdempseyatthecove
Honored Contributor III
7,258 Views

Mecej4 may be correct. You can also insert an assert into cross3 (remove it after locating the bug).

IF((LOC(A(1)) .LT. 4096) .OR. (LOC(B(1)) .LT. 4096)) THEN
   STOP "BAD ARG" ! PLACE BREAK POINT HERE
ENDIF

My SOP is to create a subroutine DOSTOP(file,line,msg) then use something like:

IF((LOC(A(1)) .LT. 4096) .OR. (LOC(B(1)) .LT. 4096)) CALL DOSTOP(__FILE__,__LINE__, "BAD ARG")

The DOSTOP is then compiled with debugging and no optimization. Make sure it is not inlined by IPO. Place the break point on the PRINT statement in DOSTOP. This requires only one break point regardless of the number of calls to DOSTOP, and the sanity test can be performed in release code without much overhead.

When at the break, you can change scope to the caller by opening the Stack window and double-clicking the caller's level (the next level away from DOSTOP). This should let you view the variables of the caller. If (when) this doesn't work, go back to the DOSTOP source window, click on the statement after your STOP, then right-click it and choose Set Next Statement. Then single-step out of DOSTOP and examine variables, arrays, etc. Also, to get the address of a variable, use the Memory debug window to display the "variable"; you are interested in the address, not the content.

Jim Dempsey

0 Kudos
jimdempseyatthecove
Honored Contributor III
5,126 Views

Here is actual code from a project of mine:

RECURSIVE SUBROUTINE DOSTOP(CVAR)
    CHARACTER*(*) CVAR
    WRITE(IOUERR,*) CVAR
    WRITE(*,*) CVAR
    STOP 'DOSTOP'! place break point here
    RETURN       ! Right-Click here, Set Next Statement Here, then step out
END

.....

RECURSIVE SUBROUTINE VECNRM_r_r (VA, VC)
  USE MOD_ALL
  use MOD_UTIL
  implicit none
!     
!************************************
! COMPUTATION OF A UNIT VECTOR IN THE DIRECTION OF VECTOR VA
! VC = UNIT(VA)
! (CALCULATION IS PROTECTED ALLOWING INPUT TO BE OVERWRITTEN)

  real :: VA(3), VC(3)
  real :: DUMAG, OOMAG
! CHECK MAGNITUDE BEFORE ATTEMPTING NORMALIZATION...WRITE TO SHOERR
  DUMAG = VECMAG(VA)
  IF(DUMAG .EQ. 0.0) CALL DOSTOP( 'VECNRM: NULL VEC MAGNITUDE')
! FIND RECIPROCAL OF MAGNITUDE OF VA
  OOMAG = 1.0/DUMAG

! CALCULATE COMPONENTS OF UNIT (VA)
  VC(1) = VA(1)*OOMAG
  VC(2) = VA(2)*OOMAG
  VC(3) = VA(3)*OOMAG

  RETURN
END SUBROUTINE VECNRM_r_r

Jim Dempsey

0 Kudos