I need to allow access violations

Stephen_Painchaud · ‎01-02-2008

I am porting a very old numerical library from Solaris to the PC. My test code correctly runs under Solaris.

The authors of this code seem to make liberal use of access violations. Arrays that start with index 1 are often accessed with a negative index. I am not sure why, since the code comments are in Russian. It may seem strange that I would deal with this code, but it has unique capabilities that I cannot seem to get elsewhere.

Under IVF10 I get access violations, so the code won't run. Is there any way to turn off this error for the library code?

jimdempseyatthecove · ‎01-02-2008

Are you certain the authors' usage was indeed an access violation?
i.e. accessing memory not intended to be accessed.
e.g. passing the address of an array element other than the first, and then using a -1 (or 0) index on the dummy argument, thus reaching a valid element of the up-level (caller's) array?
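
A minimal sketch of the idiom described above, assuming bounds checking is off. The names uplevel_demo and peek are hypothetical and do not come from the library. Indexing below the dummy's lower bound is not standard-conforming, but with checking disabled it typically resolves to the earlier elements of the caller's array.

```fortran
program uplevel_demo
  implicit none
  real :: a(10)
  integer :: i

  a = [(real(i), i = 1, 10)]

  ! Pass the address of a(3), not a(1). Inside peek, x(1) is a(3).
  call peek(a(3), 0)    ! x(0)  typically resolves to a(2)
  call peek(a(3), -1)   ! x(-1) typically resolves to a(1)
end program uplevel_demo

! Hypothetical legacy-style routine: assumed-size dummy, lower bound 1.
subroutine peek(x, k)
  implicit none
  real :: x(*)
  integer :: k
  ! Non-conforming when k < 1. With /check:bounds this is the kind of
  ! reference that would be reported as a subscript error.
  print *, 'x(', k, ') =', x(k)
end subroutine peek
```

If the library follows this pattern, the "invalid" subscripts are valid for the caller's array, and only the bounds checker objects.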

/CB is check bounds, try /CB-
or
/check:bounds, try /check:nobounds
or
/nocheck

Note, if you modified the code (including data layout) you may be changing assumptions of the programmer (requirements of the program).

Jim Dempsey

jimdempseyatthecove · ‎01-02-2008

Also,

The IVF10 Debug Library Run Time may contain bounds checking whereas the Release version may not. If you experience this error in Debug configuration then try explicitly specifying the Release libraries in the link property sheet.

Jim Dempsey

jimdempseyatthecove · ‎01-02-2008

Can you provide the name of the IVF runtime library routine and describe the contents of the arguments passed when the error is produced? This might provide some insight into what is going on.

Jim Dempsey

Steven_L_Intel1 · ‎01-02-2008

You can't "turn off" access violations - that is a system error that occurs when you access an invalid memory location. There's no way to "ignore" such an error.

Sometimes people would write code that assumes that arrays are contiguous in memory and would access outside the bounds of one array intending to get an element of another. It may be that your code does this, but whatever assumption it made about memory layout is not true on the new system.

The only solution is to understand what the program is doing and figure out what is necessary to make it work as designed.

Stephen_Painchaud · ‎01-02-2008

I am still doing initial testing of this library with my test code on Solaris. What I see so far is that I am getting the correct answer, but I can print out the array indices, and there are definitely access violations. The negative indices seem to follow a pattern, so I believe it is intentional. The violation occurs in a subroutine that is called as follows:

CALL MYSUB(-I, -N)

where -I and -N are indices into an array. Since I and N are usually positive, I suspect this is intentional.

This code was written in Russia in 1985. From experience, I know that Russians don't like anyone limiting their use of arrays in FORTRAN.

The access violations occur several levels down into the library. I have checked my inputs at the top level, and they seem fine.

BTW, the library is LIDA. I am trying to use the multi-dimensional spline fit it provides. I expect to use it for 5 and 6 parameter fits.

Stephen_Painchaud · ‎01-02-2008

Steve,

Thanks for the info. I expected this answer.

There is a lesson to be learned here. Man-years of work will probably be lost, because someone thought they were clever with their code. It is very unlikely that this code will ever be updated.

Steven_L_Intel1 · ‎01-02-2008

Before you give up - try this wild idea. Add the option /assume:noprotect_constants. (In Visual Studio this is Data > Constant Actual Arguments Can Be Changed > Yes.)

jimdempseyatthecove · ‎01-02-2008

Steve:

>>/assume:noprotect_constants

Is this implying that if the code contains

N=1234
CALL FOO(-N)

That the compiler may optimize that to

CALL FOO(-1234)

i.e. pass the address of a constant (-1234) as opposed to passing the address of the variable N.

Jim Dempsey

jimdempseyatthecove · ‎01-02-2008

Stephen,

>>I am porting a very old numerical library from Solaris to the PC. My test code correctly runs under Solaris.

In your port you may have disturbed some equivalences (or the compilers treat them differently), so that a symbol that is properly positioned within an array on the Solaris system is distinct from the intended array position on the PC.

Note, the required data alignment may not only be attained by use of equivalence but also by use of differing declarations of commons.

This often occurs when the revising programmer attempts to clean-up the code and inadvertently changes something that is not obvious.

Jim Dempsey

jimdempseyatthecove · ‎01-02-2008

I mean passing the address of the volatile temporary for the expression -N

Jim

g_f_thomas · ‎01-02-2008

LIDA appears to be alive and well and living in Java:

http://www.sscc.ru/matso/rozhenko/applib/index.html

Gerry

Steven_L_Intel1 · ‎01-02-2008

JimDempseyAtTheCove:
>>/assume:noprotect_constants
Is this implying that if the code contains

N=1234
CALL FOO(-N)

That the compiler may optimize that to

CALL FOO(-1234)

i.e. pass the address of a constant (-1234) as opposed to passing the address of the variable N.

No - in the case you cite, -N is an expression, not a variable, so the variable N is not passed in any event. What gets passed there is the address of a stack temporary containing the value -1234.

What this option controls is what happens when you say:

CALL FOO(-1234)

and routine FOO assigns into the dummy argument associated with the constant. By default, literal constants such as -1234 are allocated in a read-only data section so that if the called routine tries to store into the constant, it fails with an access violation. /assume:noprotect_constants causes literal constants as actual arguments to act like expressions so that the compiler constructs a stack temp for the value and passes that. The called routine can (though shouldn't) store into that location, and any changes disappear on return from the routine.

We have seen a lot of code that sometimes passes a variable, and sometimes a constant to a routine which then assigns to it. In some compilers (including old versions of DVF and IVF), constants were allocated in read-write sections and you could "change the value of 3" in this manner.

The Fortran standard disallows redefining or changing the definition status of a dummy argument associated with a constant or expression actual argument, so a correct program would never notice the difference. But as we all know, not all Fortran code out there is correct!
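
A minimal sketch of the situation described above, together with the standard-conforming fix. The names constant_actual_demo, bump, and tmp are hypothetical. The offending call is left commented out because, by default, it is the case that would fail with an access violation.

```fortran
program constant_actual_demo
  implicit none
  integer :: n, tmp

  n = 1234
  call bump(n)          ! variable actual: legal, n is now 1235
  print *, 'n   =', n

  ! call bump(-1234)    ! literal constant actual: bump stores into it.
  !                     ! Non-conforming; by default the constant lives in
  !                     ! a read-only section, so the store faults unless
  !                     ! /assume:noprotect_constants is used.

  ! Conforming fix: copy the constant into a variable and pass that.
  tmp = -1234
  call bump(tmp)        ! tmp is now -1233
  print *, 'tmp =', tmp
end program constant_actual_demo

! Hypothetical routine that assigns to its dummy argument.
subroutine bump(k)
  implicit none
  integer :: k
  k = k + 1
end subroutine bump
```

Where the library source can be changed, the temporary-variable form avoids relying on the compiler option at all.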

The reason I suggested this was that I had doubts that referencing an array by a negative index would cause an access violation. Sure, it's possible, but much more likely that it would just read from some other variable. It seemed more likely to me that an access violation was caused by a write to a constant actual, as I have seen that in a lot of code.

For my past musings on this topic, see Don't Touch Me There.

Stephen_Painchaud · ‎01-03-2008

I am not changing any code yet. The "port" is just a matter of setting up the project.

Steven_L_Intel1 · ‎01-03-2008

Try the /assume:noprotect_constants and see what happens.

jimdempseyatthecove · ‎01-03-2008

Now that Steve pointed it out, if the error message is "memory access error" instead of "subscript error" or "error writing to" then this indicates that the resolved memory access of some reference is outside the range of valid virtual address space for the application.

Negative array subscripting (with subscript checking off) will generally produce a valid memory address (although not necessarily the one intended). Therefore the negative subscript will not necessarily cause a "memory access" error.

A potential source of the problem is producing a stack-relative address that causes the resultant address calculation to wrap (underflow or overflow). This can occur when subroutine/function local arrays reside on the stack on one platform and in static memory on the other (Solaris versus PC, depending on compiler options). Related options:

/4{Y|N}a enable/disable putting local variables on the run-time stack
/Qauto same as /4Ya or /automatic
/Qauto-scalar make scalar local variables AUTOMATIC
/Qsave save all variables (static allocation)
(same as /noautomatic or /4Na, opposite of /Qauto)

Another situation which might not be solvable without fixing the library is if the -N (negative index) is passed in as a flag (e.g. an initialization indicator), but before being tested for being negative it is used to read something at the specified index (a value which will be ignored in the initialization code). If Solaris permitted reading of invalid addresses (e.g. returning 0xFFFFFFFFFFFFFFFF or 0x0000000000000000 for non-existent memory) whereas the current O/S (Windows) raises an exception, then the working code breaks. I can't imagine that this would be the case if this library were ever used on a Windows platform.

Note, if you are linking in a library built for a Unix system you should be aware that an address of -4 is typically a valid virtual address under Unix/Linux but invalid under Windows. Therefore a benign negative address returning junk on a Unix/Linux system would create an exception on Windows.

By the way, what is the address as reported?

Jim Dempsey

g_f_thomas · ‎01-03-2008

Since I've yet to see the OP's response to Jim's cogent suggestion that /check:bounds be turned 'on' I spent a few minutes finding out for myself. As anticipated, IVF will find errant out-of-bounds accesses and the proggy won't compile. If cb is 'off' something like

real A(10,10) ; integer j

A=2.5

neither

write(*,*) A(-5,5), (A(-1,j),j=-5,5)

nor

i=-2

j=5

A(1,1) = A(i,j)

generates an access violation in the Windows sense but may produce wrong results. Additionally, if your proggy has a C main (that calls a Fortran lib), neither Windows SEH (Structured Exception Handling) nor /traceback indicates an abnormal termination.
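
As a rough illustration of why such references usually land on mapped memory, the sketch below computes the column-major storage position that each subscript pair resolves to for the A(10,10) array above when no bounds checking is done. The names offset_demo and linpos are hypothetical; the program does only integer arithmetic and never touches the array.

```fortran
program offset_demo
  implicit none
  integer, parameter :: n1 = 10   ! first extent of A(10,10)

  ! Position 1 is A(1,1); position 100 is A(10,10).
  print *, 'A(-5, 5) -> position', linpos(-5, 5)    ! 35, storage of A(5,4)
  print *, 'A(-2, 5) -> position', linpos(-2, 5)    ! 38, storage of A(8,4)
  print *, 'A(-1,-5) -> position', linpos(-1, -5)   ! -61, before the array

contains

  ! Column-major position of element (i,j), counting from 1.
  integer function linpos(i, j)
    integer, intent(in) :: i, j
    linpos = (i - 1) + n1*(j - 1) + 1
  end function linpos

end program offset_demo
```

Positions between 1 and 100 fall inside A itself, so the read succeeds but returns a different element. Positions outside that range fall on whatever is stored next to A, which is usually still valid memory.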

BTW there's lots of spline fitting code in f90+ available on the net.

Gerry

Stephen_Painchaud · ‎01-04-2008

Sorry for the slow response, but I have to time share between projects.

I first created test code on Solaris. When it was working, I moved it to IVF. I now see that I have no problems with the release version of the code. The debug version gives the error:

forrtl: severe (408): fort: (3): Subscript #2 of the array AL has value 0 which is less than the lower bound of 1

and then crashes.

I am OK as long as I don't need to use the debugger. I originally thought I would need to, but I seem to have gotten past that point for now.

Obviously the original authors are doing something creepy with their indices, but I can test the code to ensure it works, so I guess that is all I need.

Thank you everyone for your help.

Steven_L_Intel1 · ‎01-04-2008

You can turn off array bounds checking in the Debug configuration (under Run-Time) if you want. A lot of old code plays fast and loose with array indexing rules.

Stephen_Painchaud · ‎01-04-2008

I will do that in the library project. Eventually I will learn where all these switches are.

jimdempseyatthecove · ‎01-04-2008

Stephen,

>>forrtl: severe (408): fort: (3): Subscript #2 of the array AL has value 0 which is less than the lower bound of 1 and then crashes.<<

Configure for the Debug session with array bounds checking and producing the subscript errors.

Verify on call that the (seemingly) invalid subscript is indeed valid for that subroutine. Then cross fingers and assume other (seemingly) invalid subscripts are also valid. Then in the Solution Explorer pane of Visual Studio (while in Debug configuration), expand the tree(s) such that the source file containing the subroutine is visible. Right-click on the file (left-click if using a left-handed mouse), pick Properties, and turn off array subscript checking for that one file in Debug configuration. Rebuild and run to the next subscript error or other error. Verify correctness, turn off array subscript checking for that next subroutine, etc...

Verifying all subroutines being given unusual subscripts is better than assuming that, if the program runs without crashing, the results are correct.

Jim Dempsey