Solved: Quote:I'll have to reformat - Page 2

fourreal · ‎01-30-2017

I've just upgraded to Fortran 2016 from 2013 and now throw a floating point invalid operation exception 0x90. The error is trapped at the declaration of a function. I've reduced the parameter list to a few reals and integers, and the parameters look good before the function is called and at the top of the function. The exception is thrown before the first executable statement of the function.

I have tried different /fpe settings, debug & release, and throwing salt over my shoulder. Any help would be appreciated.

Windows 10, Visual Studio 2015

fourreal · ‎01-31-2017

I now have Runtime Error Checking /check:all. This was not the case before. The results are unchanged. So what is a reproducer?

mecej4 · ‎01-31-2017

A reproducer consists of at least one small source file along with instructions on how to compile, link and run the program(s) in order to display the bug or undesired behavior that a user experiences in a much larger application. It is usually obtained by pruning away parts of the larger application source code and checking that the bug is still present.

If necessary, include files, module files and input data files should be supplied in order to enable the reproducer source to be compiled, linked and run. If the problem occurs only when certain compiler options are used, or only on certain platforms, details regarding the options and/or the platform need to be provided.

Here is an example of a reproducer that I submitted a few months ago: https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/685354 .

IanH · ‎01-31-2017

It might seem silly, but make sure the DLL being loaded by your program is the DLL you think should be loaded. While debugging, you can use the "Modules" window (Debug > Windows > Modules) to inspect the details of loaded DLL's. I would also check that the Fortran runtime DLL's are the ones appropriate for the version you are targeting.

fourreal · ‎01-31-2017

I don't think a reproducer will reveal anything except that the function is trivial. That's been done. The function fails in this context for reasons that are beyond my experience. I had hoped someone else had experienced the same issue and could tell me about a particular quirk in the 2016 installation.

The program DLL is correct, but I don't know what to look for in Fortran runtime DLLs. I don't take checking the DLL lightly; I've seen Visual Studio choose DLLs seemingly out of the blue sky before.

I removed my previous Fortran versions before installing the 2016 version, so maybe I can't have the wrong runtime DLLs.

IanH · ‎01-31-2017

The modules debugging pane lists the path to each DLL. For the Intel runtime library DLL's this path should be in a subdirectory of the intel compiler installation that matches the version you are compiling with. Example runtime DLL's include libifcorexxx.dll and libmmxx.dll (the x characters will vary depending on compile options). When debugging from within Visual Studio it is unlikely that the wrong DLL's will be loaded, but outside of Visual Studio other applications can ship a version of the runtime that may interfere.
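The path check described above can also be done mechanically once the module list has been copied out of the Modules window. The following is only a rough sketch in Python, not anything Visual Studio or the compiler provides: the function name, the install root and the sample paths are all made up for illustration, and the name patterns simply follow the libifcore/libmm examples mentioned above.

```python
from fnmatch import fnmatch
from pathlib import PureWindowsPath

# Name patterns taken from the examples in this post; extend as needed.
RUNTIME_PATTERNS = ("libifcore*.dll", "libmm*.dll")

def suspicious_runtime_dlls(module_paths, compiler_root):
    """Return runtime DLL paths that are NOT located under compiler_root.

    module_paths: paths as listed by the debugger (hypothetical input).
    compiler_root: install directory of the compiler version being targeted.
    """
    root = PureWindowsPath(compiler_root)
    flagged = []
    for raw in module_paths:
        path = PureWindowsPath(raw)
        name = path.name.lower()
        is_runtime = any(fnmatch(name, pattern) for pattern in RUNTIME_PATTERNS)
        # PureWindowsPath compares case-insensitively, as Windows paths should.
        if is_runtime and root not in path.parents:
            flagged.append(raw)
    return flagged

# Hypothetical module list and install root, for illustration only.
modules = [
    r"D:\example\compiler_16\bin\libifcoremd.dll",
    r"C:\OtherApp\bin\libmmd.dll",
    r"C:\Windows\System32\kernel32.dll",
]
flagged = suspicious_runtime_dlls(modules, r"D:\example\compiler_16")
print(flagged)
```

Any path the function returns is a runtime DLL that was loaded from somewhere other than the expected compiler installation and deserves a closer look.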

Which update (specific version number) of 16.0 are you compiling with? What specific compile options are you using? Are you building for 32 bit or 64 bit windows? Can you show the build log?

When you reduce your code to the minimum example that mecej4 has in #20 (or Andrew in #21) - do you still get the exception?

If so, then it's time to start looking at details of the state of your process - e.g. showing the disassembly leading up to the point the exception is raised. Can you make available a mini dump or similar of the process?

If not, then there's something astray in the instructions leading up to the call (noting that sometimes a floating point exception may be reported against a later instruction than the instruction that actually triggered it - if that is what is going on then the nature of the function is a red herring).

FortranFan · ‎01-31-2017

fourreal wrote:

I don't think a reproducer will reveal anything .. The function fails in this context for reasons that are beyond my experience. .. I've seen Visual Studio choose DLLs seemingly out of the blue sky before.

I removed my previous Fortran versions before installing the 2016 version, so maybe I can't have the wrong runtime DLLs.

@fourreal,

I think you're mistaken: if you follow almost any thread in this forum where Intel Black Belts such as mecej4 and Jim Dempsey help readers, you will notice they don't necessarily need to have "experienced the same issue"; their expertise can rise above any such considerations. If you can follow up as suggested above, it will go a long way to resolve your issue.

Secondly, Visual Studio does not "choose DLLs seemingly out of the blue sky" - there is a method it follows, and it can take time and effort to understand it. Until one does, it may look like madness, but the bottom line is that the DLLs loaded are a function of the settings on your system.

At the very least, you should be able to take the same steps as illustrated by mecej4 in Message #20 and report here if it works as shown in that message - if it does work, then it will prove to you the issue is not the 2016 version of Intel Fortran compiler or the Fortran code per se but likely your project and DLL setup etc.

mecej4 · ‎01-31-2017

fourreal wrote:
I don't think a reproducer will reveal anything except that the function is trivial.

Then it is not a reproducer, by definition. Creating a reproducer can be quite difficult, especially if an optimizer bug is involved. At any rate, the suggestion to create a reproducer is not one to be dismissed glibly.

The function may be trivial, and so may the caller. Nevertheless, it can be the combination of caller and callee, plus a possible compiler bug, that caused an error.

andrew_4619 · ‎02-01-2017

The logical step is to establish that a simple case like #21 works with the same build options you use. Then take your application and pare away code until you are left with a small program that still fails; that is a reproducer. [Alternatively, if you can share the entire existing code with some build instructions and input data, someone else can look at it.] What is very often found is that at some point in the paring-away process it starts to work OK! This then gives some clues to a problem in some other part of the code. It can be hard work!
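The paring process can be thought of as a simple loop: remove a piece, rebuild, and keep the removal only if the failure is still there. Here is a minimal sketch of that loop in Python, purely as an illustration. The names pare_down and still_fails are hypothetical; in practice still_fails would mean "build and run with the same options and see whether the exception still occurs", which is replaced here by a toy predicate.

```python
from typing import Callable, List

def pare_down(lines: List[str], still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop one piece at a time, keeping a removal only if the failure persists."""
    if not still_fails(lines):
        raise ValueError("the starting program does not show the failure")
    current = list(lines)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                # The failure survives without this piece, so it is not needed.
                current = candidate
                changed = True
                break
    return current

# Toy stand-in for "compile, link, run, and check for the exception".
def toy_still_fails(lines: List[str]) -> bool:
    return "x is never assigned" in lines and "y = f(x)" in lines

program = ["read input", "x is never assigned", "print banner", "y = f(x)", "write report"]
reproducer = pare_down(program, toy_still_fails)
print(reproducer)
```

If a removal makes the failure disappear, the piece just removed is a clue in its own right, which is the "it starts to work" moment described above.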

fourreal · ‎02-01-2017

I checked the modules debugging pane for any lib*.dll, found nothing, and assumed I wasn't doing it right. I've since checked everything in the modules pane against every DLL I can find under ..\IntelSWTools\. There are no common DLLs listed.

The update version is 2016.4.246. My compiler options are:
/nologo /debug:full /Od /I"Debug/" /I"C:\design\frame\dev\solver\solver_lib\\HeaderFiles\\" /arch:IA32 /fixed /extend_source:132 /warn:truncated_source /integer_size:16 /Qsave /align:commons /Qtrapuv /Qinit:zero /fpe:1 /Qfp-stack-check /module:"Debug/" /object:"Debug\\" /Fd"Debug\vc140.pdb" /traceback /check:stack /libs:static /threads /4Yportlib /c

I'm building for a 32 bit application. I'll have to reformat the build log from html to txt before I'm allowed to upload it.

I'll work on a disassembly before the call. After the paring, I realize the fault isn't in the function but I wasn't aware that the exception may have occurred already without an interrupt. I'll first spend some time picking through the code ahead of the call in search of a boil to lance.

mecej4 · ‎02-01-2017

fourreal wrote:
I'll have to reformat the build log from html to txt before I'm allowed to upload it.

You can place files of any type into a Zip file and attach the Zip file in your response.

You use the options /Qsave /align:commons /Qtrapuv /Qinit:zero. If you are debugging code, especially code written in recent decades, /Qsave and /Qinit:zero should not be used -- they hide lots of bugs. Furthermore, after you use these options, /Qtrapuv will trap nothing that I can think of, since you have replaced uninitialized variables, if any, by zero.

The net effect of using bug-hiding options such as these with buggy code is to make finding the bugs much harder, and possibly to hide the bugs for decades if these options are also used in compiling production code.

fourreal · ‎02-01-2017

Bingo: >>> /Qtrapuv <<<

Problem solved. Thanks

Steve_Lionel · ‎02-01-2017

/Qtrapuv is just hiding the problem of an uninitialized variable. It doesn't trap anything, but it does change the processor flags so that floating invalids may go undetected, and it sets floating variables to an "unusual" value, but not a NaN. You still have a program bug (most likely) but now it's hidden.

I would try /Qinit:snan /Qinit:arrays instead.

fourreal · ‎02-01-2017

Steve Lionel: /Qtrapuv was set, now it's not. By not hiding problems, I have no problems. Am I misunderstanding you?

fourreal · ‎02-01-2017

Note that I took /Qtrapuv out of the compiler switches.

mecej4 · ‎02-01-2017

fourreal wrote:
Am I misunderstanding you?

Yes, you do not seem to realize that options such as /Qzero, /Qsave, etc., which exist to enable old Fortran codes from the mainframe and CVF era to continue to run, are options that, when used without good reason, simply hide bugs.

Let me illustrate. Suppose you have a calculation in which you have a variable, c_0, which is supposed to be set equal to the velocity of light in vacuum, and is used in several places in the program. However, the programmer forgot to initialize c_0. As a result, the output of the program is unpredictable and, in general, garbage. However, unless the results are obviously wrong, the error may go unnoticed for a long time. Now, suppose that the option /Qzero was also used. The calculations are now going to use c_0 = 0 instead of the known value from physics, which is approximately 3E8 m/s. Will the error be noticed? That depends entirely on what the program calculates. It is because of these pitfalls that I recommend that you never use /Qzero unless you are using, without modification, a mainframe era program which was written with the assumption that variables were implicitly initialized to zero.
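To put numbers on the c_0 illustration, here is a small sketch in Python, used only as an analogy and not as Fortran behavior. The function rest_energy and the 1 kg test mass are made up for the example. The point is that a zero-filled c_0 gives an ordinary-looking number, while a NaN-filled c_0 gives a result that is easy to detect.

```python
import math

C_0 = 299_792_458.0  # speed of light in vacuum, m/s (the value the programmer meant to assign)

def rest_energy(mass_kg: float, c_0: float) -> float:
    """E = m * c_0**2, with c_0 passed in so different 'initial' values can be mimicked."""
    return mass_kg * c_0 ** 2

intended = rest_energy(1.0, C_0)         # what the program should compute
zero_filled = rest_energy(1.0, 0.0)      # analogue of a zero-initialized c_0
nan_filled = rest_energy(1.0, math.nan)  # analogue of a NaN-initialized c_0

print(intended, zero_filled, nan_filled)
print("zero-filled result is an ordinary finite number:", math.isfinite(zero_filled))
print("NaN-filled result is detectably invalid:", math.isnan(nan_filled))
```

Whether a result of zero would raise any eyebrows depends, as said above, entirely on what the program calculates.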

Steve_Lionel · ‎02-01-2017

If taking out /Qtrapuv made the error go away, then you almost certainly have an uninitialized variable. For diagnostic purposes, use /Qinit:snan /Qinit:arrays. This will initialize all floating point variables to a "signaling NaN" and should give you an explicit error message when one of these is accessed. You can then trace the flow back to see where the value came from. I don't recommend leaving any kind of /Qinit option on in production code.
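For anyone unsure what separates a "signaling" NaN from an ordinary quiet one: in the IEEE 754 64-bit layout as used on x86, both have all exponent bits set and a nonzero fraction, and the most significant fraction bit is 1 for a quiet NaN and 0 for a signaling NaN. The sketch below is in Python and is illustrative only. The helper names are made up, and nothing in this thread says which exact bit pattern the compiler writes. It works on raw bit patterns rather than on float values, since passing a signaling NaN through ordinary float operations can quiet it.

```python
import struct

EXP_MASK = 0x7FF0_0000_0000_0000   # the 11 exponent bits
FRAC_MASK = 0x000F_FFFF_FFFF_FFFF  # the 52 fraction bits
QUIET_BIT = 0x0008_0000_0000_0000  # most significant fraction bit

def classify(bits: int) -> str:
    """Classify a raw 64-bit IEEE 754 pattern (x86 convention for the quiet bit)."""
    if (bits & EXP_MASK) != EXP_MASK:
        return "finite"
    if (bits & FRAC_MASK) == 0:
        return "infinity"
    return "quiet NaN" if (bits & QUIET_BIT) else "signaling NaN"

def bits_of(x: float) -> int:
    """Return the 64-bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

for label, bits in [
    ("1.0", bits_of(1.0)),
    ("+inf", bits_of(float("inf"))),
    ("0x7FF8000000000000", 0x7FF8_0000_0000_0000),
    ("0x7FF0000000000001", 0x7FF0_0000_0000_0001),
]:
    print(label, "->", classify(bits))
```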