We've been having persistent 'unable to start' or 'cannot find' errors, even though the executable has been built and is correctly located. The executable will not run standalone; I suppose because of wrong libraries? Sometimes things are OK and then they switch back. When things are not OK, the debug version may run (incorrectly) but the release version will not. The forum seems to indicate this may be a PATH problem. I'm using VS 2015 & the latest Fortran 2019 Update 1. The software runs correctly on a laptop w/ very little on it. I cannot seem to install Fortran on VS 2017 Community, but I don't really care.
Q1: Is there a tool in VS2015 to tell it where to look (like the one in Amplifier)?
Q2: Here is my PATH; as far as I can tell, it seems benign and current.
Path=C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files (x86)\MSBuild\14.0\bin;C:\Program Files (x86)\MSBuild\14.0\bin;C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\;C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64;C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN;C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\Tools;C:\Windows\Microsoft.NET\Framework\v4.0.30319;C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\VCPackages;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files (x86)\Microsoft Visual Studio 14.0\Team Tools\Performance Tools;C:\Program Files (x86)\Windows Kits\10\bin\x86;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.6.1 Tools\;C:\Program Files (x86)\Intel\..\..\intel64\libfabric\bin\utils;C:\Program Files (x86)\Intel\..\..\intel64\libfabric\bin;C:\Program Files (x86)\Intel\..\..\intel64\bin\release;C:\Program Files (x86)\Intel\..\..\intel64\bin;C:\Program Files (x86)\IntelSWTools\Advisor 2019\bin32;C:\Program Files (x86)\IntelSWTools\VTune Amplifier 2019\bin32;C:\Program Files (x86)\IntelSWTools\Inspector 2019\bin32;C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.1.144\windows\mpi\intel64\bin;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.2.185\windows\mpi\intel64\bin;C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.0.124\windows\mpi\intel64\bin;C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017.4.210\windows\mpi\intel64\bin;C:\ProgramData\Oracle\Java\javapath;C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.0.065\windows\mpi\intel64\bin;C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017.2.187\windows\mpi\intel64\bin;C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017.0.109\windows\mpi\intel64\bin;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\redist\intel64_win\mpirt;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\redist\ia32_win\mpirt;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\redist\intel64_win\compiler;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\redist\ia32_win\compiler;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Windows\system32\config\systemprofile\.dnx\bin;C:\Program Files\Microsoft DNX\Dnvm\;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Program Files\SciTools\bin\pc-win64;d:\Program Files (x86)\IDM Computer Solutions\UEStudio;C:\Program Files (x86)\IDM Computer Solutions\UltraCompare;C:\Program Files\Microsoft SQL Server\110\Tools\Binn\;D:\Program Files\PuTTY\;C:\Program Files\dotnet\
The main use of this computer is to compile & run my software in Fortran so I don't mind dumping anything that is causing a problem.
I just spent a week fooling with my code only to discover it is not the problem. Most appreciative of any help in this regard. thanks.
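For anyone wanting to check a PATH like the one above mechanically, here is a minimal Python sketch (not something posted in the thread). It flags repeated entries, entries containing `..` segments, and the distinct `compilers_and_libraries_*` versions present. The function name `audit_path` and the shortened `SAMPLE_PATH` string are assumptions made for illustration; the sample entries themselves are copied from the PATH above.

```python
import re
from collections import Counter

# A shortened sample; the entries are copied from the PATH posted above.
SAMPLE_PATH = (
    r"C:\Program Files (x86)\MSBuild\14.0\bin;"
    r"C:\Program Files (x86)\MSBuild\14.0\bin;"
    r"C:\Program Files (x86)\Intel\..\..\intel64\bin;"
    r"C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2019.1.144\windows\mpi\intel64\bin;"
    r"C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2017.4.210\windows\mpi\intel64\bin"
)


def audit_path(path_value):
    """Return (duplicate entries, entries with '..' segments, compiler versions)."""
    entries = [e.strip() for e in path_value.split(";") if e.strip()]
    # Windows paths are case-insensitive; ignore trailing backslashes when comparing.
    keys = [e.rstrip("\\").lower() for e in entries]
    counts = Counter(keys)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    # An entry is flagged when one of its path segments is exactly "..".
    relative = [e for e in entries if ".." in e.split("\\")]
    versions = sorted(
        set(re.findall(r"compilers_and_libraries_(\d{4}\.\d+\.\d+)", path_value))
    )
    return duplicates, relative, versions


if __name__ == "__main__":
    duplicates, relative, versions = audit_path(SAMPLE_PATH)
    print("duplicates:", duplicates)
    print("entries with '..':", relative)
    print("compiler versions:", versions)
```

On the full PATH above, a check of this kind would list, for instance, the MSBuild and HTML Help Workshop entries as duplicates and show several 2017, 2018 and 2019 compiler directories side by side. Note also that `C:\Program Files (x86)\Intel\..\..\intel64\bin` collapses to `C:\intel64\bin`, which may or may not exist on the machine.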
Steve: you're right, I've mixed two problems in this discussion. I was pursuing the issue of the same code giving two different answers when I was stymied by not being able to run the code at all. The latter problem seems to have abated either because of the path changes or, more likely, because I followed your suggestion and backed off the stack size. This let me return to the basic problem I was pursuing: namely, the identical code and compiler/linker options behave differently depending on what was run previously on the computer. It's about 4000 lines of complicated, mostly OpenMP code, so it is very resistant to being stripped down to a 'reproducible example'. I've tried to cut it apart but most of the code has tight interdependencies. What I do know is that running a two-year old, earlier version of the program from a different directory 'resets' things so that it will produce correct answers for a bit. I have no idea how that could happen & it seems like a MS/Intel issue as my user code should not be able to do what it is apparently doing. I understand that w/o a small example program, y'all are very limited in response but, if I could do that I might not even have to ask for help...or, at least, pose a much more specific question to the appropriate Intel team.
Since OpenMP multiplies code complexity, and sometimes makes debugging tricky, I think it has to be a suspect. Apologies for asking this again - is it possible to limit your program to a single CPU, i.e. to run in serial mode?
That a previously built executable "works" again suggests a coding error revealed by a new compiler with new or different optimizations. Just rearranging the layout of things in memory is enough to reveal or hide such issues.
I agree with gib that you should first see if you can reproduce a problem with a single thread. Intel Inspector XE (30 day free trial available, included in Professional or Cluster editions) has run-time correctness checkers that can point you to possible problems (though with Fortran code I found it also produces a lot of "noise".)
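One way to try the single-thread run suggested above without touching the source is to set the standard OpenMP environment variable `OMP_NUM_THREADS` to 1 before launching the executable. The Python launcher below is only a sketch: `run_with_threads` is a hypothetical helper name, and the setting is overridden if the program itself calls `omp_set_num_threads` or uses a `num_threads` clause.

```python
import os
import subprocess
import sys


def run_with_threads(command, num_threads):
    """Run `command` with OMP_NUM_THREADS set; return (exit code, stdout text)."""
    env = dict(os.environ)  # copy, so the parent environment is left alone
    env["OMP_NUM_THREADS"] = str(num_threads)
    result = subprocess.run(command, env=env, capture_output=True, text=True)
    return result.returncode, result.stdout


if __name__ == "__main__":
    # Stand-in child process that just echoes the variable it was given.
    # Replace the command with the path to the real executable.
    child = [sys.executable, "-c",
             "import os; print(os.environ['OMP_NUM_THREADS'])"]
    code, out = run_with_threads(child, 1)
    print("exit code:", code, "child saw OMP_NUM_THREADS =", out.strip())
```

If the wrong answers still appear with one thread, threading is unlikely to be the sole cause; if they disappear, a race or a missing PRIVATE/FIRSTPRIVATE declaration becomes a stronger suspect.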
Hi Steve. I'm not being clear about the base problem. There is a two-year-old version of the code that does not have the last two years of advances, and a current version that does have these enhancements. There is only one compiler (2019v1) and VS2015. Run new version & it gives wrong answer. Close new version. Open old version that is in its own directory, recompile (no changes) and run. Gives correct answers. Close old version. Open new version in its own directory & it gives the correct answer. Run more and it soon reverts to bad results. Repeat.
Stack and heap randomization. You have out of bounds variable references or other memory errors. Difficult to debug, but the way I would approach it is to selectively disable optimization on some sources until you (one hopes) find the source which, when the settings are changed, triggers the behavior. You may have to run multiple times. Alternatively, instrument the code with writes of intermediate data to a file, but this will perturb the behavior.
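Since the suggestion above involves running multiple times, a small harness can do the repetition and show how many distinct results come back, without adding writes to the program itself. This is a minimal Python sketch, assuming the program prints its answer to standard output; `tally_outputs` is a hypothetical helper name, not part of any tool mentioned in the thread.

```python
import hashlib
import subprocess
import sys
from collections import Counter


def tally_outputs(command, runs=10):
    """Run `command` repeatedly; count each distinct (exit code, stdout digest)."""
    tally = Counter()
    for _ in range(runs):
        result = subprocess.run(command, capture_output=True)
        # A short digest is enough to tell different outputs apart.
        digest = hashlib.sha256(result.stdout).hexdigest()[:12]
        tally[(result.returncode, digest)] += 1
    return tally


if __name__ == "__main__":
    # Stand-in command; replace with the path to the real executable.
    command = [sys.executable, "-c", "print(42)"]
    for (code, digest), count in tally_outputs(command, runs=5).items():
        print(f"exit={code} output={digest} seen {count} time(s)")
```

More than one line of output from the loop means the runs disagreed. Repeating the tally after changing the optimization setting on one source file at a time gives a rough way to see which file changes the behaviour.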
yeah, there are writes cropping up like mushrooms (or uncommented as they are being reactivated). I'm afraid I'll have to learn the VS debugger. The debug version has optimizations turned off.
Bruce Weaver wrote:
... It's about 4000 lines of complicated, mostly OpenMP code, so it is very resistant to being stripped down to a 'reproducible example'. I've tried to cut it apart but most of the code has tight interdependencies. ... I understand that w/o a small example program, y'all are very limited in response
A program of that size is big enough to cause trouble, but not so big as to discourage debugging. Are you willing to share the source code (complete with include/header files, data files, if any, and directions to reproduce the problem)? Just as parallelization of a calculation can speed up the runs, one can "parallelize" the debugging effort, if your "bug description" enlists many helpers.
Bruce Weaver wrote:
Open new version in its own directory & it gives the correct answer. Run more and it soon reverts to bad results. Repeat.
A bug of that nature is very difficult to catch with just a symbolic debugger, for the following reasons:
>> The debug version has optimizations turned off.
That is the default. You should be aware that in the Debug configuration you can specify the optimization level. Remember to disable IPO as this is incompatible with the Debugger (at least through V17.0).
Also, keep in mind that the optimization level can be set at the project level as well as on a file-by-file basis (open Solution Explorer and right-click on a file of interest; you can then change optimizations for that file only).
When debugging multi-threaded code and a break point is reached (e.g. on an assert you insert into your code), you can (in MS VS) use the Threads window to freeze and unfreeze any/all other threads to aid single stepping (as well as for selecting context).
I spent the day trying to work with the VS debugger. Very inconsistent results, so I have now reverted to rebuilding the software from the last version that seems to always work -- about two years old. Adding back in the new code, a piece at a time, is pretty slow as I'm documenting & testing each change. The system still remembers previous issues: I lowered the stack size & OpenMP dropped to a single thread. I put it back as it was but it still stayed as a single thread. Going back to earlier versions & then returning reset the OpenMP functionality. I'm now using RAMMap to empty the memory when I need to recover from changes; so far, it seems to be working. If this solves the persistence problem, I suppose it means that either MS or Intel has some sorta problem with remembering some things they should not. Yes, I clean the solution before rebuilding but that does not seem to help.
mecej4: I greatly appreciate the offer. There is nothing secret about the code but there seem to be several problems running around in it & I'm trying to reduce the issues to a few, repeatable problems. Trying to fix one seems to uncover more gremlins. It is the lack of repeatability that is driving me nuts. I hope to rebuild the code to narrow down the specific issues (other than repeatability).
Bruce Weaver wrote:
... I'm now using RAMMap to empty the memory when I need to recover from changes; so far, it seems to be working. If this solves the persistence problem, I suppose it means that either MS or Intel has some sorta problem with remembering some things they should not.
I cannot agree with that tentative conclusion. It is the job of RAM to remember everything until something else is written into it or power is turned off! It is the user program's responsibility to write to any memory before reading that memory (directly in initializations or by assignment, or through a call to a user or a library routine, or by READ).
From my personal experience, Bruce, every time I thought there was a problem with the compiler it turned out to be my error. Your problem will almost certainly turn out to be a subtle coding error. The hardest to track down are Heisenbugs, which change their behaviour when you try to look at them. These are always a manifestation of memory being accessed by mistake, and usually the cause is remote from the place where the error surfaces.
I like the Heisenbugs name. It has been decades since I had such a bug but it seems to be just that kind. The VS debugger was of no use as it bounced me all over the code. I am currently considering using a NAG compiler, which performs better at error checking, to find the problem(s). The idea is to use two compilers: one for code checking and another for making efficient code (a very big issue for code that takes days to run with 20+ threads). I worry about how easy it will be to move code between the two compilers. Does anyone have experience with this approach? or the NAG compiler?
Bruce Weaver wrote:
I worry about how easy it will be to move code between the two compilers. Does anyone have experience with this approach? or the NAG compiler?
That depends on how the code is written. If it is standard Fortran, all that you have to do is to construct two makefiles, no porting of source code necessary. Furthermore, seemingly innocuous changes to the source code can cause the bugs to hide or disappear, particularly so if the bug in question is in the compiler itself.
Actually gfortran & g95 are not as good at catching errors as Intel Fortran, but all are at about 50%. NAG & Silverfrost ftn95 come in at over 90% in this regard. See https://www.fortran.uk/fortran-compiler-comparisons/win32-fortran-compiler-comparisons-diagnostic-ca...
Intel is particularly bad at detecting uninitialized errors, which this forum thread has already exposed with the test program provided by mecej4: klobber. Perhaps the Intel developers should address this issue. I need to use Intel because of the speed, but I sure wish they would make the debug mode much better at error detection. I've used almost all the compilers on the Polyhedron site at one time or another & I'm fond of the Intel one (& especially Amplifier) but this error detection issue is a problem.
If I could summarise what I have read from all these 56 posts:
There are a number of possibilities for the problems (plural) being reported in this thread.
Problems plural appears to be a key theme.
Two main problems being reported appear to be:
1) the program won't start
2) the program gives inconsistent results or crashes (it must have started?)
For Problem 1:
The likely explanation is that the .exe is using more than 2 GB, although isn't this a problem of a 32-bit .exe?
Or the stack is made too big. Wouldn't this issue explode with OpenMP, which replicates the stack for each thread?
(I'm not sure of the stack size allotted for other threads in ifort?)
A solution approach can be to change large arrays to allocatable and place them in a module. This addresses both large common arrays and large stack demands.
Note also that all OpenMP private arrays will be replicated, which in some cases can have heavy stack demands.
An alternative could be to make a large private array into a (larger!) shared allocatable array, with the last subscript being the thread id. (need 64-bit .exe for this)
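To put rough numbers on the replication point above, here is a back-of-envelope calculation as a minimal Python sketch. The array size (one million double-precision values) and the thread count of 24 are made-up figures for illustration, and `private_array_bytes` is a hypothetical helper name.

```python
def private_array_bytes(elements, bytes_per_element=8, threads=1):
    """Total bytes needed when one private array is replicated once per thread."""
    return elements * bytes_per_element * threads


if __name__ == "__main__":
    elements = 1_000_000  # hypothetical array length
    threads = 24          # hypothetical thread count

    per_thread = private_array_bytes(elements)
    total = private_array_bytes(elements, threads=threads)

    # Decimal megabytes (1 MB = 1,000,000 bytes) to keep the arithmetic obvious.
    print(f"per thread: {per_thread / 1e6:.1f} MB")   # 8.0 MB
    print(f"all threads: {total / 1e6:.1f} MB")       # 192.0 MB
```

A shared allocatable array with the thread id as the last subscript holds the same total number of elements, but it is allocated once rather than once per thread.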
For Problem 2:
The results are not reproducible. This is to be expected with OpenMP.
The suggestions so far are that variables are not being initialised, which is harder to get right with OpenMP if the FIRSTPRIVATE clause is not used correctly.
There has not been a suggestion of a race condition, which is a clear possibility.
Also, make sure all routines are thread-safe, with no local static variables that need to be private.
As has been recommended, first try to "get a single thread version running". This looks to be a good start.
The attempt to change all large arrays to allocatable could also help.
Audit all static arrays and variables for possible multi-thread use, including in all libraries and utility routines. Making module arrays private adds to the complexity.
Using a different compiler with better (even different) error checking can help.
Bruce Weaver wrote:
... Intel is particularly bad at detecting uninitialized errors, .... Perhaps the Intel developers should address this issue. I need to use Intel because of the speed ... but I sure wish they would make the debug mode much better at error detection.
It is certainly feasible to use one compiler for finding and fixing bugs, and another for speeding up the fixed program. Many people use more than one compiler, using each for the task that it is good at.
The problem of not finding the .exe files seems to have been abated by some of the earlier suggestions. It is still occurring on my co-worker's machine so I will have to try to duplicate whatever the fix(es) is/are. The .exe files are only about 0.5 MB. All the large arrays are shared and amount to about 400 MB in the test case but I've used as much as 16 GB of files (on-disk size, about 5 GB of stored numbers) so I'm not sure how they fit in the 2 GB limit. Steve? I guess that the 2 GB limit does not apply to allocatable arrays.
It seems to crash as it is trying to write to disk some of the time. Restarting the program seems to help it many times. As Steve suggested I've lowered the stack size (initially raised to the limit based on previous discussions on the Forum). The data arrays are allocated in modules ...does this really help the stack/heap size issue? I have run as many as 24 threads with large data sets. There are no significant private arrays. The results are extremely statistical and are quite duplicatable w/ OpenMP. I have not checked about threadsafe in a while so I should do that.
There are only two versions of the answer: one that is correct and one which is very similar but is wrong. I think the uninitialized variable/array suggestion is the most likely at the moment.
Another symptom cropped up yesterday. One version, which involves a large array, causes OpenMP to drop down to one (or, weirdly, two) threads. The exact same code employs all the requested threads on the co-worker's computer. I have not had a chance to fully explore that but it gave the wrong version of the answer in the only run we tested. We were busy beating on the code for others of these symptoms that I've been reporting here. My co-worker is using VS and Intel 2013 as opposed to my updated 2019 and VS 2015.