My program has now taken to stopping (without completion) without an error message. I have added some tell-tales to the code to try to locate the point of failure, but the failure moves around as I add tell-tales.
I am debugging my OMP implementation, but am setting max processors to one. I got error messages for the previous bug (record number out of range) but not for this current problem.
Is there any way to force the output of the trace stack?
Does the lack of an error message suggest anything about the nature (or location) of the problem?
Well, I hadn't thought of stack size as a possibility. I had set stack reserve to 6,000,000, but had not specified a value for stack commit. Setting them both to 6,000,000 made no difference, however. Neither did setting them both to 60,000,000. For that matter, I get the same results after setting them both to 6.
Ditto for using kmp_set_stacksize_s to set the individual thread stack size to 16000K.
If I understand the (frankly murky) documentation, since most of my thread-specific data is defined in threadprivate common blocks, that wouldn't be placed on the stack anyway. Or should I be setting some reserve values for the heap?
I don't understand your suggestion about linking with the DLL libraries -- which libraries and specified exactly where?
Alternatively, try adding this under Fortran > Command line > Additional Options: /Qopenmp-link:static
I'm just grasping at straws here. But I have seen programs "just exit" when the stack has been severely corrupted.
Well, changing to multithreaded DLL certainly provoked a change: "The system cannot execute the specified program." Oh my.
And the link:static alternative returned the program to its previous error pattern.
Steve,
The program has many arrays in Common, but I don't know as you'd consider them large.
I have compiled without the OMP translation, and I'm getting error messages, although they still seem a bit squirrely. I think part of what is happening is that the program is more sensitive to memory allocations than before. (Heck, even now the program does no runtime memory allocation -- it's rather staid in that way.)
But I did trace one FP overflow error. In the routines lying within the OMP region, I had dutifully moved all locally preset data to shared common areas where the values would be available to all threads. It turns out that in the original implementation, one of the variables was spelled differently (DRIVDEN vs. DRIVEDEN) in its declaration and in the body of the procedure. This worked before because the compiler assumed implicit declaration, local variables were SAVEd, and everything was jake. Now, even with OMP not active, the compiler is no longer instructed to SAVE locals, and the value on entry to the procedure was, eventually, big enough to produce an overflow. At least that's what I'm telling myself.
But I'm still getting the curious effect where adding tell-tales changes the results, usually shooting right by previous error points.
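For anyone who lands here with the same kind of bug, a minimal sketch of how IMPLICIT NONE would catch a DRIVDEN/DRIVEDEN mismatch at compile time. Only those two names come from my code; the routine name, the argument, and the values are made up for illustration.

```fortran
! Hypothetical sketch: only the DRIVDEN/DRIVEDEN names come from the post;
! update_density, step and the numeric values are invented for illustration.
subroutine update_density(step, result)
  implicit none               ! undeclared names become compile-time errors
  real, intent(in)  :: step
  real, intent(out) :: result
  real :: drivden             ! automatic local: undefined on entry unless SAVEd

  drivden = 1.0               ! set explicitly on every entry
  ! Writing DRIVEDEN below instead of DRIVDEN would now be rejected by the
  ! compiler instead of silently creating a second, undefined variable.
  result = drivden + step
end subroutine update_density

program demo
  implicit none
  real :: r
  call update_density(0.5, r)
  print *, 'result =', r
end program demo
```

As far as I know, the Intel compiler option /warn:declarations gives the same check without editing every routine.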
But a question: do I understand correctly that each time I enter a routine, the local variables allocated to the stack will have different values? (In one case, the errmsg identifies an invalid FP at a tell-tale where I'm printing out the value of a variable that hadn't yet been set by the routine.)
Well, I've spent the interim cleaning up a few uninitialized variables in builds with the OMP translation disabled. (It occurred to me that I should ensure that the changes to data and control mechanisms hadn't been the problem.)
Compiling with the OMP enabled, the program runs with a single thread. (Number of threads is controlled by a run-time input to the program.) Running with two threads, the program gets barely started before it dies -- again without an error message. The stack settings were: reserve = 6000000; commit = 0; kmp_set_stacksize_s(16000).
I increased kmp_set_stacksize_s to (16000000) and Shazam! I've got error messages again. In fact, "(157) Program Exception - access violation" runs right off the top of the window! These messages are followed by a single "(170) Program Exception - stack overflow" statement.
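For reference, a minimal sketch of the call as I now understand it. It assumes Intel's omp_lib module, which declares the kmp_* extensions and the kmp_size_t_kind kind parameter, so it will not build with other compilers. My reading of the Intel documentation is that the argument is in bytes, which would make my earlier 16000 a 16 KB stack rather than 16000K. The program and variable names are placeholders.

```fortran
! Sketch only: requires the Intel compiler's omp_lib (kmp_* extensions).
! set_stack and stack_bytes are placeholder names.
program set_stack
  use omp_lib
  implicit none
  integer(kind=kmp_size_t_kind) :: stack_bytes

  stack_bytes = 16000000_kmp_size_t_kind   ! bytes, not kilobytes
  ! Per my reading of the docs, this has to be called before the first
  ! parallel region is executed or it has no effect.
  call kmp_set_stacksize_s(stack_bytes)

  !$omp parallel
  print *, 'hello from thread', omp_get_thread_num()
  !$omp end parallel
end program set_stack
```

The same value can reportedly be supplied through the KMP_STACKSIZE environment variable instead of the call.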
Reading the tattletales, I find that thread0 died shortly after thread1 was initialized. Thread0 put out a standard tt message, thread1 did the same, and the next message from thread0 indicates that a crucial variable (located in threadprivate common) is set at the bad value of zero. In getting to this error message, thread0 passed over a tattletale message conditioned on the same flag (also in a threadprivate common) that prompted the previous tt msg. The bottom line here being that thread0 lost its connection to its threadprivate data.
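To make the setup concrete, here is a stripped-down sketch of how a threadprivate common block is declared and seeded. It is not my actual code: the block name /tpblk/ and the variables crucial and ttflag are stand-ins. It shows one thing worth ruling out, namely that copies of the block on threads other than the master are undefined on entry to the parallel region unless COPYIN is used. It does not by itself explain thread0 losing its values.

```fortran
! Illustrative sketch: /tpblk/, crucial and ttflag are hypothetical names.
program tp_demo
  use omp_lib
  implicit none
  integer :: crucial
  logical :: ttflag
  common /tpblk/ crucial, ttflag
  ! This directive has to be repeated in every scoping unit that declares
  ! the common block, after the COMMON statement.
  !$omp threadprivate(/tpblk/)

  crucial = 42          ! sets the master thread's copy only
  ttflag  = .true.

  ! copyin broadcasts the master thread's copy to every thread's copy
  !$omp parallel num_threads(2) copyin(/tpblk/)
  print *, 'thread', omp_get_thread_num(), 'crucial =', crucial, 'flag =', ttflag
  !$omp end parallel
end program tp_demo
```

A routine that declares the common block but omits the THREADPRIVATE directive would refer to different storage, which is another thing I can check on my side.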
So, is this a step forward, or a step back? And where do I step next?
I suppose that the stack overflow message came after the earlier access violations. In the hope that the first message might have more specific data (fault address), how do I set up the runtime environment to copy the screen output to a file? (Yes, I've seen the documentation on qdiag, but it seems to be for compiler diagnostic messages.)
I imagine that perhaps the stack overflow message might be generated in response to the error handling. Is that a reasonable guess?
And are there any glaring problems with the stack settings I'm using -- coming long ago from a mainframe background, I've never been comfortable with stack architecture. Should I try and stuff stuff on the heap?