Hello. I have been on a wild goose chase trying to track down a multiplicity of mutating errors in my code using MPICH2 on a quad-core Intel box. These are Heisenbugs with a vengeance. I see things like access violations, stack overflows, array indices out of range, and side effects from subroutine calls changing the values of local variables. I'm using allocatable arrays quite widely, but as far as I can tell making these static doesn't improve matters, although it changes the symptoms unpredictably. I get different results with debug and release versions, and changing the compiler optimization switch also often leads to different results. At times I've thought all was working correctly, but then adding or removing a write statement showed me I was wrong.
In 99.99% of previous situations like this the fault has turned out to be mine, but I've been over my code exhaustively and I'm starting to wonder if it could be a problem with MPICH. I'd like to know of anyone else's experience with MPICH2. Failing that, any suggestions about how I might track down the source of these problems? I have a sneaking suspicion that it could be stack-related, but activating the runtime stack checking doesn't show anything. Thanks.
Gib
One of the issues with MPI is that the arrays or variables you pass may be read or written after the call returns. In such cases, at present, you must declare those arrays VOLATILE. There is a proposal in progress to enhance the Fortran standard so that ASYNCHRONOUS applies better to such calls, but right now it has some limitations.
See if adding VOLATILE to arrays you pass to MPI send and receive calls helps.
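A minimal sketch of what that might look like, assuming a non-blocking MPI_ISEND where the buffer is still in flight after the call returns; the routine and variable names (send_field, buf, nvals, dest, tag) are made up for illustration, not taken from Gib's code:

```fortran
subroutine send_field(buf, nvals, dest, tag)
    use mpi
    implicit none
    integer, intent(in) :: nvals, dest, tag
    ! VOLATILE tells the compiler the array may be read or changed outside
    ! its view, so it must not cache it in registers or reorder accesses
    ! around the MPI calls.
    double precision, volatile :: buf(nvals)
    integer :: req, ierr, stat(MPI_STATUS_SIZE)

    call MPI_Isend(buf, nvals, MPI_DOUBLE_PRECISION, dest, tag, &
                   MPI_COMM_WORLD, req, ierr)
    ! ... other work; MPI may still be reading buf here ...
    call MPI_Wait(req, stat, ierr)   ! only after this is buf safe to reuse
end subroutine send_field
```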
That sounds very promising, Steve, since it has occurred to me that slowing execution down with write statements seems to help. But this information surprises me greatly, since it means that you can't be sure when the data transfer has completed. Isn't this a major problem?
Ah, my unfamiliarity with MPI is showing. Normally, MPI_SEND and MPI_RECV block until complete. But it looks as if a non-blocking send/receive is being proposed.
In any event, your symptoms sound like data corruption. Are you sure that the lengths specified on MPI_RECV calls are correct? You may want to download a trial version of Intel Trace Analyzer and Collector to see if it can spot errors.
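As a rough illustration of that check, assuming the usual blocking MPI_RECV (the names recv_field, buf, maxvals, source and tag are made up): the count argument should be the real capacity of the receive buffer, and MPI_GET_COUNT reports how many elements actually arrived. A count larger than the buffer lets a long message write past its end, which would produce exactly this kind of corruption.

```fortran
subroutine recv_field(buf, maxvals, source, tag)
    use mpi
    implicit none
    integer, intent(in)           :: maxvals, source, tag
    double precision, intent(out) :: buf(maxvals)
    integer :: stat(MPI_STATUS_SIZE), nrecvd, ierr

    ! The count passed here must be the true capacity of buf; if it claims
    ! the buffer is bigger than it really is, a long enough message is
    ! written past the end of buf and corrupts whatever lies next in memory.
    call MPI_Recv(buf, maxvals, MPI_DOUBLE_PRECISION, source, tag, &
                  MPI_COMM_WORLD, stat, ierr)

    ! Report how many elements actually arrived rather than assuming maxvals.
    call MPI_Get_count(stat, MPI_DOUBLE_PRECISION, nrecvd, ierr)
    if (nrecvd /= maxvals) print *, 'received ', nrecvd, ' of ', maxvals
end subroutine recv_field
```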
You raise my hopes, only to dash them! I do seem to have some sort of data corruption, but most of the time it isn't picked up by the bounds checking, and the instances that it does detect are spurious.
Thanks for the suggestion of trying the Trace Analyzer; I will do this.
Gib