Intel® MPI Library

MPI_Comm_Connect crash - no error

gdirwin
Beginner
We have a client-server application using MPI_Comm_connect and MPI_Comm_accept - it works most of the time, but intermittently crashes... On the server side we call MPI_Open_port, transfer the port name to the client side, then call MPI_Comm_accept. On the client side we receive the port information, then call MPI_Comm_connect.
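
For reference, here is roughly what each side does (a stripped-down sketch only - the real code is Fortran calling C++, and we ship the port string over our own TCP sockets, indicated by the comments):

[plain]
// server side (sketch)
#include <mpi.h>
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    char port_name[MPI_MAX_PORT_NAME];
    MPI_Open_port(MPI_INFO_NULL, port_name);
    // ... send port_name to the client over our own TCP socket ...
    MPI_Comm client;
    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
    // ... communicate over 'client' ...
    MPI_Comm_disconnect(&client);
    MPI_Close_port(port_name);
    MPI_Finalize();
    return 0;
}

// client side (sketch)
#include <mpi.h>
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    char port_name[MPI_MAX_PORT_NAME];
    // ... receive port_name over the TCP socket ...
    MPI_Comm server;
    MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &server);  // the call that sometimes crashes
    // ... communicate over 'server' ...
    MPI_Comm_disconnect(&server);
    MPI_Finalize();
    return 0;
}
[/plain]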

Occasionally we find that MPI_Comm_connect on the client crashes - there is no error provided. We are unable to use try/catch blocks to dig further (a complicated setup with Fortran calling C++ - we get unresolved globals if we try to use exception handling).

We think there may be a race condition due to how the client and server apps are being started (we are not starting them with mpiexec - also a long story). If the client calls connect before the server calls accept, will this cause a crash? If so, has this been fixed (to give an error message, which we could trigger on and retry)? Would setting an MPI error handler help to handle the error in a controlled way?

Note the following thread on the MPICH2 mailing list
http://lists.mcs.anl.gov/pipermail/mpich-discuss/2010-October/008162.html

which refers to this same problem and a fix...

All help appreciated - thanks!
James_T_Intel
Moderator
Hi,

What version of the Intel MPI Library are you using? I am unable to reproduce this behavior in 4.0.3 on Windows* or Linux*. If I have a client program attempt to connect before the port is ready to accept a connection, it simply waits until the server calls MPI_Comm_accept.

Try using MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN) and checking the error code returned by MPI_Comm_connect. You can then use MPI_Error_string to get a string from the error code. Let me know what that string is (or if you can't get it) and I can try to assist further.
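
For example, something along these lines (just a sketch - connect_to_server is a placeholder name, and I am assuming the port string has already arrived on the client side):

[plain]
/* Sketch: return errors from MPI_Comm_connect instead of aborting */
#include <mpi.h>
#include <stdio.h>

MPI_Comm connect_to_server(char *port_name)   /* placeholder helper */
{
    MPI_Comm server = MPI_COMM_NULL;
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
    int err = MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server);
    if (err != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len = 0;
        MPI_Error_string(err, msg, &len);
        printf("MPI_Comm_connect failed: %s\n", msg);
    }
    return server;
}
[/plain]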

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
gdirwin
Beginner
Hi James:

Thanks for your reply... We are using V4.0.3.009 (the latest download from the Intel site) for Windows, 32-bit, on Vista (the same thing happens on Windows 7, though).

We added the set_errhandler call you suggested (although I assumed this was the default) and we are checking the return value from MPI_Comm_connect - however, it never returns; it crashes internally without any error message or return value.

It is difficult for us to add try/catch exception handling around the call (mixed Fortran/C/C++ libraries from many sources give us headaches with linking .lib files).

The behaviour you described (i.e., calling accept/connect in either order, with whichever call comes first waiting for the connection) is perfect. You mentioned you tested this - can you tell us which version of MPI you tested it with?

We are sure the port information is transferred correctly (we transfer it with our own socket connections, as we could not get publish/lookup to work reliably) - it works most of the time. We found that introducing arbitrary sleep delays helps make it more reliable, but that is a hack and not a long-term solution.

Do you have any other suggestions on how we can troubleshoot/fix this?

Much appreciate your help!
Garth
James_T_Intel
Moderator
Hi Garth,

I tested this behavior in the same version (4.0.3.009) on Windows*, and the corresponding version (4.0.3.008) on Linux*. Would you be able to trim your code down to a small reproducer that shows the behavior?

The default error handler causes all processes to end when one process fails. Generally, you should be shown the error message from the failing process if there is one.

I would recommend running with I_MPI_DEBUG=5 and/or -verbose (an argument to mpiexec) to get more information.
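
For example (server.exe and client.exe are placeholder names for your executables):

[plain]
mpiexec -verbose -genv I_MPI_DEBUG 5 -n 1 server.exe
mpiexec -verbose -genv I_MPI_DEBUG 5 -n 1 client.exe
[/plain]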

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
gdirwin
Beginner
Thanks James - I will see if we can simplify this down to a simple C++ app for testing.

We do not see any error messages on either side - the process calling connect crashes internally, so we cannot debug or see what is happening.

Regarding running with I_MPI_DEBUG - we do not start the applications with mpiexec; we start each process ourselves, communicate the port info via our own TCP sockets, then call connect/accept... Is there any other way of getting some debug information for processes that are started without using mpiexec?

Thanks!
James_T_Intel
Moderator
Hi Garth,

Is there any way you can use mpiexec for launching? We do not support running outside of mpiexec, and that could be the source of your errors. I have been able to successfully run my simple program without using mpiexec; however, this limits the capabilities available to the program.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
gdirwin
Beginner
Hi James:

We developed a test program (still somewhat complicated, with Fortran, C and C++), disabled our own auto-launching routines, and tried mpiexec... Each program starts, but then the Master side hangs when it tries to call open_port. (That said, open_port, connect and accept are not necessary if everything is started from mpiexec...)

We absolutely need to run without mpiexec - this is a distributed client-server app where each program can be started by another, or can be launched from a GUI (with which it communicates via TCP/IP sockets). There can be numerous programs running, each on a different computer, on different/long pathnames, with different starting arguments (the arguments are often port numbers allocated dynamically by another process after it has started). We switched to Intel MPI (instead of MSMPI) because it supports MPI_Comm_connect and MPI_Comm_accept (why else would these routines be needed, if you could start everything from mpiexec?).


Can you expand on what you mean by "limits the capabilities available to the program"? Is there any other way of seeing where it is crashing?

Again - greatly appreciate the help...
James_T_Intel
Moderator
Hi Garth,

The MPI_Comm_accept and MPI_Comm_connect routines can be used to join processes that were not started at the same time. For example, the test I am currently using for this capability has two programs, a server and a client. The server opens a port, writes the port name to a file, prompts the user for an integer, and waits for a connection. When that connection is made, it sends the integer value to the new process and then exits.

The client program opens the file with the port name, connects to the specified port, receives the integer, and then broadcasts it to all ranks of the client program. Each rank then displays a check value to confirm that everything worked correctly, and all processes exit.
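
In rough outline, the test looks like this (a trimmed sketch, not the exact code I am running):

[plain]
/* server.c (sketch): open a port, publish it via a file, accept, send one int */
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    char port[MPI_MAX_PORT_NAME];
    MPI_Open_port(MPI_INFO_NULL, port);
    FILE* f = fopen("port.txt", "w");
    fprintf(f, "%s\n", port);
    fclose(f);
    int value;
    printf("Enter an integer: ");
    scanf("%d", &value);
    MPI_Comm client;
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
    MPI_Send(&value, 1, MPI_INT, 0, 0, client);
    MPI_Comm_disconnect(&client);
    MPI_Close_port(port);
    MPI_Finalize();
    return 0;
}

/* client.c (sketch): read the port name, connect, receive, broadcast, check */
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, value = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    char port[MPI_MAX_PORT_NAME];
    if (rank == 0) {              /* port name only needs to be valid at the root */
        FILE* f = fopen("port.txt", "r");
        fscanf(f, "%s", port);
        fclose(f);
    }
    MPI_Comm server;
    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server);
    if (rank == 0)
        MPI_Recv(&value, 1, MPI_INT, 0, 0, server, MPI_STATUS_IGNORE);
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Rank %d received check value %d\n", rank, value);
    MPI_Comm_disconnect(&server);
    MPI_Finalize();
    return 0;
}
[/plain]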

By "limits the capabilities available to the program", there are several factors to consider. First, you can only start one process at a time without mpiexec. Second, there is no communication between the processes, mpiexec sets up the communication layers. There are environment variables set by mpiexec that are used internally by the Intel MPI Library to improvecommunication performance. Some MPI functions will not work correctly outside of mpiexec.

As an option for running programs with varying options, have you considered running with a configuration file?

[plain]mpiexec -configfile config.txt[/plain]
With config.txt containing something like:

[plain]-host host1 -n 1 a.exe
-host host2 -n 1 b.exe[/plain]
This will allow you to run a heterogeneous set of programs and arguments within one MPI job.

Also, I_MPI_DEBUG will work outside of mpiexec; it is simply an environment variable that is read at runtime.
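
For a manually started process on Windows*, setting it in the environment of whatever launches the process is enough, for example (server.exe is a placeholder name):

[plain]
rem set before launching the process directly
set I_MPI_DEBUG=5
server.exe > server_debug.log 2>&1
[/plain]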

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
gdirwin
Beginner
Hi James:

After much debugging, we simplified our program down to a single, simple .cpp file, and we think we have found a bug/limitation in using connect/accept (i.e., starting the processes manually without using mpiexec).

The simplest application we could find consists of 3 programs (P1, P2 and P3) - we want P1 to establish a comm link to P2 and a second comm link from P1 to P3:
P1.exe:
- server for comm link 1 (calls open_port and accept)
- client for comm link 2 (calls connect)
P2.exe:
- client for comm link 1 (calls connect)
P3.exe:
- server for comm link 2 (calls open_port and accept)

With this configuration, the first comm link connects (i.e., P1 and P2), but P1.exe crashes when it calls MPI_Comm_connect (even though P3.exe has already started and is waiting in open_port/accept).
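
In pseudocode, the call sequence in P1.exe that triggers the crash looks like this (simplified, declarations and error checks omitted):

[plain]
/* P1.exe (sketch): server for comm link 1, then client for comm link 2 */
MPI_Open_port(MPI_INFO_NULL, port1);
/* ... send port1 to P2 over our own TCP socket ... */
MPI_Comm_accept(port1, MPI_INFO_NULL, 0, MPI_COMM_SELF, &comm_p2);   /* works: P1 <-> P2 connected */
/* ... receive port2 (already opened by P3) over our own TCP socket ... */
MPI_Comm_connect(port2, MPI_INFO_NULL, 0, MPI_COMM_SELF, &comm_p3);  /* crashes here */
[/plain]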

The debugger identifies the crash point - our code calls MPI_Comm_connect (in P1.exe), and the crash is in write.c (from directory C:\Program Files\Microsoft Visual Studio 10.0\VC\crt\src), in routine int __cdecl _write(), at line 83 (_unlock_fh(fh)), where fh is null.

If we modify P1.exe so it is the server (calling open_port and accept) for both comm links, and P2.exe and P3.exe are always clients (calling connect), then it works correctly.

Our application needs to run without using mpiexec, and has N processes which may have communication links to any or all of the other processes - it is not possible to configure it so that one process is always a server (sometimes it must be a client too), so we are stuck.

We would greatly appreciate your input here - we can privately send you the code which crashes as well...

gdirwin
Beginner
I was able to verify that the code (which crashes on Intel MPI) works correctly with another MPI installation on Windows... Please let me know if you are able to debug what is happening and find a fix (again, let me know if you want me to send you the simple code example).

Thanks!
James_T_Intel
Moderator
Hi Garth,

Would you be able to provide the reproducer code that is causing the issue? You can either post it here or send it to the email address listed in my profile. In the meantime, I will try to put together something mimicking the behavior you described.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
gdirwin
Beginner
Thought I would update progress on this - we sent a test case with source code to Intel who are looking at the problem...

The problem appears to be limited to using multiple MPI_Comm_connect and MPI_Comm_accept calls without starting the processes using mpiexec...
James_T_Intel
Moderator
Hi Garth,

Not quite. The problem also appears when using mpiexec.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
gdirwin
Beginner
Hi James:

We are really desperate to use Intel MPI for our application, yet the crash is stopping us cold with no workaround...

Is anything being done? Earlier you mentioned a higher level of support (Premier) - would this help?

What can be done to get us a fix soon?
James_T_Intel
Moderator
Hi Garth,

Our developers are working on this problem. They are planning to have a fix implemented in our next release, which should happen within the next few months. I mentioned Premier as a secure means of getting your files to me. Generally, submitting issues is best done through Premier, but that will not change the release process.

If a fix is implemented before release, it can sometimes be released early. Have you tried adding the send/receive pair I mentioned in a previous post? That allowed the reproducer to work as expected. While not at all perfect, it could at least allow you to proceed for now.
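
As a rough sketch of that kind of workaround (I am paraphrasing here - newcomm stands for the intercommunicator returned by accept/connect, and the tag and payload are arbitrary):

[plain]
/* sketch: exchange a dummy message right after the intercommunicator is created */
int dummy = 0;

/* server side, after MPI_Comm_accept returns newcomm */
MPI_Send(&dummy, 1, MPI_INT, 0, 99, newcomm);

/* client side, after MPI_Comm_connect returns newcomm */
MPI_Recv(&dummy, 1, MPI_INT, 0, 99, newcomm, MPI_STATUS_IGNORE);
[/plain]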

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
gdirwin
Beginner
Hi James:

We also noticed the sensitivity of the problem to random send/recv calls - this fixes the reproducer, but it is not reliable for our bigger app (which can create dozens of connections between any configuration of cores/computers on a LAN).

If it helps, the reproducer case works with Deino MPI, but fails with MPICH2.

Can we get a beta version sooner (so we don't have to wait another few months)? We have been stuck on this for some time now, and another few months would be a killer...
James_T_Intel
Moderator
Hi Garth,

Regarding other MPI implementations: each implementation is unique. We cannot use code from another implementation (unless appropriate license agreements are in place). So even if a particular capability works in another implementation, we must develop a fix on our own.

In order to receive a pre-release version (outside of official Beta versions), you would need to have a non-disclosure agreement in place with us first, then the developers can, at their choice, release an engineering build to you.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
gdirwin
Beginner
Hi James - we would be willing to sign a reasonable NDA if it gets us a fix ASAP - please let us know if this is worth pursuing, and what the next steps are.

Thanks!
James_T_Intel
Moderator
Hi Garth,

What is your company? I'll need to start by checking if there is an NDA in place already, and if not, I'll go from there.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools