Hi,
I have a program (myprogram.exe) that I run with Intel MPI through a queue system (Slurm). The program runs perfectly on some nodes and does nothing on the others: no program output and no MPI initialization output. I checked the running processes when launching it on different nodes. On the nodes where it works, I see only one "pmi_proxy" process; on the nodes where it doesn't, I see (ntasks + 1) pmi_proxy processes, where ntasks is the number of tasks passed to mpirun. On the working nodes, the n myprogram.exe processes start once pmi_proxy has started. On the failing nodes, the job hangs after the (ntasks + 1) pmi_proxy processes start.
Any suggestions?
Regards,
Miguel
1 Reply
Hi Miguel,
That sounds like mpirun isn't working correctly with SLURM*. Please run again with
-verbose -genv I_MPI_DEBUG 5
and attach the output as a file (or send it to me directly if you'd prefer it stay private).
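As a rough sketch of where those options go, the launch line inside a Slurm batch script might look like the following. The task count fallback and the log file name are assumptions for illustration, not details from the original post; only myprogram.exe and the two debug options come from this thread.

```shell
#!/bin/sh
#SBATCH --ntasks=4   # assumed task count; use the value from your own job

# SLURM_NTASKS is set by Slurm inside a job; fall back to 4 (an assumption) elsewhere.
NTASKS="${SLURM_NTASKS:-4}"
# Hypothetical log file name; this is the file to attach to the thread.
LOG="mpi_debug.log"

if command -v mpirun >/dev/null 2>&1; then
  # -verbose prints debug information from the launcher;
  # -genv I_MPI_DEBUG 5 sets I_MPI_DEBUG=5 for all ranks so the library
  # prints its own debug information at startup.
  mpirun -verbose -genv I_MPI_DEBUG 5 -n "$NTASKS" ./myprogram.exe > "$LOG" 2>&1
else
  # The Intel MPI environment is probably not loaded in this shell.
  echo "mpirun not found in PATH" > "$LOG"
fi
```

Redirecting both stdout and stderr into one file keeps the launcher messages and the library messages in the order they were printed, which makes a hang during startup easier to locate.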
Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
