Hello,
I'm experimenting with different job launching methods (http://slurm.schedmd.com/mpi_guide.html#intel_mpi) and getting the following error only when I launch a job with srun (my code works fine with mpirun, mpirun -bootstrap=slurm, and mpiexec.hydra) AND use shm:dapl (it works fine with shm:tcp).
If I launch the job with
setenv I_MPI_PMI_LIBRARY /usr/lib64/libpmi.so
setenv I_MPI_FABRICS shm:dapl
srun -n 2 my_exec
I get
1: [1] trying to free memory block that is currently involved to uncompleted data transfer operation
1: free mem - addr=0x2b7a44547f70 len=1146388320
1: RTC entry - addr=0x2b7a4bc93a00 len=1254064 cnt=1
1: Assertion failed in file ../../i_rtc_cache.c at line 1338: 0
1: internal ABORT - process 1
0: [0] trying to free memory block that is currently involved to uncompleted data transfer operation
0: free mem - addr=0x2ab3a253ff90 len=2723413888
0: RTC entry - addr=0x2ab3a7aada80 len=1182864 cnt=1
0: Assertion failed in file ../../i_rtc_cache.c at line 1338: 0
0: internal ABORT - process 0
This error disappears if I set I_MPI_FABRICS to shm:tcp.
So what's the difference between srun and the other launching methods in this regard? I'd like to know whether this can be caused by a bug in my code (in which case I need to fix it), or whether it is just a configuration issue, so that simply avoiding srun would be sufficient.
Hi Seunghwa,
Thanks for getting in touch. This is more likely a configuration error than an issue with your application, although it's likely your application uses more memory than the defaults allow.
In your case, you're saying that using I_MPI_FABRICS=shm:dapl (which is running over your local InfiniBand software stack, likely OFED) works fine when doing mpirun, mpirun -bootstrap=slurm, and mpiexec.hydra. But doing the same with srun causes the "trying to free memory block" errors you see.
The main difference in all of these cases is the launch mechanism. When using mpirun/mpiexec.hydra, you're relying on the Intel MPI Library to start your job using the underlying SLURM startup method. But in the srun case, you're actually asking SLURM to start your MPI job by pulling in the appropriate Intel MPI libs and scripts. So the issue with srun is that some of the defaults on your system might be set differently as compared to when starting up with mpirun.
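To make that concrete, both of the following should start the same 2-rank job, just through different startup paths (this sketch simply reuses the commands from your post):
# Hydra startup: Intel MPI's mpiexec.hydra launches the ranks, using SLURM only underneath
mpirun -n 2 my_exec
# SLURM startup: srun launches the ranks, and Intel MPI talks to SLURM through the PMI library you point it at
setenv I_MPI_PMI_LIBRARY /usr/lib64/libpmi.so
srun -n 2 my_exec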
Do you know if your memory limits are set appropriately? Check out this forum thread which talks about how to set some of these limits. Furthermore, the same error was resolved here by setting log_num_mtt to 24.
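For example, you could check the locked-memory limit your job steps actually inherit and, if you're on a Mellanox mlx4 stack (that part is an assumption on my end, so adjust for your driver), the current log_num_mtt value:
# locked-memory limit as seen inside a job step launched by srun
srun -n 1 sh -c 'ulimit -l'
# current MTT setting for the mlx4 driver
cat /sys/module/mlx4_core/parameters/log_num_mtt
# to raise it, add this line to a file under /etc/modprobe.d/ and reload the mlx4_core module
options mlx4_core log_num_mtt=24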
I hope this helps. Let me know if updating your settings changes the outcome.
Regards,
~Gergana
Thanks Gergana,
The problem disappeared once I removed the "setenv I_MPI_PMI_LIBRARY /usr/lib64/libpmi.so" line and executed with "srun --mpi=pmi2 ..." instead of "srun ...".
For Open MPI, it seems --mpi=pmi2 should be used if PMI2 support is enabled. Is there something similar for Intel MPI?
"If the pmi2 support is enabled then the command line options '--mpi=pmi2' has to be specified on the srun command line." <= from http://slurm.schedmd.com/mpi_guide.html#open_mpi
I am also encountering another problem.
With srun --mpi=pmi2 on 128 or more nodes (1 MPI process per node; there is no error message up to 64 nodes),
I get "slurmstepd: error: tree_msg_to_stepds: host=g161, rc = 1" in MPI_Init_thread(), but the code seems to work fine. With mpirun or mpiexec, MPI_Init_thread() does not print any error message, but MPI communication is much slower.
Any idea?
Thank you very much!!!
-seunghwa
This turned out to be an issue with the system I am using, and it has now been fixed.
Thanks for the support!