Dear all,
I launch a parallel program with Intel MPI 4.1.3.047, using 240 processes on 10 compute nodes. Every time I launch the program, the executables start on every node after 2-3 seconds. However, the CPU usage of every process is 0% and the program appears to be waiting for something. The wait can last up to 10 minutes. To locate where it waits, I added the following test code to my program:
program test
  ! define variables
  write(*,*)1
  call MPI_Init ( ierr )
  write(*,*)2
  comm = MPI_COMM_WORLD
  call MPI_COMM_SIZE (comm, mysize, ierr)
  write(*,*)3
  call MPI_COMM_RANK (comm, myid, ierr)
  if(myid==0)write(*,*)4
  ...
end
The number 1 is printed quickly (about 2-3 seconds after launching the program). However, it then takes about 10 minutes before the number 2 is printed. So my question is: what causes MPI_Init to take so long?
Thanks,
Zhanghong Tang
Dear all,
I am still trying to solve this problem. I forgot to mention that I have created a domain in my cluster that includes 10 nodes: N01, N02, ..., N10. The IP addresses of these nodes are:
10.0.0.5
10.0.0.2
10.0.0.3
10.0.0.4
10.0.0.1
10.0.0.6
10.0.0.7
10.0.0.8
10.0.0.9
10.0.0.10
I installed Windows 2012 HPC on N05 (10.0.0.1) and N06 (10.0.0.6), and N05 is the head node. With further testing I found that if I launch the processes without N05, i.e., without the head node, they start very quickly (about 3 seconds after I enter the command line). But if the head node is included, the wait is more than 10 minutes. What could cause this problem?
Thanks
Hi,
You are using a fairly old version of the Intel MPI Library (4.1.3). Is it possible for you to switch to the latest one?
The delay may be caused by specific network settings. Check the connections between the compute nodes (for example, with the ping utility).
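As a rough illustration of such a check, here is a minimal Python sketch that times name resolution and a TCP connection to each node, which can show whether one host (for example the head node) resolves or answers much more slowly than the others. It is not part of Intel MPI and does not replace ping. The node names N01 to N10 come from this thread. The helper names `time_lookup`, `time_connect`, `check_nodes` and `report` are made up for this example, and the port you probe is an assumption you have to choose yourself (any TCP port known to be open on the nodes).

```python
import socket
import time

# Node names as described in this thread (N01 ... N10).
# IP addresses such as "10.0.0.1" can be used instead.
NODES = ["N%02d" % i for i in range(1, 11)]


def time_lookup(host):
    """Return (seconds, error) for resolving host; error is None on success."""
    start = time.perf_counter()
    try:
        socket.getaddrinfo(host, None)
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = str(exc)
    return time.perf_counter() - start, error


def time_connect(host, port, timeout=5.0):
    """Return (seconds, error) for opening a TCP connection to host:port."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            error = None
    except OSError as exc:  # covers refused connections and timeouts
        error = str(exc)
    return time.perf_counter() - start, error


def check_nodes(nodes, port, timeout=5.0):
    """Collect lookup and connect timings for every node in the list."""
    results = []
    for host in nodes:
        lookup_s, lookup_err = time_lookup(host)
        connect_s, connect_err = time_connect(host, port, timeout)
        results.append((host, lookup_s, lookup_err, connect_s, connect_err))
    return results


def report(nodes, port, timeout=5.0):
    """Print one line per node with both timings and any error text."""
    for host, lookup_s, lookup_err, connect_s, connect_err in check_nodes(
        nodes, port, timeout
    ):
        print(
            "%-6s lookup %.3fs (%s)  connect %.3fs (%s)"
            % (host, lookup_s, lookup_err or "ok", connect_s, connect_err or "ok")
        )
```

Calling `report(NODES, port)` from each node in turn, with `port` set to a TCP port that is open on your nodes, gives a quick comparison. A lookup or connect time of several seconds for one host only would point to a name resolution or routing problem on that host rather than to the MPI library itself.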
Dear Artem,
Thank you very much for your kind reply. I have tested many times and found that the latest version doesn't work for me; only version 4.1.3.047 does. See here:
https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/644828
Could you please give more details on how to check the connections?
Thanks