Hi All,
I have installed Intel Cluster Studio XE 2012 for Windows (file "w_ics_2012.0.033.exe") using the evaluation license file received from Intel, but I can't evaluate the cluster (distributed) coarray feature of Fortran. In the properties of my new project (right-click --> Properties --> Configuration Properties --> Fortran --> Language --> Enable Coarrays) I don't see the option for Distributed Memory (/Qcoarray:distributed), only "No" and "For Shared Memory (/Qcoarray:shared)", for both the Win32 and x64 solution platforms.
My cluster system consists of 2 computers:
1) Head node: Windows Server 2008 R2 with SP1 + HPC Pack 2008 R2 with SP3 + Visual Studio 2010 with SP1;
2) Workstation node: Windows 7 (x64) with SP1 + HPC Pack 2008 R2 with SP3.
I installed Intel Cluster Studio on the head node, and it was automatically installed on the workstation node as well.
If I insert the /Qcoarray:distributed option manually (right-click --> Properties --> Configuration Properties --> Fortran --> Command Line --> Additional Options: /Qcoarray:distributed), a test program works on the head node only, although the corresponding machines.Windows file (pointed to by the FOR_COARRAY_MACHINEFILE environment (system) variable) has 2 lines with the computer node names.
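The machines.Windows file is simply one node name per line, so for this two-node cluster it contains:
[plain]WinSer2008R2
Win7[/plain]
and FOR_COARRAY_MACHINEFILE is set to the full path of that file.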
The result of the command "clusrun smpd status" is:
----- Summary ----
2 Nodes succeeded
0 Nodes failed
What is wrong and what should I do to see the "/Qcoarray:distributed" option?
If you still need help, please post in the Windows Fortran forum.
Thanks for your advice, but I think my question concerns the Cluster Studio environment/installer and/or the integration of Cluster Studio into Visual Studio rather than the Fortran compiler; the compiler itself is working well.
In addition, regarding the integration: if I assign a file name in the MPI Configuration File option (right-click --> Properties --> Configuration Properties --> Fortran --> Language --> MPI Configuration File), for example MPIConfigFile, the compiler looks for MPIConfigFile\\Node0\CcpSpoolDir\Coar1\x64\Debug\Coar1.exe, where \\Node0\CcpSpoolDir\ is a shared directory on the head node accessible to the workstation node, and \\Node0\CcpSpoolDir\Coar1\x64\Debug\ is the correct path to the executable file (Coar1.exe). The result of compilation: "Can't open config file MPIConfigFile\\Node0\CcpSpoolDir\Coar1\x64\Debug\Coar1.exe: No such file or directory".
I would urge you to try the experiments from the command line as described in the article I pointed to. I'm not sure how thoroughly we tested the VS integration for distributed coarray support. I will ask our developer who worked most with this support to read this thread and see what she can suggest.
Of course I had read that article before I asked my question here (the corresponding link to the article is in the documentation).
Below are the results of some experiments with the "coarray_samples" sample included in the software. WinSer2008R2 is the head node, Win7 is the workstation node. The "Additional Options" for the compiler are: /Qcoarray:distributed /Qcoarray-num-images:8
1) Start from the Visual Studio: Debug --> Start Debugging
Result: task hangs.
Ctrl^C gives:
mpiexec aborting job...
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: Win7: 123
2: WinSer2008R2.mynet.dom: 123
3: Win7: 123
4: WinSer2008R2.mynet.dom: 123
5: Win7: 123
6: WinSer2008R2.mynet.dom: 123
7: Win7: 123
2) Command: mpiexec -host WinSer2008R2 -n 3 -genv FOR_ICAF_STATUS launched -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\coarray_samples\x64\Debug\coarray_samples.exe
Result: OK:
[1#6880:5060@WinSer2008R2] MPI startup(): shm data transfer mode
[2#2040:6184@WinSer2008R2] MPI startup(): shm data transfer mode
[0#5100:7192@WinSer2008R2] MPI startup(): shm data transfer mode
[2#2040:6184@WinSer2008R2] MPI startup(): process is pinned to CPU02 on node WinSer2008R2
[0#5100:7192@WinSer2008R2] MPI startup(): process is pinned to CPU00 on node WinSer2008R2
[1#6880:5060@WinSer2008R2] MPI startup(): process is pinned to CPU01 on node WinSer2008R2
[0#5100:7192@WinSer2008R2] Rank Pid Node name Pin cpu
[0#5100:7192@WinSer2008R2] 0 5100 WinSer2008R2 0
[0#5100:7192@WinSer2008R2] 1 6880 WinSer2008R2 1
[0#5100:7192@WinSer2008R2] 2 2040 WinSer2008R2 2
[0#5100:7192@WinSer2008R2] MPI startup(): I_MPI_DEBUG=+5
[0#5100:7192@WinSer2008R2] MPI startup(): NUMBER_OF_PROCESSORS=4
[0#5100:7192@WinSer2008R2] MPI startup(): PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 23 Stepping 7, GenuineIntel
Hello from image 2 out of 3 total images
Hello from image 1 out of 3 total images
Hello from image 3 out of 3 total images
3) Command: mpiexec -host Win7 -n 3 -genv FOR_ICAF_STATUS launched -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\coarray_samples\x64\Debug\coarray_samples.exe
Result: OK:
[2#14200:10520@Win7] MPI startup(): shm data transfer mode
[0#14816:3836@Win7] MPI startup(): shm data transfer mode
[1#11572:4816@Win7] MPI startup(): shm data transfer mode
[2#14200:10520@Win7] MPI startup(): set domain to {4,5} on node Win7
[0#14816:3836@Win7] MPI startup(): set domain to {0,1} on node Win7
[1#11572:4816@Win7] MPI startup(): set domain to {2,3} on node Win7
[0#14816:3836@Win7] Rank Pid Node name Pin cpu
[0#14816:3836@Win7] 0 14816 Win7 {0,1}
[0#14816:3836@Win7] 1 11572 Win7 {2,3}
[0#14816:3836@Win7] 2 14200 Win7 {4,5}
[0#14816:3836@Win7] MPI startup(): I_MPI_DEBUG=+5
[0#14816:3836@Win7] MPI startup(): NUMBER_OF_PROCESSORS=8
[0#14816:3836@Win7] MPI startup(): PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
Hello from image 2 out of 3 total images
Hello from image 3 out of 3 total images
Hello from image 1 out of 3 total images
4) Command: mpiexec -hosts 2 WinSer2008R2 3 Win7 3 -genv FOR_ICAF_STATUS launched -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\coarray_samples\x64\Debug\coarray_samples.exe
Result: task hangs.
Ctrl^C gives:
[0#7356:5672@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[1#7588:7328@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[2#7752:7672@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[3#15240:15828@Win7] MPI startup(): shm and tcp data transfer modes
[5#13376:14660@Win7] MPI startup(): shm and tcp data transfer modes
[4#13488:13232@Win7] MPI startup(): shm and tcp data transfer modes
mpiexec aborting job...
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: WinSer2008R2.mynet.dom: 123
2: WinSer2008R2.mynet.dom: 123
3: Win7: 123
4: Win7: 123
5: Win7: 123
As it seems from item 4, I have some problems with tcp. What do I need to check and adjust?
Thanks
I just wanted to let you know that Steve pointed me at this thread; I was the lucky developer who hooked up DCAF on Windows.
I have reproduced the situation you've found, and am looking at how to resolve it. Note, I've reproduced it in a straight MPI program; no coarrays to be seen, so that complication is removed.
You did put this in the right forum; there are some really good people here, and actually, you might see a question I post too, looking for help to resolve this.
As an aside, I can use a machinefile if there is only one node in the file; it doesn't have to be the current node, so yeah, I have to agree with you that there is an interesting configuration issue.
By the way, this link (also in this forum) has some interesting info:
http://software.intel.com/en-us/forums/showthread.php?t=81922
I'll post more as I learn more ---
Thanks for using the Windows DCAF -
--Lorri
Hi Lorri,
Thank you for your time.
I have tried a straight MPI program too (although it is not the subject of this thread), test.f90 included in the software, but the result shown below is the same: the two nodes (item 3) do not work together.
1) Command: mpiexec -host WinSer2008R2 -n 3 -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\test\x64\Debug\test.exe
Result: OK:
[1#5168:2528@WinSer2008R2] MPI startup(): shm data transfer mode
[0#5612:2732@WinSer2008R2] MPI startup(): shm data transfer mode
[2#5744:3288@WinSer2008R2] MPI startup(): shm data transfer mode
[2#5744:3288@WinSer2008R2] MPI startup(): Internal info: pinning initialization was done
[0#5612:2732@WinSer2008R2] MPI startup(): Internal info: pinning initialization was done
[1#5168:2528@WinSer2008R2] MPI startup(): Internal info: pinning initialization was done
[0#5612:2732@WinSer2008R2] MPI startup(): Rank Pid Node name Pin cpu
[0#5612:2732@WinSer2008R2] MPI startup(): 0 5612 WinSer2008R2 0
[0#5612:2732@WinSer2008R2] MPI startup(): 1 5168 WinSer2008R2 1
[0#5612:2732@WinSer2008R2] MPI startup(): 2 5744 WinSer2008R2 2
[0#5612:2732@WinSer2008R2] MPI startup(): I_MPI_DEBUG=+5
[0#5612:2732@WinSer2008R2] MPI startup(): I_MPI_PIN_MAPPING=3:0 0,1 1,2 2
[0#5612:2732@WinSer2008R2] MPI startup(): PMI_RANK=0
Hello world: rank 0 of 3 running on
WinSer2008R2.mynet.dom
Hello world: rank 1 of 3 running on
WinSer2008R2.mynet.dom
Hello world: rank 2 of 3 running on
WinSer2008R2.mynet.dom
2) Command: mpiexec -host Win7 -n 3 -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\test\x64\Debug\test.exe
Result: OK:
[1#11724:10472@Win7] MPI startup(): shm data transfer mode
[0#11556:5968@Win7] MPI startup(): shm data transfer mode
[2#9576:1500@Win7] MPI startup(): shm data transfer mode
[1#11724:10472@Win7] MPI startup(): Internal info: pinning initialization was done
[0#11556:5968@Win7] MPI startup(): Internal info: pinning initialization was done
[2#9576:1500@Win7] MPI startup(): Internal info: pinning initialization was done
[0#11556:5968@Win7] MPI startup(): Rank Pid Node name Pin cpu
[0#11556:5968@Win7] MPI startup(): 0 11556 Win7 {0,1}
[0#11556:5968@Win7] MPI startup(): 1 11724 Win7 {2,3}
[0#11556:5968@Win7] MPI startup(): 2 9576 Win7 {4,5}
[0#11556:5968@Win7] MPI startup(): I_MPI_DEBUG=+5
[0#11556:5968@Win7] MPI startup(): I_MPI_PIN_MAPPING=3:0 0,1 2,2 4
[0#11556:5968@Win7] MPI startup(): PMI_RANK=0
Hello world: rank 0 of 3 running on
Win7.mynet.dom
Hello world: rank 1 of 3 running on
Win7.mynet.dom
Hello world: rank 2 of 3 running on
Win7.mynet.dom
3) Command: mpiexec -hosts 2 WinSer2008R2 3 Win7 3 -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\test\x64\Debug\test.exe
Result: task hangs.
Ctrl^C gives:
[2#4792:4804@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[0#3356:3696@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[1#5956:6004@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[4#10340:5736@Win7] MPI startup(): shm and tcp data transfer modes
[5#9228:9816@Win7] MPI startup(): shm and tcp data transfer modes
[3#11112:8964@Win7] MPI startup(): shm and tcp data transfer modes
mpiexec aborting job...
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: WinSer2008R2.mynet.dom: 123
2: WinSer2008R2.mynet.dom: 123
3: Win7: 123
4: Win7: 123
5: Win7: 123
In accordance with the advice in http://software.intel.com/en-us/forums/showthread.php?t=81922 you had referred to, I used -genv I_MPI_PLATFORM 0 and added the DNS suffix to the node names in the mpiexec command; it did not help.
Thanks
I've seen similar behavior while doing some testing for a different issue. Could you try running the following commands?
[plain]mpiexec -genvnone -hosts 2 WinSer2008R2 3 Win7 3 -genv I_MPI_DEBUG +5 hostname
mpiexec -genvnone -hosts 2 WinSer2008R2 3 Win7 3 -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\test\x64\Debug\test.exe[/plain]
Adding -genvnone is a quick check to prevent copying the environment variables from one system to another. If the MPI installations are in different locations on each computer, the environment variables from one will prevent the installation from being located on the other. See the thread http://software.intel.com/en-us/forums/showthread.php?t=85990&o=a&s=lr for more detail on the mismatch.
The first command will just ensure that you can run across multiple hosts simultaneously. The second will ensure that the processes can communicate with each other. Please let me know what happens with these commands.
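As a side check, you can compare the MPI-related environment on both nodes at once with clusrun (used earlier in this thread), for example:
[plain]clusrun set I_MPI[/plain]
which lists the I_MPI* environment variables on each node, making any installation-path mismatch visible.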
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
Hi James,
Thank you for your advice.
The results of the commands:
1. mpiexec -genvnone -hosts 2 WinSer2008R2 3 Win7 3 -genv I_MPI_DEBUG +5 hostname
Result:
WinSer2008R2
WinSer2008R2
WinSer2008R2
Win7
Win7
Win7
2. mpiexec -genvnone -hosts 2 WinSer2008R2 3 Win7 3 -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\test\x64\Debug\test.exe
Result: task hangs.
Ctrl^C gives:
[2#1052:4080@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[0#780:3824@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[1#4788:4988@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[3#12960:13116@Win7] MPI startup(): shm and tcp data transfer modes
[5#3604:9880@Win7] MPI startup(): shm and tcp data transfer modes
[4#12716:10456@Win7] MPI startup(): shm and tcp data transfer modes
mpiexec aborting job...
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: WinSer2008R2.mynet.dom: 123
2: WinSer2008R2.mynet.dom: 123
3: Win7: 123
4: Win7: 123
5: Win7: 123
The MPI location is "c:\Program Files (x86)\Intel\MPI" on each node.
Thanks
I believe that the problem you are experiencing is due to your firewall. As one more check, please allow the program test.exe through your firewall on both computers, and try running the second command again. You can leave off the -genvnone option; it should have no effect here.
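With the native Windows firewall, a rule like the following can be added from an elevated command prompt on each node (a sketch only; adjust the rule name and use the program path where the executable actually resides on that node):
[plain]netsh advfirewall firewall add rule name="MPI test.exe" dir=in action=allow program="c:\Program Files\Microsoft HPC Pack 2008 R2\Data\SpoolDir\test\x64\Debug\test.exe" enable=yes[/plain]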
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
Hi James,
The firewall rule for the program was already enabled on WinSer2008R2. Adding a similar rule on Win7 changed the output but didn't change the result.
Command: mpiexec -hosts 2 WinSer2008R2 3 Win7 3 -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\test\x64\Debug\test.exe
Result: task hangs.
Ctrl^C gives:
[0#5924:6008@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[2#288:1144@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[1#5596:3536@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[4#3768:5408@Win7] MPI startup(): shm and tcp data transfer modes
[5#4036:736@Win7] MPI startup(): shm and tcp data transfer modes
[3#1236:4204@Win7] MPI startup(): shm and tcp data transfer modes
[2#288:1144@WinSer2008R2] MPI startup(): Internal info: pinning initialization was done
[4#3768:5408@Win7] MPI startup(): Internal info: pinning initialization was done
[0#5924:6008@WinSer2008R2] MPI startup(): Internal info: pinning initialization was done
[1#5596:3536@WinSer2008R2] MPI startup(): Internal info: pinning initialization was done
[5#4036:736@Win7] MPI startup(): Internal info: pinning initialization was done
[3#1236:4204@Win7] MPI startup(): Internal info: pinning initialization was done
mpiexec aborting job...
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: WinSer2008R2.mynet.dom: 123
2: WinSer2008R2.mynet.dom: 123
3: Win7: 123
4: Win7: 123
5: Win7: 123
Thanks
Do you also have your firewalls set to allow smpd and mpiexec? Are you using the native Windows* firewall, or a different one?
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
Hi James,
I removed -genv I_MPI_DEBUG +5 from the command and the straight MPI program (test.exe) began to work (without firewall rules for smpd and mpiexec)! Thank you very much for your previous advice about the firewall.
The behavior of the coarray_samples program (see above) changed as well, but one problem remains: the program does not terminate:
1) start from VS: Debug --> Start Debugging
Result: the program prints 8 "Hello" lines (it works!) and hangs on both computers:
Hello from image 3 out of 8 total images
Hello from image 1 out of 8 total images
Hello from image 7 out of 8 total images
Hello from image 5 out of 8 total images
Hello from image 2 out of 8 total images
Hello from image 6 out of 8 total images
Hello from image 8 out of 8 total images
Hello from image 4 out of 8 total images
Ctrl^C gives:
mpiexec aborting job...
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: Win7: 123
2: WinSer2008R2.mynet.dom: 123
3: Win7: 123
4: WinSer2008R2.mynet.dom: 123
5: Win7: 123
6: WinSer2008R2.mynet.dom: 123
7: Win7: 123
2) command: mpiexec -hosts 2 WinSer2008R2 4 Win7 4 -genv FOR_ICAF_STATUS launched \\WinSer2008R2\CcpSpoolDir\coarray_samples\x64\Debug\coarray_samples.exe
Result: the program does its main work (8 "Hello" lines) and hangs on both computers:
Hello from image 3 out of 8 total images
Hello from image 1 out of 8 total images
Hello from image 2 out of 8 total images
Hello from image 7 out of 8 total images
Hello from image 8 out of 8 total images
Hello from image 6 out of 8 total images
Hello from image 5 out of 8 total images
Hello from image 4 out of 8 total images
Ctrl^C gives:
mpiexec aborting job...
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: WinSer2008R2.mynet.dom: 123
2: WinSer2008R2.mynet.dom: 123
3: WinSer2008R2.mynet.dom: 123
4: Win7: 123
5: Win7: 123
6: Win7: 123
7: Win7: 123
3) command: mpiexec -hosts 2 WinSer2008R2 4 Win7 4 -genv FOR_ICAF_STATUS launched -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\coarray_samples\x64\Debug\coarray_samples.exe
Result: the program does its main work (8 "Hello" lines) and hangs on both computers:
[3#6788:6184@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[2#6500:4260@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[0#5300:6652@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[1#5448:5812@WinSer2008R2] MPI startup(): shm and tcp data transfer modes
[5#7796:7544@Win7] MPI startup(): shm and tcp data transfer modes
[7#7920:7668@Win7] MPI startup(): shm and tcp data transfer modes
[6#3568:6996@Win7] MPI startup(): shm and tcp data transfer modes
[4#7816:7808@Win7] MPI startup(): shm and tcp data transfer modes
[5#7796:7544@Win7] MPI startup(): set domain to {2,3} on node Win7
[6#3568:6996@Win7] MPI startup(): set domain to {4,5} on node Win7
[3#6788:6184@WinSer2008R2] MPI startup(): process is pinned to CPU03 on node WinSer2008R2
[1#5448:5812@WinSer2008R2] MPI startup(): process is pinned to CPU01 on node WinSer2008R2
[2#6500:4260@WinSer2008R2] MPI startup(): process is pinned to CPU02 on node WinSer2008R2
[0#5300:6652@WinSer2008R2] MPI startup(): process is pinned to CPU00 on node WinSer2008R2
[7#7920:7668@Win7] MPI startup(): set domain to {6,7} on node Win7
[4#7816:7808@Win7] MPI startup(): set domain to {0,1} on node Win7
[0#5300:6652@WinSer2008R2] Rank Pid Node name Pin cpu
[0#5300:6652@WinSer2008R2] 0 5300 WinSer2008R2 0
[0#5300:6652@WinSer2008R2] 1 5448 WinSer2008R2 1
[0#5300:6652@WinSer2008R2] 2 6500 WinSer2008R2 2
[0#5300:6652@WinSer2008R2] 3 6788 WinSer2008R2 3
[0#5300:6652@WinSer2008R2] 4 7816 Win7 {0,1}
[0#5300:6652@WinSer2008R2] 5 7796 Win7 {2,3}
[0#5300:6652@WinSer2008R2] 6 3568 Win7 {4,5}
[0#5300:6652@WinSer2008R2] 7 7920 Win7 {6,7}
[0#5300:6652@WinSer2008R2] MPI startup(): I_MPI_DEBUG=+5
[0#5300:6652@WinSer2008R2] MPI startup(): NUMBER_OF_PROCESSORS=4
[0#5300:6652@WinSer2008R2] MPI startup(): PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 23 Stepping 7, GenuineIntel
Hello from image 1 out of 8 total images
Hello from image 2 out of 8 total images
Hello from image 4 out of 8 total images
Hello from image 3 out of 8 total images
Hello from image 8 out of 8 total images
Hello from image 6 out of 8 total images
Hello from image 5 out of 8 total images
Hello from image 7 out of 8 total images
Ctrl^C gives:
mpiexec aborting job...
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: WinSer2008R2.mynet.dom: 123
2: WinSer2008R2.mynet.dom: 123
3: WinSer2008R2.mynet.dom: 123
4: Win7: 123
5: Win7: 123
6: Win7: 123
7: Win7: 123
The "Allow" firewall rules for smpd and mpiexec do not help. I am using the native Windows firewall.
Please let me know: what else should I do?
Thanks
What happens if you run from the command line without mpiexec? I have not worked with coarrays before, but the sample does not run for me if I use mpiexec; it does run without it. This is only on a single computer; I will try it on multiple computers.
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
Hi James,
I. Below is the result of running \\WinSer2008R2\CcpSpoolDir\coarray_samples\x64\Debug\coarray_samples.exe
from the command line. The program does its main work (8 "Hello" lines) and hangs on both computers:
Hello from image 5 out of 8 total images
Hello from image 3 out of 8 total images
Hello from image 1 out of 8 total images
Hello from image 8 out of 8 total images
Hello from image 2 out of 8 total images
Hello from image 7 out of 8 total images
Hello from image 4 out of 8 total images
Hello from image 6 out of 8 total images
Ctrl^C gives:
mpiexec aborting job...
forrtl: error (200): program aborting due to control-C event
In coarray image 1
Image             PC                Routine   Line      Source
libifcoremdd.dll  00000000100E0407  Unknown   Unknown   Unknown
libifcoremdd.dll  00000000100DA252  Unknown   Unknown   Unknown
libifcoremdd.dll  00000000100C3261  Unknown   Unknown   Unknown
libifcoremdd.dll  0000000010028316  Unknown   Unknown   Unknown
libifcoremdd.dll  000000001003BC54  Unknown   Unknown   Unknown
kernel32.dll      0000000076AA47C3  Unknown   Unknown   Unknown
kernel32.dll      0000000076A6652D  Unknown   Unknown   Unknown
ntdll.dll         0000000076CFC521  Unknown   Unknown   Unknown
c:\Program Files\Microsoft HPC Pack 2008 R2\Data\SpoolDir\coarray_samples\x64\Debug>
job aborted:
rank: node: exit code[: error message]
0: WinSer2008R2.mynet.dom: 123: mpiexec aborting job
1: Win7: 123
2: WinSer2008R2.mynet.dom: 123
3: Win7: 123
4: WinSer2008R2.mynet.dom: 123
5: Win7: 123
6: WinSer2008R2.mynet.dom: 123
7: Win7: 123
II. About the -genv I_MPI_DEBUG +5 option in mpiexec -hosts 2 WinSer2008R2 3 Win7 3 -genv I_MPI_DEBUG +5 \\WinSer2008R2\CcpSpoolDir\test\x64\Debug\test.exe: why does it cause the program to hang on both computers?
Thanks
Setting I_MPI_DEBUG to 5 should not cause a hang. This is possibly indicative of a deeper problem. What are your environment variables (just run set in a command prompt)?
As a side note, I am able to run the coarray sample program on a pair of Windows* 7 virtual machines with no problems. I did have to specifically tell the firewall to allow the coarray program, but with the firewall blocking it the program would hang at start, not at exit.
I have used two different methods for compiling and running the program. The first was
[plain]ifort /Qcoarray=distributed /Qcoarray-num-images=8 hello_image.f90 -o hello_image1.exe
ifort /Qcoarray=distributed /Qcoarray-config-file=cafconfig.txt hello_image.f90 -o hello_image2.exe[/plain]
The file cafconfig.txt contained the following:
[plain]-n 8 -machinefile mpd.hosts hello_image2.exe[/plain]
And mpd.hosts contained the names of the two computers, one per line. FOR_COARRAY_MACHINEFILE was set to point to the mpd.hosts file. Both of these forms ran with no problems. Could you try compiling from the command line (just to be certain there are no stray flags causing a problem from VS)? Either of these methods should lead to the same result.
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
Hi James,
The environment variables are set by c:\Program Files (x86)\Intel\icsxe\2012.0.033\bin\ictvars.bat. I have only added FOR_COARRAY_MACHINEFILE.
Unfortunately, I can't do without VS because "c:\Program Files (x86)\Intel\Composer XE 2011 SP1\bin\intel64\ifort" /Qcoarray=distributed /Qcoarray-num-images=8 hello_image.f90 -o hello_image.exe requires link, which is only in the VS directory. So, I will continue my coarray experiments with VS.
Thank you very much for your help.
Running ictvars.bat should automatically set the Path to include link. If it does not, try running
[bash]C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\amd64\vcvars64.bat[/bash]
Or the equivalent for your desired architecture target. I need to do some more testing, but attempting to compile the coarray sample program in Visual Studio* 2010 with distributed coarrays does not allow me to run across multiple computers at all.
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
My problem with running the executable from Visual Studio* was just that, my problem. The default for the sample is to compile 32-bit, and one of my test systems only had the 64-bit runtime libraries available. Once this was corrected (compiled 64-bit within VS), everything works as expected. So this is not likely to be the cause of what you are experiencing (though it would be prudent to verify that you do have the correct runtime libraries available for each system).
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
Let's take a look at the SMPD now. On each of the computers, run (as Administrator) the command
[plain]smpd -traceon[/plain]
You can name the logfile whatever you want, just make it distinct for each computer. This will turn on SMPD logging. Run the coarray_samples program. When it hangs and you've killed it, run
[plain]smpd -traceoff[/plain]
to turn logging off. Attach the two files and I'll see if there's anything in there that could help diagnose what's happening.
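Concretely, the first command takes the log file name as an argument (a sketch; the path and name shown here are arbitrary):
[plain]smpd -traceon c:\temp\smpd_trace.log[/plain]
Use a different file name on each node so the two logs can be told apart when attached.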
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
Edit note: edited to correct code type in first code section