Hi,
Are there any plans to have a version of Intel MPI that has tight integration support for the Sun Grid Engine queuing system, much in the same way as Open MPI supports it now?
Thanks
Rene
Hi Rene,
Yes, we are considering including such functionality in our product.
In the meantime, I can provide you with some current recommendations on how to configure SGE to achieve tight integration with the Intel MPI Library. Just let me know if you are interested.
Best regards,
Andrey
Hi Rene,
As Andrey mentioned, we do have a "manual", in a way, on how to integrate Intel MPI with Sun Grid Engine. The set of instructions is now available online at:
http://software.intel.com/en-us/articles/integrating-intel-mpi-sge/
Let us know if this helps, or if you have any questions or problems.
Regards,
~Gergana
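
For a concrete picture of what such a setup looks like, the broad shape is an SGE job script that derives the MPD host list from the machines SGE allocates and drives the run from inside the job. The following is only a minimal sketch, assuming the MPD-based Intel MPI 3.x tools are on the PATH; the PE name "impi", the application name, and the exact option spelling are placeholders rather than details from the article.

#!/bin/sh
# Minimal sketch of an SGE job script for an MPD-based Intel MPI run.
# Submitted with something like:  qsub -pe impi 8 job.sh   ("impi" is a placeholder PE)
#$ -cwd
#$ -N impi_job

# SGE lists the allocated hosts in $PE_HOSTFILE, one line per host:
#   <hostname> <slots> <queue> <processor-range>
awk '{print $1}' $PE_HOSTFILE > $TMPDIR/mpd.hosts

# Let the Intel MPI mpirun wrapper boot the MPD ring from that host list,
# run the application on all granted slots, and shut the ring down again.
# Exact options can differ between Intel MPI versions.
mpirun -r ssh -f $TMPDIR/mpd.hosts -n $NSLOTS ./my_mpi_app   # placeholder binary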
Quoting - Gergana Slavova (Intel)
Sorry, I was out of town for a few days and am just getting back to this. Thanks, Andrey and Gergana! I will look over the manual instructions, give it a try, and let you know how it goes.
Rene
Quoting - Gergana Slavova (Intel)
Gergana/Andrey,
We followed the directions on the website and set up SGE as you suggested for tight integration with Intel MPI. One of the reasons we want to do this is so that SGE can properly clean up the MPD Python daemons that are left running on servers after a job is deleted or killed.
For example, with Open MPI and SGE tight integration, all Open MPI processes are forked as children of the SGE execd daemon. So when a job is deleted or killed, SGE has full control of the job and can terminate all of its Open MPI children and clean up.
With Intel MPI, here is what I see when I submit a job:
grdadmin 4788 1 4788 4694 0 Mar30 ? 00:02:00 /hpc/SGE/bin/lx24-amd64/sge_execd
root 4789 4788 4788 4694 0 Mar30 ? 00:04:15 /bin/ksh /usr/local/bin/load.sh
grdadmin 16949 4788 16949 4694 0 09:33 ? 00:00:00 sge_shepherd-1712429 -bg
salmr0 17023 16949 17023 17023 1 09:33 ? 00:00:00 -csh /var/spool/SGE/hpcp7781/job_scripts/1712429
salmr0 17127 17023 17023 17023 0 09:33 ? 00:00:00 /bin/sh /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpirun -perhost 1 -env I
salmr0 17174 17127 17023 17023 1 09:33 ? 00:00:00 python /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpiexec -perhost 1 -env
salmr0 17175 17174 17023 17023 1 09:33 ? 00:00:00 [sh]
.
.
.
salmr0 17166 1 17165 17165 0 09:33 ? 00:00:00 python /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpd.py --ncpus=1 --myhost=hpcp7
salmr0 17176 17166 17176 17165 2 09:33 ? 00:00:00 python /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpd.py --ncpus=1 --myhost=hpc
salmr0 17178 17176 17178 17165 87 09:33 ? 00:00:04 /bphpc7/vol0/salmr0/MPI-Bench/bin/x86_64/IMB-MPI1.intelmpi.3.1
As you can see, my MPI job is running as a forked child of sge_execd and is under full SGE control. However, the MPDs that were started are totally independent processes and are not forked children of SGE. The problem comes when I run qdel to delete my job, or kill it while it is running. At that point SGE kills all of its forked children, but it knows nothing about the MPD daemons. As a result, after SGE deletes, kills, and cleans up my job, I still have this running on all of the nodes that ran the MPI job:
salmr0 17166 1 17165 17165 0 09:33 ? 00:00:00 python /hpc/soft/intel/x86_64/ict-3.1.1/impi/3.1/bin64/mpd.py --ncpus=1 --myhost=hpcp7
Each time I submit and delete a job, I get a new Python process like the one above hanging around. Any ideas on how to get the cleanup of the MPDs working properly?
Thanks
Rene
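
Purely as an illustration of one way to contain such orphaned daemons, the sketch below shows a per-job cleanup step (for example an SGE epilog) that shuts the owner's MPD ring down once the job is gone. It assumes Intel MPI's bin64 directory is on the PATH and that the job saved its host list to $TMPDIR/mpd.hosts; both are assumptions, not details from the thread.

#!/bin/sh
# Sketch of a cleanup step (e.g. an SGE epilog) for leftover MPD daemons.

# Ask the user's MPD ring to exit cleanly, if it is still reachable.
mpdallexit 2>/dev/null

# mpdcleanup removes stale mpd daemons and their console sockets on every
# host listed in the file, using the given remote shell.
if [ -f "$TMPDIR/mpd.hosts" ]; then
    mpdcleanup -f "$TMPDIR/mpd.hosts" -r ssh
fi

# Last resort on the local node: kill any remaining mpd.py owned by this user.
pkill -u "$USER" -f mpd.py 2>/dev/null

# Always exit 0 so a missing ring does not put the queue into an error state.
exit 0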
Quoting - salmr0
Did you ever come up with a solution for this?
Quoting - bleedinge
Did you ever come up with a solution for this?
I have the same problem. Is there any solution?
Thanks.
I'm curious to know why Intel based their MPI on MPICH2/MVAPICH2. Why not on Open MPI?
- Sangamesh
Open MPI was not well developed and had not supplanted LAM at the time the decision was made, and it didn't support Windows until recently. Not all subsequent developments were foreseen. Are you suggesting that cooperative development between Open MPI and SGE should have been foreseen? Do you know the future of SGE?
Hello everyone,
I'm hoping this reply will reach everyone subscribed to this thread.
As a first order of business, I would suggest you give the new Intel MPI Library 4.0 a try. It came out last month and includes quite a few major changes. You can download it from the Intel Registration Center if you still have a valid license, or grab an eval copy from intel.com/go/mpi.
Secondly, we have plans to improve our tight integration support with SGE and other schedulers in future releases. So stay tuned.
Regards,
~Gergana
Hi,
Please have a look at:
http://gridengine.sunsource.net/howto/mpich2-integration/mpich2-integration.html
for a tight integration with correct accounting and control of all slave tasks by SGE. The howto was originally written for MPICH2, and since Intel MPI is based on MPICH2, the "mpd startup method" also applies to Intel MPI.
-- Reuti
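
The core idea of that mpd startup method, roughly, is to let the parallel environment itself boot the MPD ring through SGE's qrsh -inherit, so every daemon is a child of sge_execd and is killed together with the job. Below is a very reduced sketch of such a PE definition; the script names and paths are placeholders rather than the howto's actual files, and a real setup needs considerably more.

# Reduced sketch of a parallel environment for MPD-based tight integration,
# edited with:  qconf -ap impi_mpd       (PE name is a placeholder)
pe_name            impi_mpd
slots              999
# A start script here would parse $pe_hostfile and boot one mpd per allocated
# host via "qrsh -inherit", so the daemons run under sge_execd's control.
start_proc_args    /usr/sge/mpd/start_mpd_ring.sh $pe_hostfile
# The matching stop script shuts the ring down (e.g. with mpdallexit) and is
# also run when the job is removed with qdel.
stop_proc_args     /usr/sge/mpd/stop_mpd_ring.sh
# control_slaves TRUE is what allows "qrsh -inherit" to place tasks under SGE.
control_slaves     TRUE
job_is_first_task  FALSE
allocation_rule    $round_robin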
Reuti -- Looks like a bad link ... maybe the new gridengine.org has it?
John
Not exactly what you're looking for, but you can hack the Intel "stock" mpirun script to do a better job of tight integration. A version that I hacked together is available at:
- https://wiki.duke.edu/display/SCSC/Intel+MPI+and+SGE (click "Attachments" for the download)
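
Without reproducing that attachment, the usual trick behind this kind of hack is to make mpdboot launch its remote daemons through qrsh -inherit rather than ssh, so they land underneath sge_execd. The following is only a hedged sketch of that one piece, with an invented wrapper file; the name, location, and surrounding job script are made up for the example.

#!/bin/sh
# Sketch of an "rsh" wrapper placed first in PATH so that "mpdboot -r rsh"
# finds it instead of the real rsh (file name and location are illustrative).
# Drop any rsh-style options mpdboot may prepend (-n, -x, ...), then hand the
# remote command to SGE so the mpd becomes a child of sge_execd on that node.
while [ "$1" != "${1#-}" ]; do shift; done
host=$1; shift
exec qrsh -inherit "$host" "$@"

A job script would then put the wrapper's directory first in PATH and boot the ring with something like mpdboot -r rsh -f <hostsfile>, after which qdel can tear the daemons down along with the rest of the job.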
