Dear all,
I am compiling several codes (details at the end) with Intel Cluster Studio 2013 for Linux (C and Fortran compilers, MKL BLACS and MKL FFTW3) and Intel MPI 4.0.3.008. The programs run without problems on a single compute node, but they crash when I try to use more than one compute node.
I have gathered all the information I could from the execution and the MPI calls using these mpirun options: -v -check_mpi -genv I_MPI_DEBUG 5. The resulting output is in the attached files.
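For readers who want to reproduce this kind of diagnostic run, here is a minimal sketch of how the command line could be assembled. The three diagnostic options are the ones quoted above; the process count (24) and the executable path (`./vasp`) are placeholders chosen for illustration, not the values used in the original runs, and the helper name `build_mpirun_command` is hypothetical.

```python
import shlex


def build_mpirun_command(nprocs, executable):
    """Assemble an mpirun command line with the diagnostic options from the post."""
    return [
        "mpirun",
        "-v",                          # verbose launcher output
        "-check_mpi",                  # MPI correctness checking
        "-genv", "I_MPI_DEBUG", "5",   # set I_MPI_DEBUG=5 for all ranks
        "-n", str(nprocs),             # placeholder process count (assumption)
        executable,                    # placeholder executable path (assumption)
    ]


cmd = build_mpirun_command(24, "./vasp")
# Print the command as a single shell-safe string instead of running it.
print(shlex.join(cmd))
```

The sketch only prints the command; passing `cmd` to `subprocess.run` would launch it on a system where Intel MPI is installed.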
The relevant information is at the end of each file:
from vasp.log:
[23] ERROR: LOCAL:EXIT:SIGNAL: fatal error
[23] ERROR: Fatal signal 11 (SIGSEGV) raised.
[23] ERROR: Signal was encountered at:
[23] ERROR: hamil_mp_hamiltmu_ (/home/ivasan/programas/VASP/vasp.5.3_test/vasp)
[23] ERROR: After leaving:
[23] ERROR: mpi_allreduce_(*sendbuf=0x7fff5d1ce340, *recvbuf=0x18e19c0, count=1, datatype=MPI_DOUBLE_PRECISION, op=MPI_SUM, comm=0xffffffffc4060000 CART_SUB CART_CREATE CART_SUB CART_CREATE COMM_WORLD [18:23], *ierr=0x7fff5d1ce2ac->MPI_SUCCESS)
from abinit.log:
[23] ERROR: LOCAL:MPI:CALL_FAILED: error
[23] ERROR: Null communicator.
[23] ERROR: Error occurred at:
[23] ERROR: mpi_comm_rank_(comm=MPI_COMM_NULL, *rank=0x29319b8, *ierr=0x7fff83fabb74)
[23] ERROR: initmpi_grid_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/51_manage_mpi/initmpi_grid.F90:178)
[23] ERROR: invars1_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/invars1.F90:1015)
[23] ERROR: invars1m_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/invars1m.F90:186)
[23] ERROR: m_ab6_invars_mp_ab6_invars_load_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/m_ab6_invars_f90.F90:548)
[23] ERROR: MAIN__ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/98_main/abinit.F90:260)
[23] ERROR: main (/home/ivasan/programas/abinit/abinit-6.12.3b/bin/abinit)
[23] ERROR: (/lib64/libc-2.5.so)
[23] ERROR: (/home/ivasan/programas/abinit/abinit-6.12.3b/bin/abinit)
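Because the attached logs are long, a small filter can help pull out only the checker's ERROR lines, grouped by rank. The following is a minimal sketch that assumes the `[rank] ERROR: message` line format seen in the excerpts above; the function name `errors_by_rank` is hypothetical, and the sample text is copied from the abinit.log excerpt.

```python
import re
from collections import defaultdict

# Matches lines such as "[23] ERROR: Null communicator."
ERROR_LINE = re.compile(r"^\[(\d+)\]\s+ERROR:\s*(.*)$")


def errors_by_rank(log_text):
    """Return {rank: [message, ...]} for every '[rank] ERROR:' line in the log."""
    grouped = defaultdict(list)
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            grouped[int(match.group(1))].append(match.group(2))
    return dict(grouped)


sample = """\
[23] ERROR: LOCAL:MPI:CALL_FAILED: error
[23] ERROR: Null communicator.
[23] ERROR: Error occurred at:
[23] ERROR: mpi_comm_rank_(comm=MPI_COMM_NULL, *rank=0x29319b8, *ierr=0x7fff83fabb74)
"""

result = errors_by_rank(sample)
for rank, messages in result.items():
    # Show the rank and the first error line reported for it.
    print(rank, messages[0])
```

To scan a full log, the file contents can be read with `open("abinit.log").read()` and passed to the same function.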
So in both cases the problems seem to be related to MPI.
What can I do to solve these errors?
Thanks in advance for your help.
Iván
CODES:
- VASP V5.3.2 (http://www.vasp.at/). I posted this issue at the support forum: http://cms.mpi.univie.ac.at/vasp-forum/forum_viewtopic.php?3.12037
- Abinit V6.12.3 (http://www.abinit.org/). I posted this issue at the support forum: http://forum.abinit.org/viewtopic.php?f=3&t=1851
Dear Somanath,
Just one quick test: can you log in to one of the compute nodes and execute the program there? It is just to check that the libraries are accessible from the compute nodes.
Regards,
Ivan
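One way to carry out the library check suggested above is to run `ldd` on the executable from a compute node and look for entries reported as "not found". The following is a minimal sketch; the helper name `missing_libraries` is hypothetical, and the sample text is made-up `ldd`-style output for illustration, not output from the systems discussed in this thread.

```python
def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "not found" in line:
            # Unresolved entries look like "libname.so => not found".
            missing.append(line.split("=>")[0].strip())
    return missing


# Made-up sample in the style of ldd output (illustrative only).
sample = """\
    libmpi.so.4 => not found
    libc.so.6 => /lib64/libc.so.6 (0x00007f0000000000)
"""

found = missing_libraries(sample)
print(found)
```

On a compute node, the real input could be obtained with `subprocess.run(["ldd", path_to_executable], capture_output=True, text=True).stdout`, where `path_to_executable` is a placeholder for the program being tested.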
Dear Ivan,
Sample MPI codes run without problems on the compute nodes. I think the cause is something else.
Regards,
Somanath Moharana
Dear Ivan,
The problem is solved. The error was caused by incompatible libraries of the model, which had initially been configured for an IBM machine.
Thanks for your help and support.
Regards,
Somanath Moharana
