Dear all,
I am compiling several codes (details at the end) with Intel Cluster Studio 2013 for Linux (C and Fortran compilers, MKL BLACS and MKL FFTW3) and Intel MPI 4.0.3.008. The programs run without problems on a single compute node, but they crash as soon as I use more than one.
I have gathered all the possible information from the execution and MPI calls with these options of mpirun: -v -check_mpi -genv I_MPI_DEBUG 5. The resulting information is in the attached files.
The interesting information is at the end of the files, where you can find:
from vasp.log:
[23] ERROR: LOCAL:EXIT:SIGNAL: fatal error
[23] ERROR: Fatal signal 11 (SIGSEGV) raised.
[23] ERROR: Signal was encountered at:
[23] ERROR: hamil_mp_hamiltmu_ (/home/ivasan/programas/VASP/vasp.5.3_test/vasp)
[23] ERROR: After leaving:
[23] ERROR: mpi_allreduce_(*sendbuf=0x7fff5d1ce340, *recvbuf=0x18e19c0, count=1, datatype=MPI_DOUBLE_PRECISION, op=MPI_SUM, comm=0xffffffffc4060000 CART_SUB CART_CREATE CART_SUB CART_CREATE COMM_WORLD [18:23], *ierr=0x7fff5d1ce2ac->MPI_SUCCESS)
from abinit.log:
[23] ERROR: LOCAL:MPI:CALL_FAILED: error
[23] ERROR: Null communicator.
[23] ERROR: Error occurred at:
[23] ERROR: mpi_comm_rank_(comm=MPI_COMM_NULL, *rank=0x29319b8, *ierr=0x7fff83fabb74)
[23] ERROR: initmpi_grid_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/51_manage_mpi/initmpi_grid.F90:178)
[23] ERROR: invars1_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/invars1.F90:1015)
[23] ERROR: invars1m_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/invars1m.F90:186)
[23] ERROR: m_ab6_invars_mp_ab6_invars_load_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/m_ab6_invars_f90.F90:548)
[23] ERROR: MAIN__ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/98_main/abinit.F90:260)
[23] ERROR: main (/home/ivasan/programas/abinit/abinit-6.12.3b/bin/abinit)
[23] ERROR: (/lib64/libc-2.5.so)
[23] ERROR: (/home/ivasan/programas/abinit/abinit-6.12.3b/bin/abinit)
So in both cases the problems seem to be related to MPI.
What can I do to solve these errors?
Thanks in advance for your help.
Iván
CODES:
- VASP V5.3.2 (http://www.vasp.at/). I posted this issue at the support forum: http://cms.mpi.univie.ac.at/vasp-forum/forum_viewtopic.php?3.12037
- Abinit V6.12.3 (http://www.abinit.org/). I posted this issue at the support forum: http://forum.abinit.org/viewtopic.php?f=3&t=1851
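When the attached logs are long, it can help to pull out only the `[rank] ERROR:` lines and see which MPI call each rank failed in. The following is a minimal sketch, assuming the log format matches the excerpts quoted above; the function names `errors_by_rank` and `first_mpi_call` are hypothetical helpers, not part of Intel MPI or its correctness checker.

```python
import re
from collections import defaultdict

# Matches lines such as "[23] ERROR: Null communicator."
# (format assumed from the excerpts quoted in this post).
ERROR_LINE = re.compile(r"^\[(\d+)\] ERROR: (.*)$")


def errors_by_rank(log_text):
    """Group the message text of every '[rank] ERROR:' line by MPI rank."""
    grouped = defaultdict(list)
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            grouped[int(match.group(1))].append(match.group(2))
    return dict(grouped)


def first_mpi_call(messages):
    """Return the name of the first message that looks like an MPI call, or None."""
    for message in messages:
        if message.lower().startswith("mpi_"):
            return message.split("(", 1)[0]
    return None


# Sample taken from the abinit.log excerpt above.
sample = """\
[23] ERROR: LOCAL:MPI:CALL_FAILED: error
[23] ERROR: Null communicator.
[23] ERROR: Error occurred at:
[23] ERROR: mpi_comm_rank_(comm=MPI_COMM_NULL, *rank=0x29319b8, *ierr=0x7fff83fabb74)
"""

report = errors_by_rank(sample)
for rank, messages in sorted(report.items()):
    print(rank, first_mpi_call(messages))
```

Running the same function over the full vasp.log and abinit.log would show whether only rank 23 reports errors or whether other ranks fail as well.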
Dear Somanath,
Just one quick test: can you log in to one of the compute nodes and execute the program there? This is just to check that the libraries are accessible from the compute nodes.
Regards,
Ivan
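One way to run this check is to call `ldd` on the binary from a compute node and look for libraries it reports as "not found". The following is a minimal sketch, assuming a Linux node with `ldd` available; `missing_libraries` and `check_binary` are hypothetical helper names, and the sample `ldd` output is illustrative, not taken from the cluster discussed here.

```python
import subprocess


def missing_libraries(ldd_output):
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "not found" in line:
            # ldd prints unresolved entries as "<name> => not found".
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run ldd on a binary and return its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)


# Hypothetical ldd output, for illustration only.
sample_output = (
    "\tlibmpi.so.4 => not found\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000000000)\n"
)
print(missing_libraries(sample_output))
```

On a compute node, `check_binary` would be called with the path to the executable; an empty list means every shared library was resolved in that node's environment.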
Dear Ivan,
Sample MPI codes run without problems on the compute nodes, so I think the cause is something else.
Regards,
Somanath Moharana
Dear Ivan,
The problem is solved. The error was caused by incompatible libraries in the model, which had originally been configured for an IBM machine.
Thanks for your help and support.
Regards,
Somanath Moharana
