This week I tested Intel Parallel Studio XE 2017 with our Fortran code. It compiles fine, but at runtime I get an error that seems to come from ScaLAPACK:
| Stacksize not measured: no C compiler
| Checking for scalapack...
| Testing pdtran()...
Fatal error in MPI_Sendrecv: Other MPI error, error stack:
MPI_Sendrecv(259)...............: MPI_Sendrecv(sbuf=0x7f4d5e321080, scount=112896, MPI_DOUBLE, dest=0, stag=0, rbuf=0x7f4d5e3fd880, rcount=112896, MPI_DOUBLE, src=0, rtag=0, comm=0x84000007, status=0x7ffc1fa6b8d0) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error! cannot read from remote process
Fatal error in MPI_Sendrecv: Other MPI error, error stack:
MPI_Sendrecv(259)...............: MPI_Sendrecv(sbuf=0x6ff3100, scount=112896, MPI_DOUBLE, dest=1, stag=0, rbuf=0x70cf900, rcount=112896, MPI_DOUBLE, src=1, rtag=0, comm=0xc4000003, status=0x7ffeb86c71d0) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error! cannot read from remote process
Any idea about this?
Hui
Hui, we know of another issue with pdtran from MKL 2017. Could you send us a reproducer for this problem? We will investigate the cause and keep you updated. Regards, Gennady
Hi Gennady, I have attached my test code (test_pdtran.f90) as well as the run log in the zip folder.
It seems to be an MPI problem rather than a problem in the pdtran subroutine, because if I remove the following communicator-splitting code from my test_pdtran.f90, intel_2017 works fine:
==============================================================
split_row = myid / nsplit
split_col = mod(myid, nsplit)
call mpi_comm_split(pdtran_blacs_ctxt, split_row, split_col, &
                    pdtran_comm, mpierr)
call mpi_comm_rank(pdtran_comm, pdtran_myid, mpierr)
call mpi_comm_size(pdtran_comm, pdtran_nprocs, mpierr)
==============================================================
But if I use nsplit = 2, 3, 4, 5, 6, ..., then the MPI error above appears.
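For reference, here is a stripped-down, self-contained sketch of the same splitting pattern. It uses MPI_COMM_WORLD directly instead of the BLACS context and leaves out the pdtran call, so the program and values below are only illustrative and not the attached test_pdtran.f90 itself:
==============================================================
program split_demo
  use mpi
  implicit none
  integer :: mpierr, myid, nprocs, nsplit
  integer :: split_row, split_col, sub_comm, sub_myid, sub_nprocs

  call mpi_init(mpierr)
  call mpi_comm_rank(mpi_comm_world, myid, mpierr)
  call mpi_comm_size(mpi_comm_world, nprocs, mpierr)

  nsplit = 2                      ! illustrative value; the failing runs used nsplit = 2, 3, ..., 6
  split_row = myid / nsplit       ! "color": ranks with the same color share a sub-communicator
  split_col = mod(myid, nsplit)   ! "key": ordering of ranks inside each sub-communicator

  call mpi_comm_split(mpi_comm_world, split_row, split_col, sub_comm, mpierr)
  call mpi_comm_rank(sub_comm, sub_myid, mpierr)
  call mpi_comm_size(sub_comm, sub_nprocs, mpierr)

  print '(a,i0,a,i0,a,i0)', 'world rank ', myid, ' -> sub rank ', sub_myid, ' of ', sub_nprocs

  call mpi_comm_free(sub_comm, mpierr)
  call mpi_finalize(mpierr)
end program split_demo
==============================================================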
I am looking forward to your reply, as I would like to use intel_2017 as soon as possible.
best,
Hui
It should be noted that if I use intel-composer_xe_2015:
(1) compile:
/opt/intel//impi/5.0.1.035/intel64/bin/mpiifort test_pdtran.f90 -L/opt/intel/mkl/lib/intel64/ -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm
(2) mpirun:
/opt/intel//impi/5.0.1.035/intel64/bin/mpirun -np 14 ./a.out
Then the test code attached to my second post passes.
However, if I use intel-composer_xe_2017:
(1) compile:
/opt/intel_2017//compilers_and_libraries_2017.0.098/linux/mpi/intel64/bin/mpiifort test_pdtran.f90 -L/opt/intel_2017/mkl/lib/intel64/ -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm
(2) run
/opt/intel_2017//compilers_and_libraries_2017.0.098/linux/mpi/intel64/bin/mpirun -np 14 ./a.out
I get the error shown in my first post.
Hi Honghui,
I also ran into the same problem when running code that uses MPI_FILE_READ_ALL (Intel XE 2017).
The error message looks like this:
Fatal error in PMPI_Waitall: Other MPI error, error stack:
PMPI_Waitall(405)...............: MPI_Waitall(count=1, req_array=0x15ca768, status_array=0x15ca818) failed
MPIR_Waitall_impl(221)..........: fail failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error! cannot read from remote process
My colleague said he could run the same code with Intel 2015.
Have you solved your problem yet?
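For reference, the failing call in my case is a collective read along these lines (the file name, buffer size, and layout here are made up for illustration; my real code differs):
==============================================================
program read_all_demo
  use mpi
  implicit none
  integer, parameter :: n = 1024
  integer :: ierr, myid, fh
  integer :: status(mpi_status_size)
  integer(kind=mpi_offset_kind) :: disp
  double precision :: buf(n)

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, myid, ierr)

  ! every rank opens the same file and reads its own contiguous block of n doubles
  call mpi_file_open(mpi_comm_world, 'data.bin', mpi_mode_rdonly, mpi_info_null, fh, ierr)
  disp = int(myid, mpi_offset_kind) * n * 8          ! byte offset of this rank's block
  call mpi_file_set_view(fh, disp, mpi_double_precision, mpi_double_precision, &
                         'native', mpi_info_null, ierr)
  call mpi_file_read_all(fh, buf, n, mpi_double_precision, status, ierr)   ! the collective read that fails
  call mpi_file_close(fh, ierr)

  call mpi_finalize(ierr)
end program read_all_demo
==============================================================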
Hi Mengze,
Intel 2015 also works for me, so for now I am using Intel 2015 to compile my code.
I have uploaded my test code above, and Gennady Fedorov (Intel) said they will look into it.
best wishes,
Honghui
I experienced exactly the same error on Ubuntu 14 and 16. The workaround was:
echo 0 > /proc/sys/kernel/yama/ptrace_scope
If this works, you may want to change the corresponding setting in /etc/sysctl.d/10-ptrace.conf to make the change permanent.
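Assuming a stock Ubuntu layout, the permanent version of the same change would be a line like the following in that file (the sysctl key corresponds to the /proc path above):
kernel.yama.ptrace_scope = 0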
The release notes say this is only needed when attaching gdb to an impi process, but that does not seem to be the whole story: we always need to do this to use impi.
Cheers,
Hiroshi
Hello Honghui,
I also ran into the same problem when running code that uses MPI_FILE_READ_ALL (Intel XE 2017).
Have you solved your problem yet?