Intel® oneAPI Math Kernel Library

Reporting Intel Parallel Studio XE 2017 error

Honghui_S_
Beginner

This week I tested Intel Parallel Studio XE 2017 with our Fortran code. It compiles fine, but at run time I get an error that seems to come from ScaLAPACK:

  | Stacksize not measured: no C compiler
  | Checking for scalapack...
  | Testing pdtran()...
Fatal error in MPI_Sendrecv: Other MPI error, error stack:
MPI_Sendrecv(259)...............: MPI_Sendrecv(sbuf=0x7f4d5e321080, scount=112896, MPI_DOUBLE, dest=0, stag=0, rbuf=0x7f4d5e3fd880, rcount=112896, MPI_DOUBLE, src=0, rtag=0, comm=0x84000007, status=0x7ffc1fa6b8d0) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error!  cannot read from remote process
Fatal error in MPI_Sendrecv: Other MPI error, error stack:
MPI_Sendrecv(259)...............: MPI_Sendrecv(sbuf=0x6ff3100, scount=112896, MPI_DOUBLE, dest=1, stag=0, rbuf=0x70cf900, rcount=112896, MPI_DOUBLE, src=1, rtag=0, comm=0xc4000003, status=0x7ffeb86c71d0) failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error!  cannot read from remote process

 

Any idea about this ? 

Hui

 

Gennady_F_Intel
Moderator

Hui, we know of another issue with pdtran from MKL 2017. Could you send us a reproducer for this problem? We will investigate the cause and keep you updated. Regards, Gennady

Honghui_S_
Beginner

Hi Gennady, I have attached my test code (test_pdtran.f90) as well as the run log in the zip folder.

It seems to be an MPI problem rather than a problem in the pdtran subroutine, because if I remove the following communicator-splitting code from my test_pdtran.f90, intel_2017 works fine:

==============================================================

    ! color = myid / nsplit groups each run of nsplit consecutive
    ! ranks into one sub-communicator; the key keeps their ordering
    split_row = myid / nsplit
    split_col = mod(myid, nsplit)

    call mpi_comm_split(pdtran_blacs_ctxt, split_row, split_col, &
                        pdtran_comm, mpierr)

    call mpi_comm_rank(pdtran_comm, pdtran_myid,   mpierr)
    call mpi_comm_size(pdtran_comm, pdtran_nprocs, mpierr)

==============================================================

But if I use nsplit = 2, 3, 4, 5, 6, ..., then the above MPI error appears.
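
For anyone who wants to try this without the attachment, the pattern can be sketched in a few lines. This is not the attached test_pdtran.f90, just a stripped-down illustration of the same idea (split the communicator, then do a large exchange inside each sub-communicator); the buffer size and nsplit value are only placeholders:

==============================================================
program split_exchange_sketch
  ! stripped-down sketch, not the attached reproducer
  use mpi
  implicit none
  integer, parameter :: nsplit = 2, nelem = 112896
  integer :: mpierr, myid, nprocs
  integer :: split_row, split_col, sub_comm, sub_id, sub_np
  integer :: dest, src, status(MPI_STATUS_SIZE)
  double precision, allocatable :: sbuf(:), rbuf(:)

  call mpi_init(mpierr)
  call mpi_comm_rank(MPI_COMM_WORLD, myid,   mpierr)
  call mpi_comm_size(MPI_COMM_WORLD, nprocs, mpierr)

  ! same splitting logic as above, but applied to MPI_COMM_WORLD
  split_row = myid / nsplit
  split_col = mod(myid, nsplit)
  call mpi_comm_split(MPI_COMM_WORLD, split_row, split_col, sub_comm, mpierr)
  call mpi_comm_rank(sub_comm, sub_id, mpierr)
  call mpi_comm_size(sub_comm, sub_np, mpierr)

  ! large ring exchange inside the sub-communicator, roughly the
  ! message size seen in the MPI_Sendrecv error stack above
  allocate(sbuf(nelem), rbuf(nelem))
  sbuf = dble(sub_id)
  dest = mod(sub_id + 1, sub_np)
  src  = mod(sub_id - 1 + sub_np, sub_np)
  call mpi_sendrecv(sbuf, nelem, MPI_DOUBLE_PRECISION, dest, 0, &
                    rbuf, nelem, MPI_DOUBLE_PRECISION, src,  0, &
                    sub_comm, status, mpierr)

  call mpi_comm_free(sub_comm, mpierr)
  call mpi_finalize(mpierr)
end program split_exchange_sketch
==============================================================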

I am looking forward to your reply; I would like to move to intel_2017 as soon as possible.

best,

Hui
Honghui_S_
Beginner

It should be noted that if I use intel-composer_xe_2015:

(1) compile:
/opt/intel//impi/5.0.1.035/intel64/bin/mpiifort  test_pdtran.f90  -L/opt/intel/mkl/lib/intel64/ -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm

(2) mpirun:
/opt/intel//impi/5.0.1.035/intel64/bin/mpirun -np 14 ./a.out

Then the code attached to my second post passed.

However, if I use intel-composer_xe_2017:

(1) compile:
/opt/intel_2017//compilers_and_libraries_2017.0.098/linux/mpi/intel64/bin/mpiifort  test_pdtran.f90  -L/opt/intel_2017/mkl/lib/intel64/ -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread -lm

(2) run:
/opt/intel_2017//compilers_and_libraries_2017.0.098/linux/mpi/intel64/bin/mpirun  -np 14 ./a.out

I got the error shown in my first post.
Mengze_W_
Beginner

Hi Honghui,

I also ran into the same problem when running code that uses MPI_FILE_READ_ALL (Intel XE 2017).

The error message is like this:

Fatal error in PMPI_Waitall: Other MPI error, error stack:
PMPI_Waitall(405)...............: MPI_Waitall(count=1, req_array=0x15ca768, status_array=0x15ca818) failed
MPIR_Waitall_impl(221)..........: fail failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error!  cannot read from remote process
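
For context, the read in my code is essentially a collective MPI-IO read. A stripped-down sketch of the pattern (the file name, chunk size, and offsets here are placeholders, not my real code) looks like this:

program read_all_sketch
  ! placeholder sketch of the MPI_FILE_READ_ALL pattern, not my actual code
  use mpi
  implicit none
  integer, parameter :: nelem = 1024
  integer :: mpierr, myid, nprocs, fh
  integer :: status(MPI_STATUS_SIZE)
  integer(kind=MPI_OFFSET_KIND) :: offset
  double precision :: buf(nelem)

  call mpi_init(mpierr)
  call mpi_comm_rank(MPI_COMM_WORLD, myid,   mpierr)
  call mpi_comm_size(MPI_COMM_WORLD, nprocs, mpierr)

  ! every rank seeks to its own contiguous chunk, then all ranks
  ! read collectively with MPI_FILE_READ_ALL
  call mpi_file_open(MPI_COMM_WORLD, 'data.bin', MPI_MODE_RDONLY, &
                     MPI_INFO_NULL, fh, mpierr)
  offset = int(myid, MPI_OFFSET_KIND) * nelem * 8
  call mpi_file_seek(fh, offset, MPI_SEEK_SET, mpierr)
  call mpi_file_read_all(fh, buf, nelem, MPI_DOUBLE_PRECISION, status, mpierr)
  call mpi_file_close(fh, mpierr)

  call mpi_finalize(mpierr)
end program read_all_sketch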

My colleague said he could run the same code with Intel 2015.

Have you solved your problem yet?
Honghui_S_
Beginner

Hi Mengze,

Intel 2015 also works for me, and for now I am using Intel 2015 to compile my code.

I have uploaded my test code above, and Gennady Fedorov (Intel) said they will investigate it.

best wishes,

Honghui
Hiroshi_S_
Beginner

I experienced exactly the same error on Ubuntu 14 and 16. The workaround was:

echo 0 > /proc/sys/kernel/yama/ptrace_scope

If this works, you may want to change the corresponding setting (kernel.yama.ptrace_scope) in /etc/sysctl.d/10-ptrace.conf to make the change permanent.

The release notes say this is only needed when attaching gdb to an Intel MPI process, but that does not seem to be correct: we always need to do it to use Intel MPI at all.
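
For reference, on my Ubuntu machines the permanent change looks roughly like this (paths follow the stock Ubuntu layout; adjust for your distribution):

# apply immediately (same effect as the echo above)
sudo sysctl -w kernel.yama.ptrace_scope=0

# make it permanent: put "kernel.yama.ptrace_scope = 0" in
# /etc/sysctl.d/10-ptrace.conf, then reload the sysctl configuration
sudo sysctl --system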

Cheers,

Hiroshi

alice_c_
Beginner

Hello Honghui,

I also ran into the same problem when running code that uses MPI_FILE_READ_ALL (Intel XE 2017).

Have you solved your problem yet?
