Intel® oneAPI Math Kernel Library

Puzzling (but maybe elementary!) problem calling SCALAPACK PZGETF2 routine

daren__wall

Dear All,

I have a somewhat strange runtime problem when calling the ScaLAPACK PZGETF2 routine.

I have constructed a minimal code that reproduces the problem, which is attached below. The code compiles and runs successfully on a single process, but with two processes it fails at runtime during the call to PZGETF2. The call never returns, so there is no INFO value to inspect.

The error output begins:

Assertion failed in file ../../src/util/intel/shm_heap/impi_shm_heap.c at line 877: offset < heap.shm_size
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/lib/release/libmpi.so.12(MPL_backtrace_show+0x34) [0x7f88105341d4]
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/lib/release/libmpi.so.12(MPIR_Assert_fail+0x21) [0x7f880fcbc031]
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/lib/release/libmpi.so.12(+0x449f93) [0x7f880fffaf93]
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/lib/release/libmpi.so.12(+0x28563d) [0x7f880fe3663d]
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/lib/release/libmpi.so.12(+0x153487) [0x7f880fd04487]
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/lib/release/libmpi.so.12(+0x199bda) [0x7f880fd4abda]
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/lib/release/libmpi.so.12(+0x180069) [0x7f880fd31069]
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/intel64/lib/release/libmpi.so.12(+0x16bd8e) [0x7f880fd1cd8e]
/opt/intel/compilers_and_libraries_2020.0.166/linux/mpi/inte
Abort(1) on node 0: Internal error

I am running the code on Linux Mint and compile it with:

mpiifort -o pz.exe pz_factorize.f90 -mkl=parallel -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -liomp5 -lpthread -ldl

 
Selecting the values NPROW = NPCOL=1, MB = 10 the code will run using a single process , with execution by:
 
mpiexec.hydra   -n  1 ./pz.exe
 

Selecting instead the values NPROW=1 NPCOL=2, MB=5 the code will instead fail with the above error

if run using:

 mpiexec.hydra   -n  2 ./pz.exe
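
To make the two configurations concrete, here is a minimal sketch of how many matrix columns each process column would own under the block-cyclic distribution. It is a Python transcription of the arithmetic done by the ScaLAPACK tool routine NUMROC, not the attached Fortran code. The global size N = 10 is an assumption (the matrix size is not stated above), as is taking the column block size equal to MB.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Count the rows or columns of an n-long dimension, cut into blocks of
    size nb and dealt out cyclically over nprocs processes, that land on
    process iproc. isrcproc is the process holding the first block."""
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from the source process
    nblocks = n // nb                              # number of complete blocks
    count = (nblocks // nprocs) * nb               # blocks every process receives
    extra = nblocks % nprocs                       # leftover complete blocks
    if mydist < extra:
        count += nb                                # one more complete block
    elif mydist == extra:
        count += n % nb                            # the final partial block, if any
    return count


N = 10  # ASSUMPTION: global number of columns, not given in the post

# (NPCOL, block size) for the two runs; block size = MB is an assumption
for npcol, nb in ((1, 10), (2, 5)):
    local = [numroc(N, nb, p, 0, npcol) for p in range(npcol)]
    print(f"NPCOL={npcol}, NB={nb}: local column counts {local}")
```

Under these assumptions the single-process run keeps all ten columns in one process column, while the two-process run splits them five and five. It may be worth checking that split against the argument constraints documented for PZGETF2.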

 

Can any of you knowledgeable Fortran gurus see where I am going wrong?

I am very grateful for any assistance.

Thanks, Dan.

Khang_N_Intel

I tested the code with oneMKL 2021.2 and also encountered the error. The issue has been escalated.


Khang_N_Intel

The error was caused by the shared memory transfer.

Neither oneMKL nor Intel MPI supports Linux Mint; it does not appear in the system requirements of either product.


oneMKL system requirements: https://software.intel.com/content/www/us/en/develop/articles/oneapi-math-kernel-library-system-requirements.html

For C/C++ and Fortran

Linux*

  • Amazon* Linux 2
  • CentOS* (latest version)
  • Clear Linux*
  • Debian* (latest version)
  • Wind River* Linux (latest version)
  • Yocto 2.7
  • Fedora* 31
  • openSUSE* 15
  • Red Hat Enterprise Linux* (RHEL) 7, 8
  • SUSE Linux Enterprise Server* (SLES) 12, 15
  • Ubuntu* 18.04 LTS, 20.04 LTS

 

Intel MPI system requirements: https://software.intel.com/content/www/us/en/develop/articles/intel-mpi-library-release-notes-linux.html

Software Requirements

(installation issues may occur with operating systems that are not released at the date of the current Intel MPI Library release)

  • Operating systems:
    • Red Hat* Enterprise Linux* 7, 8
    • Fedora* 31
    • CentOS* 7, 8
    • SUSE* Linux Enterprise Server* 12, 15
    • Ubuntu* LTS 16.04, 18.04, 20.04
    • Debian* 9, 10
    • Amazon Linux 2
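
As a rough way to compare a machine against the Intel MPI list above, the sketch below parses the text of /etc/os-release and checks the distribution ID. The ID strings in LISTED_IDS are my assumed mapping of the listed distributions to their usual os-release identifiers, the sample text is illustrative rather than copied from a real Linux Mint system, and versions are not checked.

```python
def parse_os_release(text):
    """Parse KEY=value lines in the style of /etc/os-release into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


# ASSUMPTION: usual os-release IDs for the distributions in the list above
LISTED_IDS = {"amzn", "centos", "debian", "fedora", "rhel", "sles", "ubuntu"}


def is_listed(fields):
    """True if the ID field names one of the listed distributions.
    Only the ID is compared; the release version is ignored."""
    return fields.get("ID", "").lower() in LISTED_IDS


# Illustrative sample; on a real system read the file instead:
#   with open("/etc/os-release") as f: text = f.read()
sample = 'NAME="Linux Mint"\nID=linuxmint\nID_LIKE=ubuntu\n'
fields = parse_os_release(sample)
print(fields["ID"], "listed" if is_listed(fields) else "not listed")
```

A derivative distribution reports its own ID, so it will not match even when its ID_LIKE field names a listed distribution.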


There will be no more discussion about this issue.

