Dear developers,
This is Xingguang Zhou. I used Intel oneAPI 2023 to compile a computational fluid dynamics program, Incompact3d. After compilation the program would not run, and the error was the same as in the post "Intel MPI ''helloworld" program not work on the machine" on the Intel Community forum.
We successfully ran the program in parallel on a single node following the administrator's advice in that post.
We now want to run the program in parallel across multiple nodes. However, when we try to run it on 2 nodes, it fails with the following error:
[0] MPI startup(): Intel(R) MPI Library, Version 2021.8 Build 20221129 (id: 339ec755a1)
[0] MPI startup(): Copyright (C) 2003-2022 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): shm segment size (149 MB per rank) * (56 local ranks) = 8362 MB total
[56] MPI startup(): shm segment size (149 MB per rank) * (56 local ranks) = 8362 MB total
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/share/apps/inteloneapi2023/mpi/2021.8.0/etc/tuning_icx_shm.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): tag bits available: 30 (TAG_UB value: 1073741823)
[0] MPI startup(): source bits available: 0 (Maximal number of rank: 0)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 57 PID 1442 RUNNING AT node6348002
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 58 PID 1443 RUNNING AT node6348002
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================
Our configurations are as below:
Operating system:
[3120103311@login02 TGV-Taylor-Green-vortex]$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
Node:
Each node has 56 CPU cores and 256 GB of memory.
Submit script:
#!/bin/bash
#SBATCH -J Incompact3d
#SBATCH -p node6348
#SBATCH -N 2
#SBATCH -n 112
module load openmpi4
source /share/apps/inteloneapi2023/setvars.sh
ulimit -s unlimited
ulimit -l unlimited
I_MPI_DEBUG=10 mpirun -genv I_MPI_FABRICS=shm -genv FI_PROVIDER=shm -np 112 ./xcompact3d > logfile
Any suggestions are appreciated.
Yours,
Xingguang Zhou
Xi'an, China.
Dear @Xingguang-Zhou,
On the forum I can only help with problems encountered with the latest release of Intel MPI, 2021.12, which is contained in oneAPI 2024.1.
A process killed by signal 9 can have many causes; you will need to dig deeper to find out what is triggering it.
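A common cause of signal 9 on compute nodes is the kernel out-of-memory killer. As a rough first check (assuming you can open a shell or a job step on the node that reported the kill, node6348002 in your log), you could search the kernel log right after a failed run:
# Run on the affected compute node shortly after the job dies;
# reading the kernel log may require elevated privileges.
dmesg -T | grep -i -E "out of memory|oom-killer|killed process"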
Note: we do not support CentOS 7 anymore.
Since you are using Slurm, you may refer to:
https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-12/job-schedulers-support.html
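That guide describes how Intel MPI integrates with Slurm allocations. Purely as an illustration (the PMI library path below is an assumption; check the correct one with your administrator), a launch through srun could look like this:
#!/bin/bash
#SBATCH -J imb-test
#SBATCH -p node6348
#SBATCH -N 2
#SBATCH -n 112
source /share/apps/inteloneapi2023/setvars.sh
# Assumed location of Slurm's PMI library; adjust to your installation.
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
srun ./xcompact3d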
I would start with a very simple precompiled MPI benchmark, run it without Slurm, and see if that works:
mpirun -n 2 IMB-MPI1
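If that works on a single node, a next step could be to run the same benchmark across both nodes with one rank per node (node6348002 is taken from your error output; the first hostname is a placeholder to replace with your actual node name):
mpirun -n 2 -ppn 1 -hosts node6348001,node6348002 IMB-MPI1 PingPong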
