I am connecting to DevCloud over SSH to test an MPI application developed in VS Code.
The code runs fine with one core. However, when I increase the number of cores, it gives me the error below.
Abort(2663823) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIDU_Init_shm_alloc(151).....: unable to allocate shared memory
Commands used:
mpirun -np 2 ./app_name
mpirun -np 4 ./app_name
Thank you for posting in Intel communities.
In DevCloud, a job can use a maximum of 2 nodes, and each node can run 2 processes (4 processes in total).
Please follow the below steps:
Connect to DevCloud (for example via Cygwin), then request an interactive session on the compute nodes using the command below:
qsub -I -l nodes=<number_of_nodes>:<property>:ppn=2 -d .
Example: qsub -I -l nodes=2:gpu:ppn=2 -d .
After logging into the compute node, we need to find out which nodes were allocated to the job. First, print the path of the node file:
echo $PBS_NODEFILE
(example output: /var/spool/torque/aux//1955007.v-qsvr-1.aidevcloud)
Then cat that file to list the node names:
cat /var/spool/torque/aux//1955007.v-qsvr-1.aidevcloud
s001-n141
s001-n141
s001-n157
s001-n157
Copy the node names from the output above and paste them into a host file (I pasted them into host.txt).
With the host file in place, we can run the mpirun command. (Since I am running an mpi4py script, the command below launches python.)
mpirun -n 4 -hostfile host.txt python hello.py
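As a rough sketch, the manual copy-and-paste steps above could also be scripted. This assumes it is run inside the interactive job, where $PBS_NODEFILE is set. prepare_hosts is a hypothetical helper name (not a DevCloud or MPI tool); host.txt and hello.py are the file names from the example above.

```shell
#!/bin/sh
# prepare_hosts is a hypothetical helper, shown for illustration only.
# It copies the non-empty lines of the node file ($1) into a host file ($2)
# and prints how many entries were written (one entry per process slot).
prepare_hosts() {
    grep . "$1" > "$2"
    grep -c . "$2"
}

# Only attempt the launch when running inside a PBS job.
if [ -n "${PBS_NODEFILE:-}" ]; then
    np=$(prepare_hosts "$PBS_NODEFILE" host.txt)
    # With nodes=2 and ppn=2, the node file has 4 lines, so np is 4.
    mpirun -n "$np" -hostfile host.txt python hello.py
fi
```

Counting the lines of the node file keeps the -n value in step with whatever nodes and ppn values were requested in qsub, instead of hard-coding 4.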
If this resolves your issue, make sure to accept this as a solution. This would help others with a similar issue. Have a great day ahead.
We assume that your issue is resolved. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.