Intel® Tiber Developer Cloud
Help connecting to or getting started on Intel® Tiber Developer Cloud

Running MPI

black__edgar
Beginner

Hi all,

I am trying to run an application with MPI to test the MPI shared-memory model. However, the application does not run on the Intel Developer Cloud.

 

My application name is: parallelSpmvAlphaBeta

The input data is: ../matrices/dc1.mm_bin ../matrices/dc1.in_bin ../matrices/dc1.out_bin

 

I am using the following command:

 

mpiexec -genv I_MPI_PIN_DOMAIN=core -genv I_MPI_FABRICS=shm:ofi -n 4 parallelSpmvAlphaBeta ../matrices/dc1.mm_bin ../matrices/dc1.in_bin ../matrices/dc1.out_bin

 

However, the application does not run. I have to kill it with Ctrl-C:

 

user@idc-beta-batch-pvc-node-08:~$ mpiexec -genv I_MPI_PIN_DOMAIN=core -genv I_MPI_FABRICS=shm:ofi -n 4 parallelSpmvAlphaBeta ../matrices/dc1.mm_bin ../matrices/dc1.in_bin ../matrices/dc1.out_bin
^C[mpiexec@idc-beta-batch-pvc-node-08] Sending Ctrl-C to processes as requested
[mpiexec@idc-beta-batch-pvc-node-08] Press Ctrl-C again to force abort
^Cuser@idc-beta-batch-pvc-node-08:~$

 

Any idea why this is not running?

Hairul_Intel
Moderator

Hi black__edgar,

Thank you for reaching out to us.

 

Could you please provide the following additional information so that we can investigate this issue further?

  • Documentation or guide you referred to for running the MPI application
  • Instance ID
  • Reservation details:
        Start time:
        End time:

 

 

Regards,

Hairul

 

black__edgar
Beginner

Hi Hairul,

I am not sure where to get all the information you requested.

I am using a terminal within a JupyterLab session.

 

This is the MPI version I am using:

u9252df372877ec2a0108c2790881b7f@idc-beta-batch-pvc-node-01:~$ mpiexec --version
Intel(R) MPI Library for Linux* OS, Version 2021.10 Build 20230619 (id: c2e19c2f3e)
Copyright 2003-2023, Intel Corporation.

 

This simple command should run my application with 4 processes on a single node. After issuing the command, nothing happens and I have to break out of it with Ctrl-C twice:

 

$ mpiexec -np 4 -ppn 4 parallelSpmvSm ../matrices/dc1.mm_bin ../matrices/dc1.in_bin ../matrices/dc1.out_bin
^C[mpiexec@idc-beta-batch-pvc-node-01] Sending Ctrl-C to processes as requested
[mpiexec@idc-beta-batch-pvc-node-01] Press Ctrl-C again to force abort
^Cu9252df372877ec2a0108c2790881b7f@idc-beta-batch-pvc-node-01:~$

 

I have also tried the following without luck:

 

$ mpiexec -genv I_MPI_OFI_PROVIDER=tcp -genv I_MPI_PIN_DOMAIN=core -genv I_MPI_PIN_ORDER=scatter -genv I_MPI_DEBUG=4 -genv I_MPI_FABRICS=shm:ofi -np 4 -ppn 4 parallelSpmvSm ../matrices/dc1.mm_bin ../matrices/dc1.in_bin ../matrices/dc1.out_bin
^C[mpiexec@idc-beta-batch-pvc-node-01] Sending Ctrl-C to processes as requested
[mpiexec@idc-beta-batch-pvc-node-01] Press Ctrl-C again to force abort

 

Even the following helloWorld.c MPI program shows the same behavior.

 

#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();

    return 0;
}
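
For reference, a minimal sketch of how this program would typically be compiled and launched with the Intel MPI Library (assuming the oneAPI environment has already been sourced via setvars.sh; mpiicc is the Intel MPI compiler wrapper, and any of the other wrappers should behave the same way):

# compile helloWorld.c with the Intel MPI compiler wrapper
mpiicc helloWorld.c -o helloWorld
# launch 4 ranks on the current node
mpiexec -n 4 ./helloWorld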

 

 

 

Any ideas?

Hairul_Intel
Moderator

Hi black__edgar,

Thank you for sharing the information.

 

We're investigating this issue and will update you on any findings as soon as possible.

 

On another note, were you able to successfully complete the steps in Get Started with the Intel® MPI Library for Linux* OS guide?

 

 

Regards,

Hairul


black__edgar
Beginner

Hi Hairul,

Thank you very much for your help.

 

I had not read the guide you mentioned.

After reading it, I realized that I was missing a host file.

I created a file named "hostfile" containing the node that JupyterLab assigned to me:

 

             u@idc-beta-batch-pvc-node-17:~$ cat hostfile
             idc-beta-batch-pvc-node-17
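
For reference, a minimal sketch of how the host file would then be passed to the launcher (assuming the -f option of the Intel MPI mpiexec is used; the binary and arguments are the same as above):

mpiexec -f hostfile -n 4 -ppn 4 parallelSpmvSm ../matrices/dc1.mm_bin ../matrices/dc1.in_bin ../matrices/dc1.out_bin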

 

However, my application is not running yet.

 

Thanks again,

Edgar

Hairul_Intel
Moderator

Hi black__edgar,

Thank you for the information.

 

For clarification purposes, were you able to run the MPI application in a non-IDC environment?

 

If the error persists, we suggest posting your question in the Intel® oneAPI HPC Toolkit forum for better support, since this issue is more closely related to the MPI Library.

 

 

Regards,

Hairul


black__edgar
Beginner

Hi Hairul,

Yes, my application runs in a non-IDC environment.

I will review the Intel® oneAPI HPC Toolkit forum to see if anyone has had this problem.

Thanks,

Edgar Black

black__edgar
Beginner

I found the problem.

I needed to add the parameter "-host localhost,pion-ib" to the mpiexec process launcher.

 

My command now looks like this:

 

mpiexec -genv I_MPI_OFI_PROVIDER=tcp -genv I_MPI_PIN_DOMAIN=core -genv I_MPI_PIN_ORDER=scatter -genv I_MPI_DEBUG=4 -genv I_MPI_FABRICS=shm:ofi -np 4 -ppn 4 -host localhost,pion-ib parallelSpmvAlphaBeta ../matrices/dc1.mm_bin ../matrices/dc1.in_bin ../matrices/dc1.out_bin

The output looks like this:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.10 Build 20230619 (id: c2e19c2f3e)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.18.0-impi
[0] MPI startup(): libfabric provider: tcp
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/opt/intel/oneapi/mpi/2021.10.0/etc/tuning_spr_shm-ofi.dat"
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 1358877 idc-beta-batch-pvc-node-06 {0,112}
[0] MPI startup(): 1 1358878 idc-beta-batch-pvc-node-06 {56,168}
[0] MPI startup(): 2 1358879 idc-beta-batch-pvc-node-06 {1,113}
[0] MPI startup(): 3 1358880 idc-beta-batch-pvc-node-06 {57,169}
---> Time taken by 4 processes: 0.299007 seconds, GFLOPS: 6.298500
Solution match in rank 0
Solution match in rank 2
Solution match in rank 1
Solution match in rank 3

Hairul_Intel
Moderator

Hi black__edgar,

Glad to know that your issue has been resolved. We appreciate you sharing the solution with the community.

 

This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.

 

 

Regards,

Hairul

