Hello,
My apologies: I posted this earlier within another thread, but afterwards decided to submit it as a new query.
I have been struggling for a couple of days to figure out something very basic: how to correctly set up Intel MPI 2019 for use over sockets/TCP. I am able to source mpivars.sh without any parameters and then export FI_PROVIDER=sockets, which lets me compile and run the usual hello world example on a single node with n ranks; I sketch my current steps below.
However, when I set up my environment in the same way and try to build PAPI from source, the configure step complains that the C compiler (GCC in this case) is not able to create executables. The config.log reveals that it cannot find libfabric.so.1. Even if I add the libfabric directory to my LD_LIBRARY_PATH and link against the libfabric library, I am not able to build PAPI from source.
Additionally, I cannot find good documentation on how to use Intel MPI in the most simple and basic way: a single node and several processes. There is a graphic in several presentations, and on software.intel.com/intel-mpi-library, which indicates I should be able to choose TCP/IP, among other fabric options, at runtime. I would appreciate your comments and assistance in letting me know the correct way to do this.
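A sketch of those steps (here <impi_install_dir> is just a placeholder for my Intel MPI installation directory, and the rank count is an example):
source <impi_install_dir>/intel64/bin/mpivars.sh
export FI_PROVIDER=sockets
mpicc hello_mpi.c -o hello_mpi
mpirun -n 4 ./hello_mpi     # 4 ranks on a single node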
Regards,
-Rashawn
Hi,
What 2019 update are you using? Can you find libfabric.so.1 in the libfabric directory?
--
Best regards, Anatoliy
Hello Anatoliy,
Thank you for your prompt reply. I am using 2019 update 4 from Parallel Studio Cluster Edition; libfabric.so.1 is in <pathToInstall>/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib/libfabric.so.1, as a symlink to libfabric.so.
Regards,
-Rashawn
Hello Anatoliy,
I have not heard back on this. Do you have a recommendation on what I should do?
Regards,
-Rashawn
Hi, Rashawn.
If you're using gcc anyway, you might prefer to build third-party binaries without sourcing compilervars.sh, and then
source mpivars.sh
This script is in the intel64/bin directory inside the Parallel Studio installation.
After that, you'll be able to check that the correct libraries are pulled in with ldd <your_binary>.
Unless you're running under a scheduler,
mpiexec.hydra -n <number_of_processes> <your_binary>
will start the processes locally by default. If this doesn't happen, please provide the output of a run with the -v flag added.
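Putting it together, something like this (a sketch; <parallel_studio_dir> is a placeholder for your installation path and the process count is an example):
source <parallel_studio_dir>/linux/mpi/intel64/bin/mpivars.sh
ldd ./your_binary                   # the Intel MPI and libfabric libraries should resolve
mpiexec.hydra -n 4 ./your_binary    # 4 local processes, no scheduler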
Hello Maksim,
I am completely confused. I want to know the correct arguments to pass to mpivars.sh so that I can execute a simple hello world MPI application over Ethernet, not a fancier fabric. This seems to be extremely difficult to ascertain. I have not been successful despite experimenting with the input parameters for mpivars.sh: I have turned -ofi_internal on (1) and off (0), and I have supplied debug and release as the library kind; yet I have not found the correct incantation (examples of what I tried are below).
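Examples of the invocations I have tried (I am not certain of the exact argument syntax mpivars.sh expects, so treat these as approximations of what I ran):
source /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpivars.sh debug_mt -ofi_internal=1
source /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpivars.sh release -ofi_internal=0
export FI_PROVIDER=sockets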
As stated in my original query, I am not using compilervars.sh because I am not using the Intel compiler suite. I want to use Intel MPI with a non-Intel compiler, which is perfectly reasonable.
I do know where the setup scripts for both mpivars.sh and compilervars.sh are located.
In my professional capacity at Intel, I develop and validate software. For the task I intended to complete last week, I need to verify that it is possible to build a particular tool, PAPI, with GCC and to inform PAPI about the version of MPI I intend to use for its enclosed MPI test applications. I was able to do something very similar with earlier releases of Intel MPI. I am starting from the simplest case: running several MPI ranks on a single node, without a special fabric. I really just need to know the correct arguments to hand to mpivars.sh and where these arguments are documented for MPI 2019. I ran what you suggested; the output below shows that I am unable to execute the hello world MPI application:
## Output of ldd <binary>:
ldd hello_mpi.ofi-internal-debugmt
linux-vdso.so.1 (0x00007ffedc1f5000)
libmpifort.so.12 => /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/libmpifort.so.12 (0x00007f36194d0000)
libmpi.so.12 => /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/debug_mt/libmpi.so.12 (0x00007f36178e5000)
librt.so.1 => /lib64/librt.so.1 (0x00007f36176dd000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f36174bf000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f36172bb000)
libc.so.6 => /lib64/libc.so.6 (0x00007f3616f01000)
libgcc_s.so.1 => /nfs/site/proj/coralhpctools/builds/compilers/gcc/gcc-9.1.0/skx/lib64/libgcc_s.so.1 (0x00007f3616ce9000)
libfabric.so.1 => /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib/libfabric.so.1 (0x00007f3616ab1000)
/lib64/ld-linux-x86-64.so.2 (0x00007f361988f000)
##
## Output of mpiexec.hydra -v:
##
mpiexec.hydra -v -n 1 ./hello_mpi.ofi-internal-debugmt
[mpiexec@anchpcskx1001] Launch arguments: /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin//hydra_bstrap_proxy --upstream-host anchpcskx1001 --upstream-port 34685 --pgid 0 --launcher ssh --launcher-number 0 --base-path /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/ --tree-width 16 --tree-level 1 --time-left -1 --collective-launch 1 --debug --proxy-id 0 --node-id 0 --subtree-size 1 --upstream-fd 7 /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin//hydra_pmi_proxy --usize -1 --auto-cleanup 1 --abort-signal 9
[proxy:0:0@anchpcskx1001] pmi cmd from fd 6: cmd=init pmi_version=1 pmi_subversion=1
[proxy:0:0@anchpcskx1001] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:0@anchpcskx1001] pmi cmd from fd 6: cmd=get_maxes
[proxy:0:0@anchpcskx1001] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=4096
[proxy:0:0@anchpcskx1001] pmi cmd from fd 6: cmd=get_appnum
[proxy:0:0@anchpcskx1001] PMI response: cmd=appnum appnum=0
[proxy:0:0@anchpcskx1001] pmi cmd from fd 6: cmd=get_my_kvsname
[proxy:0:0@anchpcskx1001] PMI response: cmd=my_kvsname kvsname=kvs_67947_0
Abort(1094799) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(666)......:
MPID_Init(922).............:
MPIDI_NM_mpi_init_hook(719): OFI addrinfo() failed (ofi_init.h:719:MPIDI_NM_mpi_init_hook:No data available)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 67964 RUNNING AT anchpcskx1001
= EXIT STATUS: 143
===================================================================================
##
## Output of mpirun -v
##
> mpirun -v -n 1 ./hello_mpi.ofi-internal-debugmt
[mpiexec@anchpcskx1001] Launch arguments: /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin//hydra_bstrap_proxy --upstream-host anchpcskx1001 --upstream-port 35059 --pgid 0 --launcher ssh --launcher-number 0 --base-path /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/ --tree-width 16 --tree-level 1 --time-left -1 --collective-launch 1 --debug --proxy-id 0 --node-id 0 --subtree-size 1 --upstream-fd 7 /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin//hydra_pmi_proxy --usize -1 --auto-cleanup 1 --abort-signal 9
[proxy:0:0@anchpcskx1001] pmi cmd from fd 6: cmd=init pmi_version=1 pmi_subversion=1
[proxy:0:0@anchpcskx1001] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:0@anchpcskx1001] pmi cmd from fd 6: cmd=get_maxes
[proxy:0:0@anchpcskx1001] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=4096
[proxy:0:0@anchpcskx1001] pmi cmd from fd 6: cmd=get_appnum
[proxy:0:0@anchpcskx1001] PMI response: cmd=appnum appnum=0
[proxy:0:0@anchpcskx1001] pmi cmd from fd 6: cmd=get_my_kvsname
[proxy:0:0@anchpcskx1001] PMI response: cmd=my_kvsname kvsname=kvs_67978_0
Abort(1094799) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(666)......:
MPID_Init(922).............:
MPIDI_NM_mpi_init_hook(719): OFI addrinfo() failed (ofi_init.h:719:MPIDI_NM_mpi_init_hook:No data available)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 67982 RUNNING AT anchpcskx1001
= EXIT STATUS: 143
===================================================================================
## Contents of hello_mpi.c:
> cat hello_mpi.c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
    return 0;
}
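For completeness, the binary shown in the ldd output above was built with the gcc-based MPI wrapper, roughly like this (the name simply records the mpivars.sh settings I was testing):
mpicc hello_mpi.c -o hello_mpi.ofi-internal-debugmt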
Your assistance will be much appreciated.
Best regards,
-Rashawn
UPDATE. I have been successful in compiling both the MPI hello world program and PAPI using GCC 9.1.0 as the compiler suite (C, C++, and Fortran) and Intel MPI 2019 update 4. First, I set up the GCC environment, then sourced the Intel mpivars.sh without arguments. Reviewing the environment, I noted two variables that were set: (1) LIBRARY_PATH, pointing to the directory containing libfabric.so, and (2) FI_PROVIDER_PATH, pointing to the directory containing the libfabric providers (sockets, tcp, psmx2, verbs, rxm). LD_LIBRARY_PATH, PATH, and MANPATH had also been updated appropriately.
With these settings, one can compile an MPI code with mpicc <mpisrc.c> -o <mpiBinary>, but the binary will not execute on a single node with one or more processes. The complaint is:
> mpirun -n 1 ./hello_mpi_mpivars-noargs
Abort(1094799) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(666)......:
MPID_Init(922).............:
MPIDI_NM_mpi_init_hook(719): OFI addrinfo() failed (ofi_init.h:719:MPIDI_NM_mpi_init_hook:No data available)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 7541 RUNNING AT <hostname>
= EXIT STATUS: 143
===================================================================================
However, when I set FI_PROVIDER=sockets at runtime, I obtain the expected output (it also works with tcp as the provider):
> mpirun -n 4 ./hello_mpi_mpivars-noargs
Hello world from processor <hostname>, rank 2 out of 4 processors
Hello world from processor <hostname>, rank 3 out of 4 processors
Hello world from processor <hostname>, rank 1 out of 4 processors
Hello world from processor <hostname>, rank 0 out of 4 processors
I then tackled the compilation of PAPI using the same environment. It compiled successfully, and I was able to successfully run the PAPI tests I needed to complete.
I definitely had something amiss in my environment last week when I encountered the error during the PAPI configure step stating that the compiler could not create C executables, with config.log indicating that libfabric.so could not be found.
I am happy to use the steps above, summarized below, for process communication via the sockets or tcp fabric providers.
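To summarize the working recipe (a sketch; the install path is from my system and the rank count is an example):
# set up the GCC 9.1.0 environment first, then:
source /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpivars.sh
export FI_PROVIDER=sockets   # tcp also works
mpicc hello_mpi.c -o hello_mpi
mpirun -n 4 ./hello_mpi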
Thank you Anatoliy and Maksim for your helpful responses.
Regards,
-Rashawn