I wanted to test coarrays with oneAPI 2022 (the same problem occurs with 2023).
When I try to run, mpiexec crashes because it cannot find the config file. The reported filename has spurious extra characters appended to it (the same characters every time).
$ ifx -coarray=distributed -coarray-config-file=$HOME/caf.cfg demo.f90
$ ./a.out
[mpiexec@xeonmax] HYD_parse_configfile (../../../../../src/pm/i_hydra/mpiexec/mpiexec_params.c:1110): unable to open config file: /gpfs/home/arcurtis/caf.cfgintr#int#4
[mpiexec@xeonmax] parse_compound_configfile (../../../../../src/pm/i_hydra/mpiexec/mpiexec_params.c:1219): error parsing config file /gpfs/home/arcurtis/caf.cfgintr#int#4
[mpiexec@xeonmax] mpiexec_get_parameters (../../../../../src/pm/i_hydra/mpiexec/mpiexec_params.c:1482): no executable specified
[mpiexec@xeonmax] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1783): error parsing parameters
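For reference, the test is essentially a one-line coarray program plus a one-line mpiexec config file along these lines (the contents shown here are illustrative, not the verbatim files):
demo.f90:
program demo
  implicit none
  ! print one line per coarray image
  print *, 'Hello from image', this_image(), 'of', num_images()
end program demo
caf.cfg:
-n 4 ./a.out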
Hi,
thanks for reporting this, we will get that bug fixed.
In the meantime, you can use the environment variable to override the file name:
FOR_COARRAY_CONFIG_FILE=$HOME/caf.cfg ./a.out
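For repeated runs you can also export it once (assuming a bash-like shell):
export FOR_COARRAY_CONFIG_FILE=$HOME/caf.cfg
./a.out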
Thanks for the workaround. Here is what happens now with a completely empty program and "-n 1 ./a.out" in the config file. The same thing happens with "single" coarray mode.
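The test files are essentially the following (the empty program shown here is a representative sketch):
empty.f90:
program empty
end program empty
caf.cfg:
-n 1 ./a.out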
$ ifx -coarray=shared -coarray-config-file=$HOME/caf.cfg empty.f90
$ FOR_COARRAY_CONFIG_FILE=$HOME/caf.cfg ./a.out
[xm002:422794:0:422794] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x39)
==== backtrace (tid: 422794) ====
0 0x0000000000054df0 __GI___sigaction() :0
1 0x000000000002efd8 uct_ib_mlx5_devx_md_get_pdn() ???:0
2 0x0000000000032202 uct_ib_mlx5_devx_create_cq() ???:0
3 0x000000000001cc86 uct_ib_md_close() ???:0
4 0x000000000001ce11 uct_ib_md_close() ???:0
5 0x00000000000650d3 ucs_rcache_create_region() ???:0
6 0x000000000001d5f6 uct_ib_iface_estimate_perf() ???:0
7 0x0000000000037f3f ucp_request_purge_enqueue_cb() ???:0
8 0x000000000003fe68 ucp_memh_get_slow() ???:0
9 0x00000000000401b8 ucp_memh_get_slow() ???:0
10 0x000000000004129b ucp_mem_map() ???:0
11 0x00000000000094dd mlx_mr_regattr() osd.c:0
12 0x00000000000095b8 mlx_mr_regv() osd.c:0
13 0x00000000000095fa mlx_mr_reg() osd.c:0
14 0x000000000063299a fi_mr_reg() /p/pdsd/scratch/Uploads/IMPI/other/software/libfabric/linux/v1.9.0/include/rdma/fi_domain.h:313
15 0x000000000063299a win_allgather() /build/impi/_buildspace/release/../../src/mpid/ch4/netmod/ofi/ofi_win.c:206
16 0x0000000000247467 MPIDIG_mpi_win_create() /build/impi/_buildspace/release/../../src/mpid/ch4/src/ch4r_win.c:821
17 0x0000000000265a22 MPID_Win_create() /build/impi/_buildspace/release/../../src/mpid/ch4/src/ch4_win.c:109
18 0x000000000079ca0c PMPI_Win_create() /build/impi/_buildspace/release/../../src/mpi/rma/win_create.c:173
19 0x00000000000174e2 for_rtl_ICAF_INIT() ???:0
20 0x0000000000407c34 for_rtl_init_() ???:0
21 0x00000000004048cd main() ???:0
22 0x000000000003feb0 __libc_start_call_main() ???:0
23 0x000000000003ff60 __libc_start_main_alias_2() :0
24 0x00000000004047d5 _start() ???:0
=================================
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 422794 RUNNING AT xm002
= KILLED BY SIGNAL: 11 (Segmentation fault)
===================================================================================
Did you verify that Intel MPI is working properly on your setup?
Can you please add '-genv I_MPI_DEBUG=10' to the config file and post the result?
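Assuming the single-line config file mentioned above, it would then read, for example:
-genv I_MPI_DEBUG=10 -n 1 ./a.out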
A simple hello-world type MPI program (sketched at the end of this post) runs fine.
If I run the coarray executable directly, there is no MPI debugging output. If I run it explicitly via mpirun, I get the output below before the seg fault messages:
[0] MPI startup(): Intel(R) MPI Library, Version 2021.6 Build 20220227 (id: 28877f3f32)
[0] MPI startup(): Copyright (C) 2003-2022 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): File "/gpfs/software/intel/oneAPI/2022_2/mpi/2021.6.0/etc/tuning_icx_shm-ofi_mlx_400.dat" not found
[0] MPI startup(): Load tuning file: "/gpfs/software/intel/oneAPI/2022_2/mpi/2021.6.0/etc/tuning_icx_shm-ofi_mlx.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): tag bits available: 20 (TAG_UB value: 1048575)
[0] MPI startup(): source bits available: 21 (Maximal number of rank: 2097151)
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 754023 xm001 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95}
[0] MPI startup(): I_MPI_ROOT=/gpfs/software/intel/oneAPI/2022_2/mpi/2021.6.0
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_HYDRA_BOOTSTRAP=slurm
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=10
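For reference, the hello-world MPI test mentioned above was along these lines (a sketch; the actual test program may have differed in details), built with mpiifort and run with mpirun:
program hello_mpi
  use mpi
  implicit none
  integer :: ierr, rank, nranks
  ! initialize MPI and report this rank
  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)
  print *, 'Hello from rank', rank, 'of', nranks
  call MPI_Finalize(ierr)
end program hello_mpi
$ mpiifort hello_mpi.f90
$ mpirun -n 2 ./a.out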
There have been some issues with Intel MPI that should have been fixed in 2023.2. Can you please also update Intel MPI?
Waiting for the new version to be installed. Will check back. Thanks.
Hi,
any news from your side?
Nothing yet. Apparently there were some issues with MPI in 2023.2 here.
OK, I will wait until the end of the week. If the issue remains with Intel MPI 2021.10, please open a new thread.
If your compute center does not manage to install the updated MPI, it should consider opening a priority support ticket to resolve the Intel MPI upgrade issue, or you may post a problem description in the Intel oneAPI HPC Toolkit forum:
https://community.intel.com/t5/Intel-oneAPI-HPC-Toolkit/bd-p/oneapi-hpc-toolkit
We fixed the config-file problem in our development version; the fix will be included in the next release.