Intel® oneAPI HPC Toolkit

3rd gen Xeon showed slower performance with intel MPI library

Kuni
New Contributor I

We are currently studying network traffic in HPC use cases. For this we are using the Intel MPI Library (the latest, from the Intel HPC toolkit as of 12/10/2022) and the NAS Parallel Benchmarks (3.4.2). Before measuring network traffic, I measured performance without any network traffic. We used the following platforms:

 

Machine 1: Xeon Silver 4310 server, 8-channel 64 GB RAM, Hyper-Threading on, CentOS 7.9, Turbo on

Machine 2: Xeon Silver 4214 server, 6-channel 96 GB RAM, Hyper-Threading on, CentOS 7.9, Turbo off

Machine 3: 4-core, 8 GB RAM virtual machine on machine 1, CentOS 7.9

Machine 4: 4-core, 8 GB RAM virtual machine on machine 2, CentOS 7.9

 

Results: 

Test 1. mpirun -n 4 ./bin/bt.B.x (4 processes, smaller array: 102 x 102 x 102)

machine 1.  49.87 sec

machine 2. 62.02 sec

machine 3. 43.92 sec

machine 4. 63.11 sec

 

Test 2. mpirun -n 4 ./bin/bt.C.x (4 processes, larger array: 162 x 162 x 162)

machine 1. 388.57 sec

machine 2. 253.40 sec

machine 3. 201.79 sec

machine 4. 256.78 sec

 

For test 1 above, the results were understandable: the performance differences were not strange, and the expected results were shown.

 

However, in the second test I saw very strange results. There are two unexpected things:

1. The newer (3rd) generation Xeon was much slower than the older (2nd) generation Xeon on the real machines.

2. The newer (3rd) generation Xeon showed a big improvement when the benchmark was executed in a virtual machine.

 

Regarding memory, machine 2 has more RAM than machine 1 (96 GB vs 64 GB); however, test 2 (bt.C.x) only consumes about 4 GB (per the free command), so the memory size difference should not have such a large effect on the results.
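As a rough sanity check on that point, the size of a single grid-sized array in bt.C.x can be computed (a back-of-envelope sketch; the total array count inside BT is an assumption here, not something reported in this thread):

```shell
# One double-precision array over the class C grid: 162^3 points x 8 bytes.
bytes=$(( 162 * 162 * 162 * 8 ))
echo "$bytes bytes per array (~32 MiB)"
# Even on the order of 100 such arrays would total only ~3 GiB, which is
# consistent with the ~4 GB reported by `free` and far below 64 GB or 96 GB.
```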

 

I also executed the tests with Open MPI 4.1; the following are the results:

Test 1. mpirun -np 4 ./bin/bt.B.x (4 processes, smaller array)

machine 1.  52.31 sec

machine 2.  61.73 sec

 

Test 2. mpirun -np 4 ./bin/bt.C.x (4 processes, larger array)

machine 1. 198.70 sec

machine 2. 252.31 sec

 

These results suggest that the combination of Intel MPI, a 3rd Gen Xeon, and large array handling causes the slowdown, and that I cannot use Intel MPI with the 3rd Gen Xeon as-is. However, Intel MPI makes it much easier to specify the fabric, so I would like to use it for our network traffic evaluation if possible. Therefore, I would like to know the following:
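For reference, the size of the gap can be quantified directly from the bt.C.x timings reported above:

```shell
# Run-time ratio of Intel MPI to Open MPI for bt.C.x on each physical machine.
awk 'BEGIN {
    printf "machine 1: %.2fx slower with Intel MPI\n", 388.57 / 198.70
    printf "machine 2: %.2fx (essentially identical)\n", 253.40 / 252.31
}'
```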

 

1. Why did the 3rd Gen Xeon show slow performance? And why did it not appear in the virtual machine case, even on the same 3rd Gen Xeon?

2. Why does the performance drop appear with the Intel MPI library?

3. Is there any way to improve performance with Intel MPI on a 3rd Gen Xeon?

 

Please help!

 

K. Kunita

SantoshY_Intel
Moderator

Hi,

 

Thanks for posting in the Intel forums.

 

Could you please provide us with the following details which would help us in further investigation of your issue?

  1. What job scheduler are you using?
  2. What FI_PROVIDER (mlx/psm2/verbs, etc.) are you using?
  3. What interconnect hardware (InfiniBand/Intel Omni-Path, etc.) are you using?
  4. What Intel MPI version are you using?
  5. Also, please provide sample reproducer code so that we can reproduce the issue on our end.

 

Thanks & Regards,

Santosh

 


Kuni
New Contributor I

I did not use a job scheduler. I just ran the commands "mpirun -np 4 ./bin/bt.C.x" and "mpirun -n 4 ./bin/bt.B.x".

The issue happened without any node-to-node communication; I used only one server, so communication should be over loopback sockets or shared memory, and FI_PROVIDER should not have an effect. For reference, I also ran "mpirun -n 4 -genv FI_PROVIDER tcp ./bin/bt.X.x" (X is B or C); the result was the same.
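One further isolation step that could be tried (a sketch, not something already tested in this thread: `I_MPI_FABRICS` is a documented Intel MPI variable, and `shm` is a standard libfabric provider) is to force the shared-memory path explicitly, bypassing OFI provider selection altogether:

```shell
# Restrict Intel MPI to its shared-memory fabric only (single-node runs).
I_MPI_FABRICS=shm mpirun -n 4 ./bin/bt.C.x

# Alternatively, keep OFI but select the libfabric shm provider explicitly.
mpirun -n 4 -genv FI_PROVIDER shm ./bin/bt.C.x
```

If the bt.C.x time drops to Open MPI levels with either setting, provider selection rather than the CPU generation would be the likely culprit.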

 

The mpirun version is from the latest Intel HPC toolkit:

$ mpirun --version
Intel(R) MPI Library for Linux* OS, Version 2021.7 Build 20221022 (id: f7b29a2495)
Copyright 2003-2022, Intel Corporation.

 

To reproduce on your side, the following procedure can be used:

 

On CentOS 7.9,

# su -

# yum update

- install intel-basekit and intel-hpckit per Intel's instructions: basically, set up the oneAPI repository and then

# yum install intel-basekit

# yum install intel-hpckit

# exit

- download the NPB 3.4.2 software

$ .  /opt/intel/oneapi/setvars.sh

$ wget https://www.nas.nasa.gov/assets/npb/NPB3.4.2.tar.gz

$ tar xzf NPB3.4.2.tar.gz

$ sudo yum install centos-release-scl

$ sudo yum install devtoolset-9

$ scl enable devtoolset-9 bash

$ cd npb/NPB3.4.2/NPB3.4-MPI

$ cp config/make.def.template config/make.def

$ cp config/suite.def.template config/suite.def

$ vim config/make.def

change the following settings:

MPIFC = /opt/intel/oneapi/mpi/latest/bin/mpif90
FMPI_LIB = -L/opt/intel/oneapi/mpi/latest/lib -lmpi
FMPI_INC = -I/opt/intel/oneapi/mpi/latest/include
MPICC = /opt/intel/oneapi/mpi/latest/bin/mpicc
CMPI_LIB = -L/opt/intel/oneapi/mpi/latest/lib -lmpi
CMPI_INC = -I/opt/intel/oneapi/mpi/latest/include

$ vim config/suite.def

delete all non-comment lines and add the following:

bt<tab>B

bt<tab>C

$ make suite

$ mpirun -n 4 ./bin/bt.B.x

$ mpirun -n 4 ./bin/bt.C.x

 

For the virtual machines, you can create them in your OS's standard way.

 

 

 

SantoshY_Intel
Moderator
479 Views

Hi,

 

Thanks for providing all the requested details.

 

Could you please provide the outputs for the below commands after initializing the Intel oneAPI environment:

 

fi_info -l
ibv_devinfo
lspci | grep Mellanox
lspci | grep Omni-Path

 

 

Also, please provide the complete debug log for the command below:

 

I_MPI_DEBUG=30 mpirun -n 4 ./bin/bt.B.x

 

 

Thanks & Regards,

Santosh

 

 

 


 

Kuni
New Contributor I

Hi Santosh,

 

Thank you for your quick response. 

The following is what you requested:

 

[kkunita@svr4 NPB3.4-MPI]$ fi_info -l
psm2:
version: 113.20
mlx:
version: 1.4
psm3:
version: 1103.0
psm3:
version: 1102.0
ofi_rxm:
version: 113.20
verbs:
version: 113.20
verbs:
version: 113.20
tcp:
version: 113.20
sockets:
version: 113.20
shm:
version: 114.0
ofi_hook_noop:
version: 113.20
[kkunita@svr4 NPB3.4-MPI]$ ibv_devinfo
hca_id: rdmap24s0f0
transport: InfiniBand (0)
fw_ver: 1.60
node_guid: 669d:99ff:feff:ff5e
sys_image_guid: 649d:99ff:ff5e:0000
vendor_id: 0x8086
vendor_part_id: 5522
hw_ver: 0x2
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 1
port_lmc: 0x00
link_layer: Ethernet

hca_id: irdma1
transport: InfiniBand (0)
fw_ver: 1.60
node_guid: 669d:99ff:feff:ff5f
sys_image_guid: 649d:99ff:ff5f:0000
vendor_id: 0x8086
vendor_part_id: 5522
hw_ver: 0x2
phys_port_cnt: 1
port: 1
state: PORT_DOWN (1)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 1
port_lmc: 0x00
link_layer: Ethernet

[kkunita@svr4 NPB3.4-MPI]$ lspci |grep Mellanox
[kkunita@svr4 NPB3.4-MPI]$ lspci |grep Omni_Path
[kkunita@svr4 NPB3.4-MPI]$ I_MPI_DEBUG=30 mpirun -n 4 ./bin/bt.B.x
[0] MPI startup(): Intel(R) MPI Library, Version 2021.7 Build 20221022 (id: f7b29a2495)
[0] MPI startup(): Copyright (C) 2003-2022 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): shm segment size (342 MB per rank) * (4 local ranks) = 1368 MB total
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
libfabric:24286:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:24286:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:24286:core:core:ze_hmem_dl_init():422<warn> Failed to dlopen libze_loader.so
libfabric:24286:core:core:ofi_hmem_init():214<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: verbs (113.20)
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: verbs (113.20)
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: tcp (113.20)
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: sockets (113.20)
libfabric:24286:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:24286:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:24286:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_ZE not supported
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: shm (114.0)
libfabric:24286:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:24286:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:24286:core:core:ze_hmem_dl_init():422<warn> Failed to dlopen libze_loader.so
libfabric:24286:core:core:ofi_hmem_init():214<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: ofi_rxm (113.20)
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: psm2 (113.20)
libfabric:24286:psm3:core:fi_prov_ini():752<info> build options: VERSION=1102.0=11.2.0.0, HAVE_PSM3_src=1, PSM3_CUDA=0
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: psm3 (1102.0)
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: mlx (1.4)
libfabric:24286:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:24286:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:24286:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_ZE not supported
libfabric:24286:psm3:core:fi_prov_ini():785<info> build options: VERSION=1103.0=11.3.0.0, HAVE_PSM3_src=1, PSM3_CUDA=0
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: psm3 (1103.0)
libfabric:24286:core:core:ofi_register_provider():474<info> registering provider: ofi_hook_noop (113.20)
libfabric:24286:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:24286:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:24286:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:24286:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:24286:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:24286:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:24286:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:24286:core:core:ze_hmem_dl_init():422<warn> Failed to dlopen libze_loader.so
libfabric:24286:core:core:ofi_hmem_init():214<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:24286:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:24286:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:24286:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:24286:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:24286:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:24286:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:24286:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:24286:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:24286:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:24286:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
[0] MPI startup(): max_ch4_vnis: 1, max_reg_eps 64, enable_sep 0, enable_shared_ctxs 0, do_av_insert 0
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
libfabric:24286:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:24286:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
[0] MPI startup(): libfabric provider: psm3
[0] MPI startup(): detected psm3 provider, set device name to "psm3"
libfabric:24286:core:core:fi_fabric_():1423<info> Opened fabric: RoCE-192.168.17.0/24
libfabric:24286:core:core:ofi_shm_map():171<warn> shm_open failed
libfabric:24286:core:core:ofi_ns_add_local_name():370<warn> Cannot add local name - name server uninitialized
[0] MPI startup(): addrnamelen: 32
[0] MPI startup(): File "/opt/intel/oneapi/mpi/2021.7.1/etc/tuning_icx_shm-ofi_psm3_100.dat" not found
[0] MPI startup(): Load tuning file: "/opt/intel/oneapi/mpi/2021.7.1/etc/tuning_icx_shm-ofi_psm3.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): threading: num_pools: 1
[0] MPI startup(): threading: enable_sep: 0
[0] MPI startup(): threading: direct_recv: 1
[0] MPI startup(): threading: zero_op_flags: 0
[0] MPI startup(): threading: num_am_buffers: 1
[0] MPI startup(): tag bits available: 30 (TAG_UB value: 1073741823)
[0] MPI startup(): source bits available: 30 (Maximal number of rank: 1073741823)
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 24286 svr4 {0,1,2,12,13,14}
[0] MPI startup(): 1 24287 svr4 {3,4,5,15,16,17}
[0] MPI startup(): 2 24288 svr4 {6,7,8,18,19,20}
[0] MPI startup(): 3 24289 svr4 {9,10,11,21,22,23}
[0] MPI startup(): I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.7.1
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=30
[0] allocate handle (kind=1, size=744, direct_size=8, indirect_size=1) ptr=0x7f2002efe740
[0] allocate handle (kind=2, size=40, direct_size=8, indirect_size=1) ptr=0x7f100004c440


NAS Parallel Benchmarks 3.4 -- BT Benchmark

No input file inputbt.data. Using compiled defaults
Size: 102x 102x 102 (class B)
Iterations: 200 dt: 0.0003000
Total number of processes: 4

Time step 1
Time step 20
Time step 40
Time step 60
Time step 80
Time step 100
Time step 120
Time step 140
Time step 160
Time step 180
Time step 200
Verification being performed for class B
accuracy setting for epsilon = 0.1000000000000E-07
Comparison of RMS-norms of residual
1 0.1423359722929E+04 0.1423359722929E+04 0.1070287152945E-13
2 0.9933052259015E+02 0.9933052259015E+02 0.7153317200312E-15
3 0.3564602564454E+03 0.3564602564454E+03 0.5900255245348E-14
4 0.3248544795908E+03 0.3248544795908E+03 0.9798945854817E-14
5 0.3270754125466E+04 0.3270754125466E+04 0.1223502756335E-13
Comparison of RMS-norms of solution error
1 0.5296984714094E+02 0.5296984714094E+02 0.9389868800427E-15
2 0.4463289611567E+01 0.4463289611567E+01 0.1293476388601E-13
3 0.1312257334221E+02 0.1312257334221E+02 0.1258908460682E-13
4 0.1200692532356E+02 0.1200692532356E+02 0.6805440394643E-14
5 0.1245957615104E+03 0.1245957615104E+03 0.1003690013030E-13
Verification Successful


BT Benchmark Completed.
Class = B
Size = 102x 102x 102
Iterations = 200
Time in seconds = 52.59
Total processes = 4
Active processes= 4
Mop/s total = 13350.83
Mop/s/process = 3337.71
Operation type = floating point
Verification = SUCCESSFUL
Version = 3.4.2
Compile date = 22 Sep 2022

Compile options:
MPIFC = /opt/intel/oneapi/mpi/latest/bin/mpif90
FLINK = $(MPIFC)
FMPI_LIB = -L/opt/intel/oneapi/mpi/latest/lib -lmpi
FMPI_INC = -I/opt/intel/oneapi/mpi/latest/include
FFLAGS = -O3
FLINKFLAGS = $(FFLAGS)
RAND = (none)


Please send feedbacks and/or the results of this run to:

NPB Development Team
Internet: npb@nas.nasa.gov

 

Regards, K. Kunita

Kuni
New Contributor I

Hi Santosh,

 

Thank you for your quick response.

 

Following is the screen output of the requested commands. It was executed on machine 1 (3rd Gen Xeon Scalable processor).

If you want to see the same output from the other machines, please let me know.

 

$ fi_info -l
psm2:
version: 113.20
mlx:
version: 1.4
psm3:
version: 1103.0
psm3:
version: 1102.0
ofi_rxm:
version: 113.20
verbs:
version: 113.20
verbs:
version: 113.20
tcp:
version: 113.20
sockets:
version: 113.20
shm:
version: 114.0
ofi_hook_noop:
version: 113.20

ibv_devinfo
hca_id: rdmap24s0f0
transport: InfiniBand (0)
fw_ver: 1.60
node_guid: 669d:99ff:feff:ff5e
sys_image_guid: 649d:99ff:ff5e:0000
vendor_id: 0x8086
vendor_part_id: 5522
hw_ver: 0x2
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 1
port_lmc: 0x00
link_layer: Ethernet

hca_id: irdma1
transport: InfiniBand (0)
fw_ver: 1.60
node_guid: 669d:99ff:feff:ff5f
sys_image_guid: 649d:99ff:ff5f:0000
vendor_id: 0x8086
vendor_part_id: 5522
hw_ver: 0x2
phys_port_cnt: 1
port: 1
state: PORT_DOWN (1)
max_mtu: 4096 (5)
active_mtu: 1024 (3)
sm_lid: 0
port_lid: 1
port_lmc: 0x00
link_layer: Ethernet

 

[kkunita@svr4 NPB3.4-MPI]$ lspci | grep Mellanox


[kkunita@svr4 NPB3.4-MPI]$ lspci | grep Omini-Path

 

$ I_MPI_DEBUG=30 mpirun -n 4 ./bin/bt.B.x
[0] MPI startup(): Intel(R) MPI Library, Version 2021.7 Build 20221022 (id: f7b29a2495)
[0] MPI startup(): Copyright (C) 2003-2022 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): shm segment size (342 MB per rank) * (4 local ranks) = 1368 MB total
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
libfabric:29820:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:29820:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:29820:core:core:ze_hmem_dl_init():422<warn> Failed to dlopen libze_loader.so
libfabric:29820:core:core:ofi_hmem_init():214<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: verbs (113.20)
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: verbs (113.20)
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: tcp (113.20)
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: sockets (113.20)
libfabric:29820:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:29820:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:29820:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_ZE not supported
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: shm (114.0)
libfabric:29820:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:29820:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:29820:core:core:ze_hmem_dl_init():422<warn> Failed to dlopen libze_loader.so
libfabric:29820:core:core:ofi_hmem_init():214<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: ofi_rxm (113.20)
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: psm2 (113.20)
libfabric:29820:psm3:core:fi_prov_ini():752<info> build options: VERSION=1102.0=11.2.0.0, HAVE_PSM3_src=1, PSM3_CUDA=0
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: psm3 (1102.0)
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: mlx (1.4)
libfabric:29820:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:29820:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:29820:core:core:ofi_hmem_init():222<info> Hmem iface FI_HMEM_ZE not supported
libfabric:29820:psm3:core:fi_prov_ini():785<info> build options: VERSION=1103.0=11.3.0.0, HAVE_PSM3_src=1, PSM3_CUDA=0
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: psm3 (1103.0)
libfabric:29820:core:core:ofi_register_provider():474<info> registering provider: ofi_hook_noop (113.20)
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:29820:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:29820:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:29820:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:29820:core:core:ze_hmem_dl_init():422<warn> Failed to dlopen libze_loader.so
libfabric:29820:core:core:ofi_hmem_init():214<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:29820:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:29820:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:29820:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
[0] MPI startup(): max_ch4_vnis: 1, max_reg_eps 64, enable_sep 0, enable_shared_ctxs 0, do_av_insert 0
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
[0] MPI startup(): libfabric provider: psm3
[0] MPI startup(): detected psm3 provider, set device name to "psm3"
libfabric:29820:core:core:fi_fabric_():1423<info> Opened fabric: RoCE-192.168.17.0/24
libfabric:29820:core:core:ofi_shm_map():171<warn> shm_open failed
[0] MPI startup(): addrnamelen: 32
libfabric:29820:core:core:ofi_ns_add_local_name():370<warn> Cannot add local name - name server uninitialized
[0] MPI startup(): File "/opt/intel/oneapi/mpi/2021.7.1/etc/tuning_icx_shm-ofi_psm3_100.dat" not found
[0] MPI startup(): Load tuning file: "/opt/intel/oneapi/mpi/2021.7.1/etc/tuning_icx_shm-ofi_psm3.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): threading: num_pools: 1
[0] MPI startup(): threading: enable_sep: 0
[0] MPI startup(): threading: direct_recv: 1
[0] MPI startup(): threading: zero_op_flags: 0
[0] MPI startup(): threading: num_am_buffers: 1
[0] MPI startup(): tag bits available: 30 (TAG_UB value: 1073741823)
[0] MPI startup(): source bits available: 30 (Maximal number of rank: 1073741823)
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 29820 svr4 {0,1,2,12,13,14}
[0] MPI startup(): 1 29821 svr4 {3,4,5,15,16,17}
[0] MPI startup(): 2 29822 svr4 {6,7,8,18,19,20}
[0] MPI startup(): 3 29823 svr4 {9,10,11,21,22,23}
[0] MPI startup(): I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.7.1
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=30
[0] allocate handle (kind=1, size=744, direct_size=8, indirect_size=1) ptr=0x7f2002f57d80
[0] allocate handle (kind=2, size=40, direct_size=8, indirect_size=1) ptr=0x7f10000d5900


NAS Parallel Benchmarks 3.4 -- BT Benchmark

No input file inputbt.data. Using compiled defaults
Size: 102x 102x 102 (class B)
Iterations: 200 dt: 0.0003000
Total number of processes: 4

Time step 1
Time step 20
Time step 40
Time step 60
Time step 80
Time step 100
Time step 120
Time step 140
Time step 160
Time step 180
Time step 200
Verification being performed for class B
accuracy setting for epsilon = 0.1000000000000E-07
Comparison of RMS-norms of residual
1 0.1423359722929E+04 0.1423359722929E+04 0.1070287152945E-13
2 0.9933052259015E+02 0.9933052259015E+02 0.7153317200312E-15
3 0.3564602564454E+03 0.3564602564454E+03 0.5900255245348E-14
4 0.3248544795908E+03 0.3248544795908E+03 0.9798945854817E-14
5 0.3270754125466E+04 0.3270754125466E+04 0.1223502756335E-13
Comparison of RMS-norms of solution error
1 0.5296984714094E+02 0.5296984714094E+02 0.9389868800427E-15
2 0.4463289611567E+01 0.4463289611567E+01 0.1293476388601E-13
3 0.1312257334221E+02 0.1312257334221E+02 0.1258908460682E-13
4 0.1200692532356E+02 0.1200692532356E+02 0.6805440394643E-14
5 0.1245957615104E+03 0.1245957615104E+03 0.1003690013030E-13
Verification Successful


BT Benchmark Completed.
Class = B
Size = 102x 102x 102
Iterations = 200
Time in seconds = 52.97
Total processes = 4
Active processes= 4
Mop/s total = 13256.00
Mop/s/process = 3314.00
Operation type = floating point
Verification = SUCCESSFUL
Version = 3.4.2
Compile date = 22 Sep 2022

Compile options:
MPIFC = /opt/intel/oneapi/mpi/latest/bin/mpif90
FLINK = $(MPIFC)
FMPI_LIB = -L/opt/intel/oneapi/mpi/latest/lib -lmpi
FMPI_INC = -I/opt/intel/oneapi/mpi/latest/include
FFLAGS = -O3
FLINKFLAGS = $(FFLAGS)
RAND = (none)


Please send feedbacks and/or the results of this run to:

NPB Development Team
Internet: npb@nas.nasa.gov

 

Regards, K. Kunita

libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:29820:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:29820:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:29820:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:29820:core:core:ze_hmem_dl_init():422<warn> Failed to dlopen libze_loader.so
libfabric:29820:core:core:ofi_hmem_init():214<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:29820:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:29820:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;psm3 layering
libfabric:29820:core:core:ofi_layering_ok():1001<info> Need core provider, skipping ofi_rxm
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;sockets layering
libfabric:29820:core:core:ofi_layering_ok():1007<info> Skipping util;shm layering
[0] MPI startup(): max_ch4_vnis: 1, max_reg_eps 64, enable_sep 0, enable_shared_ctxs 0, do_av_insert 0
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
libfabric:29820:core:core:fi_getinfo_():1138<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:29820:core:core:fi_getinfo_():1201<info> Start regular provider search because provider with the highest priority psm2 can not be initialized
[0] MPI startup(): libfabric provider: psm3
[0] MPI startup(): detected psm3 provider, set device name to "psm3"
libfabric:29820:core:core:fi_fabric_():1423<info> Opened fabric: RoCE-192.168.17.0/24
libfabric:29820:core:core:ofi_shm_map():171<warn> shm_open failed
[0] MPI startup(): addrnamelen: 32
libfabric:29820:core:core:ofi_ns_add_local_name():370<warn> Cannot add local name - name server uninitialized
[0] MPI startup(): File "/opt/intel/oneapi/mpi/2021.7.1/etc/tuning_icx_shm-ofi_psm3_100.dat" not found
[0] MPI startup(): Load tuning file: "/opt/intel/oneapi/mpi/2021.7.1/etc/tuning_icx_shm-ofi_psm3.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): threading: num_pools: 1
[0] MPI startup(): threading: enable_sep: 0
[0] MPI startup(): threading: direct_recv: 1
[0] MPI startup(): threading: zero_op_flags: 0
[0] MPI startup(): threading: num_am_buffers: 1
[0] MPI startup(): tag bits available: 30 (TAG_UB value: 1073741823)
[0] MPI startup(): source bits available: 30 (Maximal number of rank: 1073741823)
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 29820 svr4 {0,1,2,12,13,14}
[0] MPI startup(): 1 29821 svr4 {3,4,5,15,16,17}
[0] MPI startup(): 2 29822 svr4 {6,7,8,18,19,20}
[0] MPI startup(): 3 29823 svr4 {9,10,11,21,22,23}
[0] MPI startup(): I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.7.1
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=30
[0] allocate handle (kind=1, size=744, direct_size=8, indirect_size=1) ptr=0x7f2002f57d80
[0] allocate handle (kind=2, size=40, direct_size=8, indirect_size=1) ptr=0x7f10000d5900


NAS Parallel Benchmarks 3.4 -- BT Benchmark

No input file inputbt.data. Using compiled defaults
Size: 102x 102x 102 (class B)
Iterations: 200 dt: 0.0003000
Total number of processes: 4

Time step 1
Time step 20
Time step 40
Time step 60
Time step 80
Time step 100
Time step 120
Time step 140
Time step 160
Time step 180
Time step 200
Verification being performed for class B
accuracy setting for epsilon = 0.1000000000000E-07
Comparison of RMS-norms of residual
1 0.1423359722929E+04 0.1423359722929E+04 0.1070287152945E-13
2 0.9933052259015E+02 0.9933052259015E+02 0.7153317200312E-15
3 0.3564602564454E+03 0.3564602564454E+03 0.5900255245348E-14
4 0.3248544795908E+03 0.3248544795908E+03 0.9798945854817E-14
5 0.3270754125466E+04 0.3270754125466E+04 0.1223502756335E-13
Comparison of RMS-norms of solution error
1 0.5296984714094E+02 0.5296984714094E+02 0.9389868800427E-15
2 0.4463289611567E+01 0.4463289611567E+01 0.1293476388601E-13
3 0.1312257334221E+02 0.1312257334221E+02 0.1258908460682E-13
4 0.1200692532356E+02 0.1200692532356E+02 0.6805440394643E-14
5 0.1245957615104E+03 0.1245957615104E+03 0.1003690013030E-13
Verification Successful


BT Benchmark Completed.
Class = B
Size = 102x 102x 102
Iterations = 200
Time in seconds = 52.97
Total processes = 4
Active processes= 4
Mop/s total = 13256.00
Mop/s/process = 3314.00
Operation type = floating point
Verification = SUCCESSFUL
Version = 3.4.2
Compile date = 22 Sep 2022

Compile options:
MPIFC = /opt/intel/oneapi/mpi/latest/bin/mpif90
FLINK = $(MPIFC)
FMPI_LIB = -L/opt/intel/oneapi/mpi/latest/lib -lmpi
FMPI_INC = -I/opt/intel/oneapi/mpi/latest/include
FFLAGS = -O3
FLINKFLAGS = $(FFLAGS)
RAND = (none)


Please send feedbacks and/or the results of this run to:

NPB Development Team
Internet: npb@nas.nasa.gov

 

Regards, K. Kunita

Kuni
New Contributor I
449 Views

Hi Santosh,

 

Thank you for your quick response.

 

It's strange: I tried to paste the console log here to answer your question, but it does not appear after I click "Post Reply". Is there a length limitation?

 

Anyway, I have attached a text log file that answers your question. Please take a look.

Kuni
New Contributor I
389 Views

Hi,  Santosh,

 

Oh, now I can see the replies I made earlier that were not visible before. As a result, three nearly identical replies are shown. Please ignore those and see the attached file for the answer to your question.

 

Regards, K. Kunita

SantoshY_Intel
Moderator
361 Views

Hi,


Thanks for providing all the requested details.


We are working on your issue & will get back to you soon.


Thanks & regards,

Santosh


Kuni
New Contributor I
196 Views

Do you have any update? Could you tell me whether you were able to reproduce the symptom? If you need any additional information from me, please let me know.

 

Regards, K. Kunita

SantoshY_Intel
Moderator
188 Views

Hi,

 

Sorry for the delay.

 

Could you please let us know whether you can run your application as a single process, without MPI, using the command below?

./bin/bt.B.x 

 

Thanks & Regards,

Santosh

 

Kuni
New Contributor I
174 Views

Yes, I can run it without MPI.

 

Regards, K. Kunita

SantoshY_Intel
Moderator
162 Views

Hi,

 

Thanks for the confirmation.

 

We couldn't reproduce your issue as we don't have access to the exact infrastructure.

 

We suggest you access Intel Devcloud and run your experiments there. Please get back to us if you still face the issue.

 

Thanks & Regards,

Santosh

 

 

 

Kuni
New Contributor I
146 Views

Is Intel Devcloud a virtual machine environment? If so, it is meaningless to try it: as I showed, the symptom does not occur in virtual machine environments, only on bare-metal machines. Could you tell me how you tried to reproduce the case (environment information: processor, OS, memory size, NIC and driver, Intel MPI version, NPB version, etc.)? If you tried the same things as I did and could not see the issue, that might point to a solution for me, or at least help to find the cause.
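The reproduction details listed above can be captured in one pass. The following sketch (the output file name `env_report.txt` and the exact tool selection are my own, not from the thread) assumes standard Linux utilities are available:

```shell
# Hypothetical one-shot capture of the reproduction details requested above:
# processor, OS, memory size, NIC, and Intel MPI version.
{
  echo "== CPU ==";    lscpu 2>/dev/null | grep -E 'Model name|Socket|Thread'
  echo "== OS ==";     head -1 /etc/os-release 2>/dev/null
  echo "== Memory =="; free -h 2>/dev/null | head -2
  echo "== NIC ==";    lspci 2>/dev/null | grep -i -E 'ethernet|infiniband'
  echo "== MPI ==";    mpirun --version 2>/dev/null | head -1
} > env_report.txt
```

Attaching the resulting `env_report.txt` from each side would make it straightforward to spot configuration differences between setups.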

 

Regards, K. Kunita

 

SantoshY_Intel
Moderator
129 Views

Hi,


>>>"Is Intel Devcloud virtual machine environment?"

No. You can try experimenting on Intel Devcloud and get back to us if you face the same issue.


Thanks & Regards,

Santosh


Kuni
New Contributor I
108 Views

I tried to log in to Intel Devcloud and found that the CPU is Skylake. The problem does not occur with Skylake-based Xeons; I only saw it with 3rd Gen Xeon (Ice Lake). So I think it is meaningless to try Devcloud with a 2nd Gen Xeon. Did you try to reproduce my problem with a 3rd Gen Xeon Scalable processor? I saw the issue only with the Intel Xeon Silver 4310 and Xeon Silver 4309Y processors; I could not see it with the Intel Xeon Silver 4214R.
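Since the symptom so far tracks the microarchitecture (Ice Lake affected, Skylake/Cascade Lake not), it helps to confirm what a candidate test node actually is before running anything. A small sketch (the `xeon_codename` helper and its model table are mine, not from the thread, and the table is deliberately partial):

```shell
# Hypothetical helper: map an Intel Family-6 CPU model number to the Xeon
# Scalable generations discussed in this thread (partial table).
xeon_codename() {
  case "$1" in
    85)      echo "Skylake-SP / Cascade Lake (1st/2nd Gen Xeon Scalable)" ;;
    106|108) echo "Ice Lake (3rd Gen Xeon Scalable)" ;;
    *)       echo "unknown model $1" ;;
  esac
}

# Usage: read the model number from /proc/cpuinfo on the node under test:
#   xeon_codename "$(awk '/^model\t/ {print $3; exit}' /proc/cpuinfo)"
```

For reference, the Silver 4310 and 4309Y (affected) are Ice Lake parts, while the Silver 4214R (unaffected) is Cascade Lake, which reports model 85.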

 

Regards, K. Kunita
