Intel® MPI Library

Performance degradation with larger messages on KNL (>128 MB)

Zhoulong_J_Intel
Employee

Hi, 

When I run the Intel MPI Benchmarks (IMB-MPI1), I always see an obvious performance drop when the buffer size exceeds 128 MB with OFI. Is this expected, or is there some configuration I should change? Thanks.

mpirun -genv I_MPI_STATS=ipm    -genv I_MPI_FABRICS=tmi -n 2  -ppn 1 -f hostfile IMB-MPI1 -msglog 20:29 -iter 20000,1000 uniband -time 1000000  -mem 2

#---------------------------------------------------
# Benchmarking Uniband 
# #processes = 2 
#---------------------------------------------------
       #bytes #repetitions   Mbytes/sec      Msg/sec
            0        20000         0.00       607266
      1048576         1000      8356.72         7970
      2097152          500      8847.71         4219
      4194304          250      9295.13         2216
      8388608          125      9205.23         1097
     16777216           62      9498.35          566
     33554432           31      9577.55          285
     67108864           15      9564.31          143
    134217728            7      9523.83           71
    268435456            3      2700.73           10   <-- performance drops sharply here
    536870912            1      3514.32            7

 

Mikhail_S_Intel
Employee

Hi Zhoulong,

It is expected that you see a drop for messages larger than 128 MB. It is related to the memory registration mechanism at the OPA driver level. You need to adjust the OPA driver parameters for such large messages. The OPA driver parameters can be found at /sys/module/hfi1/parameters.
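
To see what the driver is currently using before changing anything, the values can be read straight from that directory. The following is only a minimal sketch: the function name show_module_params is made up for illustration, and it assumes the hfi1 driver is loaded so that the directory exists.

```shell
# Print every readable module parameter as name=value.
# The directory defaults to the hfi1 path mentioned above; pass another
# directory as the first argument to inspect a different module.
show_module_params() {
    dir="${1:-/sys/module/hfi1/parameters}"
    if [ ! -d "$dir" ]; then
        echo "not found: $dir (is the hfi1 driver loaded?)" >&2
        return 1
    fi
    for f in "$dir"/*; do
        # Skip entries that cannot be read (or an unmatched glob).
        [ -r "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Show the current hfi1 settings; do nothing harmful if the driver is absent.
show_module_params || true
```

Running it on each node in the hostfile is a simple way to confirm that both ends of the benchmark use the same settings.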

Try the following:

  • set max_mtu=10240
  • increase cache_size from 256 to a higher value (for example, 512 or 1024)
  • decrease num_user_contexts (for example, to 16)
  • use huge pages in your application
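
One possible way to make the first three settings persistent is a modprobe options line. This is a sketch under assumptions, not a verified recipe: the helper name hfi1_options_line is hypothetical, the file name /etc/modprobe.d/hfi1.conf is an assumption about your system, and the values are just the examples from the list above.

```shell
# Build an "options hfi1 ..." line from name=value arguments.
hfi1_options_line() {
    printf 'options hfi1'
    for kv in "$@"; do
        printf ' %s' "$kv"
    done
    printf '\n'
}

# Example values taken from the suggestions above.
hfi1_options_line max_mtu=10240 cache_size=512 num_user_contexts=16
# prints: options hfi1 max_mtu=10240 cache_size=512 num_user_contexts=16

# To apply (as root; the file name is an assumption):
#   hfi1_options_line max_mtu=10240 cache_size=512 num_user_contexts=16 \
#       > /etc/modprobe.d/hfi1.conf
# Module options are typically picked up only when the driver is loaded,
# so reload hfi1 or reboot the node afterwards.
```

Change one parameter at a time and rerun the same IMB Uniband command, so that any change in the 256 MB and 512 MB rows can be attributed to a specific setting.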

More information about OPA performance tuning can be found here:
https://www.intel.com/content/dam/support/us/en/documents/network-and-i-o/fabric-products/Intel_OP_Performance_Tuning_UG_H93143_v10_0.pdf
 
