Thread: Error with MUMPS when running a large model
Forum: Intel® MPI Library · community.intel.com
Error with MUMPS when running a large model
Posted by Guoqi_Ma on 2024-06-24 15:49 GMT
https://community.intel.com/t5/Intel-MPI-Library/Error-with-MUMPS-when-running-a-large-model/m-p/1609347#M11761

Hi, I recently encountered an error when running a large model on up to 20 HPC nodes; when I run a small model (e.g. on 2 nodes), the errors are gone. Has anyone seen this error before? Thanks very much.

Error output:

```
Abort(1687183) on node 84 (rank 84 in comm 0): Fatal error in internal_Iprobe: Other MPI error, error stack:
internal_Iprobe(14309).........: MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, comm=0x84000006, flag=0x7ffe8b171570, status=0x7ffe8b171990) failed
MPID_Iprobe(389)...............:
MPIDI_Progress_test(105).......:
MPIDI_OFI_handle_cq_error(1127): OFI poll failed (ofi_events.c:1127:MPIDI_OFI_handle_cq_error:Transport endpoint is not connected)
Abort(405913231) on node 47 (rank 47 in comm 0): Fatal error in internal_Iprobe: Other MPI error, error stack:
internal_Iprobe(14309).........: MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, comm=0x84000006, flag=0x7ffe001734c0, status=0x7ffe001738e0) failed
MPID_Iprobe(385)...............:
MPIDI_iprobe_safe(246).........:
MPIDI_iprobe_unsafe(72)........:
MPIDIG_mpi_iprobe(48)..........:
MPIDI_Progress_test(105).......:
MPIDI_OFI_handle_cq_error(1127): OFI poll failed (ofi_events.c:1127:MPIDI_OFI_handle_cq_error:Transport endpoint is not connected)
Abort(672775823) on node 79 (rank 79 in comm 0): Fatal error in internal_Iprobe: Other MPI error, error stack:
internal_Iprobe(14309).........: MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, comm=0x84000006, flag=0x7ffce3ea9430, status=0x7ffce3ea9850) failed
MPID_Iprobe(389)...............:
MPIDI_Progress_test(105).......:
MPIDI_OFI_handle_cq_error(1127): OFI poll failed (ofi_events.c:1127:MPIDI_OFI_handle_cq_error:Transport endpoint is not connected)
```

Batch script (SLURM directives, build, and run):

```bash
#!/bin/bash
#SBATCH --nodes=20
#SBATCH --ntasks=150
#SBATCH --partition=prod
#SBATCH --exclusive
#SBATCH --job-name=MumS4
#SBATCH --time=8:00:00
#SBATCH -e mumps.%j.err
#SBATCH --output=MOUT.%j.out
#SBATCH --account=kunf0069
module purge
module load intel/2023.2-gcc-9.4
module load impi/2021.10.0
module load mumps/5.4.1

filename=MUVSC3thrust_S
mpiifort -O2 -xHost -nofor-main -DBLR_MT -qopenmp -c $filename.f90 -o $filename.o

mpiifort -o $filename -O2 -xHost -nofor-main -qopenmp $filename.o -lcmumps -lmumps_common -lmpi -lpord -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lpthread -lparmetis -lmetis -lptesmumps -lptscotch -lptscotcherr -lscotch

export OMP_NUM_THREADS=1
mpirun -np 150 ./${filename} | tee MUMPS.log
```
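[Editor's note: the "Transport endpoint is not connected" failures above are raised by the libfabric (OFI) layer underneath Intel MPI, not by MUMPS's numerics, so a natural first step is to rerun the large case with transport diagnostics enabled. A minimal sketch, assuming the same job script as above; `I_MPI_DEBUG`, `FI_LOG_LEVEL`, `FI_PROVIDER`, and the `fi_info` tool are standard Intel MPI/libfabric facilities, but the commented-out provider choice is only an example and should match what `fi_info` actually reports on the cluster.]

```bash
# Sketch: add before the mpirun line of the job script above (diagnostics, not a fix)

export I_MPI_DEBUG=10        # verbose Intel MPI startup info: fabric, provider, pinning
export FI_LOG_LEVEL=warn     # surface libfabric transport-level warnings and errors
# fi_info                    # uncomment to list the OFI providers available on a node
# export FI_PROVIDER=verbs   # assumption: pin a provider only if fi_info lists it

mpirun -np 150 ./${filename} 2>&1 | tee MUMPS_debug.log
```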
Re: Error with MUMPS when running a large model
Posted by TobiasK on 2024-06-25 13:59 GMT
https://community.intel.com/t5/Intel-MPI-Library/Error-with-MUMPS-when-running-a-large-model/m-p/1609657#M11764

@Guoqi_Ma
Sorry, with the information provided we can neither help you nor reproduce your issue.
Also, this error looks like an error in MUMPS; have you reached out to the MUMPS developers?

PS: 2023.2 is too old; please use 2024.2.
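[Editor's note: acting on that suggestion would mean loading a 2024-series toolchain in the batch script in place of the 2023.2 stack. A sketch of the module swap follows; the module names here are assumptions, since the real names depend on what `module avail` exposes on the cluster, and MUMPS may need to be rebuilt against the new stack.]

```bash
module purge
# Hypothetical module names; check `module avail intel impi` for the real ones
module load intel/2024.2     # oneAPI 2024.2 compilers
module load impi/2021.13     # Intel MPI version shipped with the 2024.2 HPC toolkit
module load mumps/5.4.1      # rebuild against the new compilers/MPI if required
```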