<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Deadlock Problem when using the Cluster FFT in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156356#M27604</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have run a massive simulation using MPI on distributed memory supercomputers&lt;BR /&gt;(FUJITSU Server PRIMERGY CX2550 M4 × 880)&lt;BR /&gt;and compiled with the intel/2018.2.046 Fortran compiler.&lt;/P&gt;&lt;P&gt;I have deadlock problems when using the Cluster FFT and the Available Auxiliary Functions&lt;BR /&gt;(MKL_CDFT_ScatterData and MKL_CDFT_GatherData) and the performance of the simulation is too slow.&lt;/P&gt;&lt;P&gt;The simulation is for solving Navier–Stokes equations and&lt;BR /&gt;3D(X, Y, and Z) arrays necessary to solve the equations.&lt;BR /&gt;Since in the simulation boundary conditions of the Y and Z directions are periodic,&lt;BR /&gt;I applied 2D Cluster FFT in the two directions and iterated the calculation along the other direction X as below.&lt;/P&gt;&lt;P&gt;==============================================&lt;BR /&gt;STATUS = DftiCreateDescriptorDM(MKL_COMM,DESC,DFTI_DOUBLE,DFTI_COMPLEX,2,LENGTHS)&lt;/P&gt;&lt;P&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_SIZE,SIZE)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_NX,NXX)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_X_START,START_X)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_OUT_NX,NX_OUT)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_OUT_X_START,START_X_OUT)&lt;BR /&gt;ALLOCATE(LOCAL(SIZE), WORK(SIZE), STAT=STATUS)&lt;BR /&gt;STATUS = DftiSetValueDM(DESC,DFTI_PLACEMENT,DFTI_NOT_INPLACE)&lt;/P&gt;&lt;P&gt;DO I = 1, Nx-1&lt;/P&gt;&lt;P&gt;&amp;nbsp;ALLOCATE(X_IN(M,N))&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DO K = 1, N&lt;BR /&gt;&amp;nbsp;&amp;nbsp;DO J = 1, M&lt;BR /&gt;&amp;nbsp;&amp;nbsp;X_IN(J,K) = DCMPLX(A(I,J,K),0d0)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;END DO&lt;/P&gt;&lt;P&gt;&amp;nbsp;STATUS = DftiCommitDescriptorDM(DESC)&lt;BR /&gt;&amp;nbsp;STATUS = MKL_CDFT_SCATTERDATA_D(COMM,ROOTRANK,ELEMENTSIZE,2,LENGTHS,X_IN,NXX,START_X,LOCAL)&amp;nbsp;&lt;BR /&gt;&amp;nbsp;STATUS = 
DftiComputeForwardDM(DESC,LOCAL,WORK)&lt;BR /&gt;&amp;nbsp;STATUS = MKL_CDFT_GATHERDATA_D(COMM,ROOTRANK,ELEMENTSIZE,2,LENGTHS,X_IN,NXX,START_X,WORK)&lt;/P&gt;&lt;P&gt;&amp;nbsp;DO K = 1, N&lt;BR /&gt;&amp;nbsp;&amp;nbsp;DO J = 1, M&lt;BR /&gt;&amp;nbsp;&amp;nbsp;T_F1(I,J,K) = X_IN(J,K)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DEALLOCATE(X_IN)&lt;/P&gt;&lt;P&gt;END DO&lt;/P&gt;&lt;P&gt;DEALLOCATE(LOCAL, WORK)&lt;/P&gt;&lt;P&gt;~~~~~~~~~~~~~~~~~~~&lt;BR /&gt;&amp;lt;SOME CALCULATIONS&amp;gt;&lt;BR /&gt;~~~~~~~~~~~~~~~~~~~&lt;/P&gt;&lt;P&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_SIZE,SIZE)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_NX,NX_OUT)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_X_START,START_X_OUT)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_OUT_NX,NXX)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_OUT_X_START,START_X)&lt;BR /&gt;ALLOCATE(LOCAL(SIZE), WORK(SIZE), STAT=STATUS)&lt;BR /&gt;SCALE = 1.0_8/(N*M)&lt;BR /&gt;STATUS = DftiSetValueDM(DESC,DFTI_BACKWARD_SCALE,SCALE)&lt;/P&gt;&lt;P&gt;DO I = 1, Nx-1&lt;/P&gt;&lt;P&gt;&amp;nbsp;ALLOCATE(X_IN(M,N))&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DO K = 1, N&lt;BR /&gt;&amp;nbsp;&amp;nbsp;DO J = 1, M&lt;BR /&gt;&amp;nbsp;&amp;nbsp;X_IN(J,K) = A(I,J,K)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;STATUS = DftiCommitDescriptorDM(DESC)&lt;BR /&gt;&amp;nbsp;STATUS = MKL_CDFT_SCATTERDATA_D(COMM,ROOTRANK,ELEMENTSIZE,2,LENGTHS,X_IN,NXX,START_X,WORK)&lt;BR /&gt;&amp;nbsp;STATUS = DftiComputeBackwardDM(DESC,WORK,LOCAL)&lt;BR /&gt;&amp;nbsp;STATUS = MKL_CDFT_GATHERDATA_D(COMM,ROOTRANK,ELEMENTSIZE,2,LENGTHS,X_IN,NXX,START_X,LOCAL)&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DO K = 1, N&lt;BR /&gt;&amp;nbsp;&amp;nbsp;DO J = 1, M&lt;BR /&gt;&amp;nbsp;&amp;nbsp;P(I,J,K) = REAL(X_IN(J,K))&lt;BR /&gt;&amp;nbsp;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;END DO&amp;nbsp;&lt;BR 
/&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DEALLOCATE(X_IN)&lt;/P&gt;&lt;P&gt;END DO&lt;/P&gt;&lt;P&gt;DEALLOCATE(LOCAL, WORK)&lt;/P&gt;&lt;P&gt;STATUS = DftiFreeDescriptorDM(DESC)&lt;BR /&gt;==============================================&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;I programed this simulation based on the 'cdft_example_support' and 'dm_complex_2d_double_ex2' provided by the Intel MKL.&lt;BR /&gt;After using -check_mpi, I've got the following errors when calculating the first Do loop.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;==============================================&lt;BR /&gt;[0] ERROR: no progress observed in any process for over 11:12 minutes, aborting application&lt;BR /&gt;[0] WARNING: starting premature shutdown&lt;/P&gt;&lt;P&gt;[0] ERROR: GLOBAL:DEADLOCK:HARD: fatal error&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; Application aborted because no progress was observed for over 11:12 minutes,&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; check for real deadlock (cycle of processes waiting for data) or&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; potential deadlock (processes sending data to each other and getting blocked&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; because the MPI might wait for the corresponding receive).&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; [0] no progress observed for over 11:12 minutes, process is currently in MPI call:&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; mpi_gather_(*sendbuf=0x762610, sendcount=2, sendtype=MPI_INTEGER, *recvbuf=0x2b9c9acc4b80, recvcount=2, recvtype=MPI_INTEGER, root=0, comm=MPI_COMM_WORLD, *ierr=0x7fffca56ca50)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; module_mpi_mp_mkl_cdft_scatterdata_d_ (/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; press_ffttdma_ (/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; rk3_uvwpc_ 
(/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; MAIN__ (/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; main (/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; __libc_start_main (/usr/lib64/libc-2.17.so)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (/home/~)&lt;BR /&gt;.&lt;BR /&gt;.&lt;BR /&gt;.&lt;BR /&gt;[0] INFO: GLOBAL:DEADLOCK:HARD: found 1 time (1 error + 0 warnings), 0 reports were suppressed&lt;BR /&gt;[0] INFO: Found 1 problem (1 error + 0 warnings), 0 reports were suppressed.&lt;BR /&gt;==============================================&lt;/P&gt;&lt;P&gt;I have been trying to solve the deadlock and the slow performance for several weeks, but I cannot fix them.&lt;BR /&gt;I would greatly appreciate any help or insight into these problems.&lt;/P&gt;&lt;P&gt;Best regards&lt;/P&gt;&lt;P&gt;YU,&lt;/P&gt;</description>
    <pubDate>Mon, 12 Nov 2018 10:34:46 GMT</pubDate>
    <dc:creator>YU__Jihong</dc:creator>
    <dc:date>2018-11-12T10:34:46Z</dc:date>
    <item>
      <title>Deadlock Problem when using the Cluster FFT</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156356#M27604</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have run a massive simulation using MPI on distributed memory supercomputers&lt;BR /&gt;(FUJITSU Server PRIMERGY CX2550 M4 × 880)&lt;BR /&gt;and compiled with the intel/2018.2.046 Fortran compiler.&lt;/P&gt;&lt;P&gt;I have deadlock problems when using the Cluster FFT and the Available Auxiliary Functions&lt;BR /&gt;(MKL_CDFT_ScatterData and MKL_CDFT_GatherData) and the performance of the simulation is too slow.&lt;/P&gt;&lt;P&gt;The simulation is for solving Navier–Stokes equations and&lt;BR /&gt;3D(X, Y, and Z) arrays necessary to solve the equations.&lt;BR /&gt;Since in the simulation boundary conditions of the Y and Z directions are periodic,&lt;BR /&gt;I applied 2D Cluster FFT in the two directions and iterated the calculation along the other direction X as below.&lt;/P&gt;&lt;P&gt;==============================================&lt;BR /&gt;STATUS = DftiCreateDescriptorDM(MKL_COMM,DESC,DFTI_DOUBLE,DFTI_COMPLEX,2,LENGTHS)&lt;/P&gt;&lt;P&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_SIZE,SIZE)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_NX,NXX)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_X_START,START_X)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_OUT_NX,NX_OUT)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_OUT_X_START,START_X_OUT)&lt;BR /&gt;ALLOCATE(LOCAL(SIZE), WORK(SIZE), STAT=STATUS)&lt;BR /&gt;STATUS = DftiSetValueDM(DESC,DFTI_PLACEMENT,DFTI_NOT_INPLACE)&lt;/P&gt;&lt;P&gt;DO I = 1, Nx-1&lt;/P&gt;&lt;P&gt;&amp;nbsp;ALLOCATE(X_IN(M,N))&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DO K = 1, N&lt;BR /&gt;&amp;nbsp;&amp;nbsp;DO J = 1, M&lt;BR /&gt;&amp;nbsp;&amp;nbsp;X_IN(J,K) = DCMPLX(A(I,J,K),0d0)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;END DO&lt;/P&gt;&lt;P&gt;&amp;nbsp;STATUS = DftiCommitDescriptorDM(DESC)&lt;BR /&gt;&amp;nbsp;STATUS = MKL_CDFT_SCATTERDATA_D(COMM,ROOTRANK,ELEMENTSIZE,2,LENGTHS,X_IN,NXX,START_X,LOCAL)&amp;nbsp;&lt;BR /&gt;&amp;nbsp;STATUS = 
DftiComputeForwardDM(DESC,LOCAL,WORK)&lt;BR /&gt;&amp;nbsp;STATUS = MKL_CDFT_GATHERDATA_D(COMM,ROOTRANK,ELEMENTSIZE,2,LENGTHS,X_IN,NXX,START_X,WORK)&lt;/P&gt;&lt;P&gt;&amp;nbsp;DO K = 1, N&lt;BR /&gt;&amp;nbsp;&amp;nbsp;DO J = 1, M&lt;BR /&gt;&amp;nbsp;&amp;nbsp;T_F1(I,J,K) = X_IN(J,K)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DEALLOCATE(X_IN)&lt;/P&gt;&lt;P&gt;END DO&lt;/P&gt;&lt;P&gt;DEALLOCATE(LOCAL, WORK)&lt;/P&gt;&lt;P&gt;~~~~~~~~~~~~~~~~~~~&lt;BR /&gt;&amp;lt;SOME CALCULATIONS&amp;gt;&lt;BR /&gt;~~~~~~~~~~~~~~~~~~~&lt;/P&gt;&lt;P&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_SIZE,SIZE)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_NX,NX_OUT)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_X_START,START_X_OUT)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_OUT_NX,NXX)&lt;BR /&gt;STATUS = DftiGetValueDM(DESC,CDFT_LOCAL_OUT_X_START,START_X)&lt;BR /&gt;ALLOCATE(LOCAL(SIZE), WORK(SIZE), STAT=STATUS)&lt;BR /&gt;SCALE = 1.0_8/(N*M)&lt;BR /&gt;STATUS = DftiSetValueDM(DESC,DFTI_BACKWARD_SCALE,SCALE)&lt;/P&gt;&lt;P&gt;DO I = 1, Nx-1&lt;/P&gt;&lt;P&gt;&amp;nbsp;ALLOCATE(X_IN(M,N))&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DO K = 1, N&lt;BR /&gt;&amp;nbsp;&amp;nbsp;DO J = 1, M&lt;BR /&gt;&amp;nbsp;&amp;nbsp;X_IN(J,K) = A(I,J,K)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;STATUS = DftiCommitDescriptorDM(DESC)&lt;BR /&gt;&amp;nbsp;STATUS = MKL_CDFT_SCATTERDATA_D(COMM,ROOTRANK,ELEMENTSIZE,2,LENGTHS,X_IN,NXX,START_X,WORK)&lt;BR /&gt;&amp;nbsp;STATUS = DftiComputeBackwardDM(DESC,WORK,LOCAL)&lt;BR /&gt;&amp;nbsp;STATUS = MKL_CDFT_GATHERDATA_D(COMM,ROOTRANK,ELEMENTSIZE,2,LENGTHS,X_IN,NXX,START_X,LOCAL)&lt;BR /&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DO K = 1, N&lt;BR /&gt;&amp;nbsp;&amp;nbsp;DO J = 1, M&lt;BR /&gt;&amp;nbsp;&amp;nbsp;P(I,J,K) = REAL(X_IN(J,K))&lt;BR /&gt;&amp;nbsp;&amp;nbsp;END DO&lt;BR /&gt;&amp;nbsp;END DO&amp;nbsp;&lt;BR 
/&gt;&amp;nbsp;&lt;BR /&gt;&amp;nbsp;DEALLOCATE(X_IN)&lt;/P&gt;&lt;P&gt;END DO&lt;/P&gt;&lt;P&gt;DEALLOCATE(LOCAL, WORK)&lt;/P&gt;&lt;P&gt;STATUS = DftiFreeDescriptorDM(DESC)&lt;BR /&gt;==============================================&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;I programed this simulation based on the 'cdft_example_support' and 'dm_complex_2d_double_ex2' provided by the Intel MKL.&lt;BR /&gt;After using -check_mpi, I've got the following errors when calculating the first Do loop.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;==============================================&lt;BR /&gt;[0] ERROR: no progress observed in any process for over 11:12 minutes, aborting application&lt;BR /&gt;[0] WARNING: starting premature shutdown&lt;/P&gt;&lt;P&gt;[0] ERROR: GLOBAL:DEADLOCK:HARD: fatal error&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; Application aborted because no progress was observed for over 11:12 minutes,&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; check for real deadlock (cycle of processes waiting for data) or&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; potential deadlock (processes sending data to each other and getting blocked&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; because the MPI might wait for the corresponding receive).&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp; [0] no progress observed for over 11:12 minutes, process is currently in MPI call:&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; mpi_gather_(*sendbuf=0x762610, sendcount=2, sendtype=MPI_INTEGER, *recvbuf=0x2b9c9acc4b80, recvcount=2, recvtype=MPI_INTEGER, root=0, comm=MPI_COMM_WORLD, *ierr=0x7fffca56ca50)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; module_mpi_mp_mkl_cdft_scatterdata_d_ (/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; press_ffttdma_ (/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; rk3_uvwpc_ 
(/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; MAIN__ (/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; main (/home/~)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; __libc_start_main (/usr/lib64/libc-2.17.so)&lt;BR /&gt;[0] ERROR:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (/home/~)&lt;BR /&gt;.&lt;BR /&gt;.&lt;BR /&gt;.&lt;BR /&gt;[0] INFO: GLOBAL:DEADLOCK:HARD: found 1 time (1 error + 0 warnings), 0 reports were suppressed&lt;BR /&gt;[0] INFO: Found 1 problem (1 error + 0 warnings), 0 reports were suppressed.&lt;BR /&gt;==============================================&lt;/P&gt;&lt;P&gt;I have been trying to solve the deadlock and the slow performance for several weeks, but I cannot fix them.&lt;BR /&gt;I would greatly appreciate any help or insight into these problems.&lt;/P&gt;&lt;P&gt;Best regards&lt;/P&gt;&lt;P&gt;YU,&lt;/P&gt;</description>
      <pubDate>Mon, 12 Nov 2018 10:34:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156356#M27604</guid>
      <dc:creator>YU__Jihong</dc:creator>
      <dc:date>2018-11-12T10:34:46Z</dc:date>
    </item>
    <item>
      <title>Jihong,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156357#M27605</link>
      <description>&lt;P&gt;Jihong,&lt;/P&gt;&lt;P&gt;I will look into this.&lt;/P&gt;&lt;P&gt;Pamela&lt;/P&gt;</description>
      <pubDate>Mon, 19 Nov 2018 16:33:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156357#M27605</guid>
      <dc:creator>Pamela_H_Intel</dc:creator>
      <dc:date>2018-11-19T16:33:51Z</dc:date>
    </item>
    <item>
      <title>Dear Pamela</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156358#M27606</link>
      <description>&lt;P&gt;Dear Pamela,&lt;/P&gt;&lt;P&gt;Thank you, I desperately need your help.&lt;BR /&gt;I have now found that if Nx (the upper bound in the loop DO I = 1, Nx-1) is less than 600, the deadlock does not occur.&lt;BR /&gt;But I need Nx to be greater than 900.&lt;BR /&gt;I have only just started learning MPI and using the library, and I cannot find a way to solve the problem.&lt;BR /&gt;I would greatly appreciate it if you could give me some suggestions.&lt;/P&gt;&lt;P&gt;Best regards&lt;/P&gt;&lt;P&gt;YU,&lt;/P&gt;</description>
      <pubDate>Tue, 20 Nov 2018 04:50:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156358#M27606</guid>
      <dc:creator>YU__Jihong</dc:creator>
      <dc:date>2018-11-20T04:50:30Z</dc:date>
    </item>
    <item>
      <title>Yu,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156359#M27607</link>
      <description>Yu,

I need you to scale down the code. For example, create only 2 MPI processes and see if the problem persists.

Also, how long was the job running? I see "No progress observed for 11:12 min". If you don't already, capture the start and end times. If you are using Linux, you can just wrap your call in the time command (for example: time ls; time sleep 2; time &amp;lt;YOUR execution call line&amp;gt;). This may help us discover whether there is a mistake in your code (it did nothing but wait) or whether things were working until a data value went bad.

I look forward to hearing what you find.

Pamela</description>
      <pubDate>Tue, 20 Nov 2018 23:27:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Deadlock-Problem-when-using-the-Cluster-FFT/m-p/1156359#M27607</guid>
      <dc:creator>Pamela_H_Intel</dc:creator>
      <dc:date>2018-11-20T23:27:11Z</dc:date>
    </item>
  </channel>
</rss>

