Intel® oneAPI Math Kernel Library

Using DFTI with larger numbers of processors

Justin_D_1
I've written an MPI code that uses the DFTI interface to compute FFTs. It's a domain-decomposition problem
in which each processor solves its own group of FFTs. Everything works fine for NP=1,2,32,64,128 but fails
when NP=256 with an error that looks like:

DFTI_MKL_INTERNAL_ERROR

The code I'm using is the same regardless of the number of processors (the FFT function itself is just called less often).
The call that fails is DftiCommitDescriptor, and it fails the first time it is called:

type(DFTI_DESCRIPTOR), POINTER :: DFTI_HANDLE
...
STATUS = DftiCreateDescriptor( DFTI_HANDLE, DFTI_DOUBLE, DFTI_COMPLEX, 1, 192)
STATUS = DftiCommitDescriptor( DFTI_HANDLE )

I've tried both static and dynamic linking; neither helps. I'm using the sequential (num threads = 1) version.

Dynamic
-i_dynamic -lmkl_core -lmkl_sequential -lmkl_intel_lp64

Static
#$MKLPATH/libmkl_solver_lp64_sequential.a -Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_sequential.a $MKLPATH/libmkl_core.a -Wl,--end-group -i_dynamic

Also, things mostly work o.k. for a smaller number of FFT points, e.g. 32, but it doesn't work for 192 or 256.

I've compiled with "-check all" and nothing is found...so I think the code is ok.

Does this problem sound familiar to anyone?

thx

jrd
Gennady_F_Intel

Davis, what MKL and MPI versions are you using?
--Gennady
Justin_D_1

ifort Intel Fortran Compiler for applications running on Intel 64, Version 10.1 Build 20080312 Package ID: l_fc_p_10.1.015

MKL 10.0.2.018

MPI mvapich_intel10-0.9.9 currently, but also tried openmpi_intel-1.2.7

Also, I stripped out everything in my program so that all it does is commit and then free the descriptor. This does work.
So it sort of looks like a stack size limit problem...within the shell my stack is unlimited...but perhaps there is
some stack environment variable that needs to be set...I tried setting KMP_STACKSIZE to a large value, per a previous post I saw:

KMP_STACKSIZE=10000000000
export KMP_STACKSIZE

but that did not help either.
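
One way to test the stack-limit theory is to print the limit from inside each launched process rather than from the login shell, since limits set interactively are not always inherited by processes the MPI launcher starts on other nodes. A minimal sketch, assuming a hypothetical helper named report_limits (it is not part of MKL or of any MPI distribution) saved as a script that the launcher can run in place of the real executable:

```shell
#!/bin/sh
# report_limits: print the host name and the soft stack limit that this
# process actually sees. Hypothetical helper, not part of MKL or any MPI.
report_limits() {
    host=$(uname -n)
    stack=$(ulimit -s)   # soft stack limit in KiB, or "unlimited"
    echo "$host stack=$stack"
}

report_limits
```

If some ranks report a small value while the login shell reports unlimited, the limit is being reset somewhere between the shell and the launched processes.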


Vladimir_Petrov__Int
Davis,

My question may seem strange to you but...
Are you sure all your MPI processes are actually run on their respective nodes?
To see the nodes on which you are actually running you may replace the name of your executable file with "uname -n".

Best regards,
-Vladimir
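
The output of such a "uname -n" run can be summarized per node. A minimal sketch, assuming the launcher prints one host name per line (one line per process); the function name ranks_per_host and the host names in the example are made up for illustration:

```shell
#!/bin/sh
# ranks_per_host: read one host name per line on stdin and print
# "<count> <host>" for each distinct host, busiest hosts first.
ranks_per_host() {
    sort | uniq -c | sort -k1,1nr -k2,2 | awk '{ print $1, $2 }'
}

# Example with made-up host names (node01 appears twice).
printf 'node02\nnode01\nnode01\nnode03\n' | ranks_per_host
```

In practice the launcher's output would be piped into ranks_per_host in place of the printf line; a host with more processes than it has cores, or a host missing from the list, would show up immediately.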
Justin_D_1
I am already requesting that MPI provide the machine name, so I can check this fairly easily. For the NP=256 run, I am using
120 unique physical machines (4 cores per machine). Of those 120:

Cores used per machine   Machines
1                        38
2                        28
3                        54
Is that what you were looking for?

Dmitry_B_Intel

Davis,

If you don't link with libiomp5 then perhaps setting KMP_STACKSIZE has no effect.

The version of MKL that you use has two memory leak problems in DFTI that are fixed in later releases. These could, hypothetically, cause DftiCommitDescriptor to return DFTI_MKL_INTERNAL_ERROR in a long run or when memory is tight. The leak can only accumulate if DftiCreate/Commit/Compute/Free is called in a loop. If the descriptor is created only a few times, this is unlikely to be the cause.


Thanks
Dima

Justin_D_1
The program is crashing on its first call to the Intel MKL libraries...so there is no loop to accumulate memory.

OK, I'll try upgrading MKL.
Gennady_F_Intel
Davis, please let us know whether the problem persists with the new version.
--Gennady