Intel® oneAPI Math Kernel Library

2D FFT with leading dimension divisible by 2048

teeter
Beginner
The MKL user guide says that for best performance, 2D arrays whose leading dimension is divisible by 2048 should be avoided. Could someone please clarify the nature of this restriction, particularly for FFT?

For example, I have a 2D array that is 1500x1500 pixels. To use a radix-2 FFT implementation, the typical approach is to pad the array up to 2048x2048 and then run the FFT. But it seems that this is inefficient for MKL. So what would be the most efficient way to perform FFT on such an array?

Thanks.




Dmitry_B_Intel
Employee

Hi,

The problem originates in the set-associative L2 cache. A leading dimension that is a multiple of the row size of the cache incurs many interference misses (that is, data competing for the same associativity set).
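To make the interference effect concrete, here is a minimal sketch that counts how many distinct cache sets are touched when walking down one column of a row-major array. The cache geometry (64-byte lines, 512 sets) and the 16-byte complex element size are illustrative assumptions, not values taken from any particular CPU or from the MKL documentation, and `cache_sets_touched` is a hypothetical helper. Real caches index on physical addresses and may hash them, so treat this only as a simplified model.

```python
def cache_sets_touched(leading_dim, rows, elem_bytes=16, line_bytes=64, num_sets=512):
    """Count distinct cache sets hit when walking one column of a row-major array.

    All cache parameters are illustrative assumptions, not a model of a real CPU.
    """
    stride = leading_dim * elem_bytes        # bytes between vertically adjacent elements
    sets = set()
    for i in range(rows):
        line = (i * stride) // line_bytes    # index of the cache line holding element i
        sets.add(line % num_sets)            # set chosen by the low bits of the line index
    return len(sets)


if __name__ == "__main__":
    for ld in (1500, 2048):
        print(ld, cache_sets_touched(ld, rows=1500))
```

Under these assumed parameters, a leading dimension of 2048 sends every element of the column to the same set, while 1500 spreads the column over all 512 sets.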

Computing a 1500x1500 array with leading dimension 1500 should be close to optimal for MKL. The best padding can be found empirically with a micro-benchmark, which should not be difficult to write. It might turn out that padding to, say, 1634 is a few percent better.
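One possible shape for such a micro-benchmark is sketched below. It only contains the search loop: `measure(n, ld)` is a hypothetical callable that you supply, which runs the n-wide 2D FFT on data stored with leading dimension `ld` and returns the elapsed seconds. The helper `timed` and the defaults for `max_pad`, `step`, and `repeats` are assumptions chosen for illustration.

```python
import time


def timed(run_fft):
    """Wrap a callable run_fft(n, ld) so that it returns elapsed seconds."""
    def measure(n, ld):
        start = time.perf_counter()
        run_fft(n, ld)
        return time.perf_counter() - start
    return measure


def best_leading_dimension(n, measure, max_pad=256, step=2, repeats=5):
    """Return (ld, seconds) for the fastest leading dimension in [n, n + max_pad].

    measure(n, ld) is user supplied; the minimum over `repeats` runs is kept
    to reduce timing noise.
    """
    best = None
    for ld in range(n, n + max_pad + 1, step):
        elapsed = min(measure(n, ld) for _ in range(repeats))
        if best is None or elapsed < best[1]:
            best = (ld, elapsed)
    return best
```

A call such as `best_leading_dimension(1500, timed(my_fft))`, where `my_fft` is your own wrapper around the FFT call, would then report the fastest candidate on the machine at hand. Because the result depends on the hardware, it is worth rerunning on each target system.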

Thanks,
Dima

teeter
Beginner
Hi Dima, thanks for the reply.

It is not immediately obvious why 1634 would be better than 1500. The user guide also suggests that for best performance, leading dimension values should be divisible by 16. Neither 1500 nor 1634 satisfies this. Was 1634 determined empirically?

My application needs to run FFT on an arbitrary M x N array. How do I go about arranging my data to get the best performance?

Thanks.


Dmitry_B_Intel
Employee

Hi,

I mentioned this strange number only as an example. In my test it happened to give *slightly better* performance than 1500. So yes, it is an empirical number.

Regarding data arrangement, the rule is to avoid a leading dimension that is a multiple of 2048. So if M or N (whichever is the leading dimension, depending on whether you use a row-major or column-major memory layout) is a multiple of that number, it should be better to pad the array in that dimension. Otherwise, it should be nearly optimal to add no padding.
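The rule above can be sketched as a small helper for arbitrary sizes. The function name `choose_leading_dimension` is hypothetical, and the pad of 16 elements is an assumption borrowed from the divisible-by-16 suggestion quoted earlier in the thread, not a value recommended in this reply. With MKL's DFTI interface, the resulting leading dimension would be passed through the stride settings (`DFTI_INPUT_STRIDES` / `DFTI_OUTPUT_STRIDES`).

```python
def choose_leading_dimension(n, avoid_multiple=2048, pad=16):
    """Return n unchanged unless it is a multiple of avoid_multiple.

    In that case return n + pad. The pad value is an illustrative assumption.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if pad <= 0 or pad % avoid_multiple == 0:
        raise ValueError("pad must be positive and not a multiple of avoid_multiple")
    if n % avoid_multiple != 0:
        return n            # already fine: no padding needed
    return n + pad          # step off the problematic multiple


if __name__ == "__main__":
    for n in (1500, 2048, 4096):
        print(n, choose_leading_dimension(n))
```

The padded elements are never read or written by the transform; they only change the distance in memory between consecutive rows (or columns).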

Thanks,
Dima
