<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Bug in MKL 11.3.0.109 FFT in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Bug-in-MKL-11-3-0-109-FFT/m-p/1016105#M19491</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am pretty sure I found a bug in the DFTI (FFT) routine of MKL.&lt;/P&gt;

&lt;P&gt;When doing a 2D complex-to-real FFT, for some array sizes, the FFT routine writes precisely 2 real values too many. In those cases, the FFT result is incorrect.&lt;/P&gt;

&lt;P&gt;This happens both for single- en double-precision invocations.&lt;/P&gt;

&lt;P&gt;For single-precision invocations, this happens precisely when: num_rows &amp;gt;= 30 and num_cols &amp;gt;= 22 and num_cols % 16 == 14.&lt;/P&gt;

&lt;P&gt;For double-precision invocations, this happens precisely when: num_rows &amp;gt;= 17 and num_cols &amp;gt;= 22 and num_cols % 8 == 6.&lt;/P&gt;

&lt;P&gt;A program that demonstrates the issue is shown below.&lt;/P&gt;

&lt;P&gt;Kind regards, Sidney Cadot&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;//////////////////////////
// InvestigateMklBug.cc //
//////////////////////////

#include &amp;lt;iostream&amp;gt;
#include &amp;lt;complex&amp;gt;
#include &amp;lt;cassert&amp;gt;
#include &amp;lt;vector&amp;gt;

#include &amp;lt;mkl.h&amp;gt;

void testcase(unsigned  num_rows, unsigned num_cols)
{
    // We will do a complex-to-real 2D FFT.
    //
    // The real array is    (num_rows x nul_cols).
    // The complex_array is (num_rows x nul_cols_complex), where nul_cols_complex == num_cols / 2 + 1.
    //
    // It turns out that, for some values of num_rows/num_cols, the FFT writes beyond the last entry
    // of the 'real_array'.
    //
    // We investigate this by allocating 'real_array' with a few elements (EXTRA_ENTRIES) too many.
    //
    // Prior to the FFT, we initialize 'real_array' with a GUARD_VALUE.
    //
    // After the FFT, we check the number of GUARD_VALUEs still present.
    //
    // If this is less than the number of EXTRA_ENTRIES, elements were overwritten that shouldn't have.

    const unsigned num_cols_complex = num_cols / 2 + 1;

    // setup DFTI descriptor

    DFTI_DESCRIPTOR_HANDLE descriptor;

    const MKL_LONG dimensions[2] = {num_rows, num_cols};
    MKL_LONG status = DftiCreateDescriptor(&amp;amp;descriptor, DFTI_SINGLE, DFTI_REAL, 2, dimensions);
    assert(status == DFTI_NO_ERROR);

    status = DftiSetValue(descriptor, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
    assert(status == DFTI_NO_ERROR);

    // The manual recommends setting this to DFTI_COMPLEX_COMPLEX.
    status = DftiSetValue(descriptor, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX);
    assert(status == DFTI_NO_ERROR);

    const MKL_LONG input_strides[3] = {0, num_cols_complex, 1};
    status = DftiSetValue(descriptor, DFTI_INPUT_STRIDES, input_strides);
    assert(status == DFTI_NO_ERROR);

    const MKL_LONG output_strides[3] = {0, num_cols, 1};
    status = DftiSetValue(descriptor, DFTI_OUTPUT_STRIDES, output_strides);
    assert(status == DFTI_NO_ERROR);

    const MKL_LONG thread_limit = 1;
    status = DftiSetValue(descriptor, DFTI_THREAD_LIMIT, thread_limit);
    assert(status == DFTI_NO_ERROR);

    status = DftiCommitDescriptor(descriptor);
    assert(status == DFTI_NO_ERROR);

    // Do inverse-fft 

    const unsigned EXTRA_ENTRIES = 64;
    const float    GUARD_VALUE   = 999.0;

    std::vector&amp;lt;std::complex&amp;lt;float&amp;gt;&amp;gt; complex_array(num_rows * num_cols_complex);
    std::vector             &amp;lt;float &amp;gt; real_array   (num_rows * num_cols + EXTRA_ENTRIES, GUARD_VALUE);

    // Execute the FFT and free the descriptor.

    status = DftiComputeBackward(descriptor, complex_array.data(), real_array.data());
    assert(status == DFTI_NO_ERROR);

    status = DftiFreeDescriptor(&amp;amp;descriptor);
    assert(status == DFTI_NO_ERROR);

    // Investigate whether the FFT wrote more entries than expected.

    unsigned count_guards = 0;
    for (unsigned i = 0; i &amp;lt; real_array.size(); ++i)
    {
        if (real_array&lt;I&gt; == GUARD_VALUE)
        {
            ++count_guards;
        }
    }

    assert(count_guards &amp;lt;= EXTRA_ENTRIES);

    bool problem_detected = (count_guards != EXTRA_ENTRIES); // They should be the same.

    if (problem_detected)
    {
        std::cout &amp;lt;&amp;lt; "num_rows " &amp;lt;&amp;lt; num_rows &amp;lt;&amp;lt; " num_cols " &amp;lt;&amp;lt; num_cols &amp;lt;&amp;lt; " num_cols_complex " &amp;lt;&amp;lt; num_cols_complex &amp;lt;&amp;lt; " --&amp;gt; wrote " &amp;lt;&amp;lt; (EXTRA_ENTRIES - count_guards) &amp;lt;&amp;lt; " real_array entries too many." &amp;lt;&amp;lt; std::endl;
    }
}

int main()
{
    for (unsigned num_rows = 1; num_rows &amp;lt;= 200; ++num_rows)
    {
        for (unsigned num_cols = 1; num_cols &amp;lt;= 200; ++num_cols)
        {
            testcase(num_rows, num_cols);
        }
    }
    return 0;
}
&lt;/I&gt;&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 13 Oct 2015 10:52:12 GMT</pubDate>
    <dc:creator>Sidney_Cadot</dc:creator>
    <dc:date>2015-10-13T10:52:12Z</dc:date>
    <item>
      <title>Bug in MKL 11.3.0.109 FFT</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Bug-in-MKL-11-3-0-109-FFT/m-p/1016105#M19491</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am pretty sure I found a bug in the DFTI (FFT) routine of MKL.&lt;/P&gt;

&lt;P&gt;When doing a 2D complex-to-real FFT, for some array sizes, the FFT routine writes precisely 2 real values too many. In those cases, the FFT result is incorrect.&lt;/P&gt;

&lt;P&gt;This happens both for single- en double-precision invocations.&lt;/P&gt;

&lt;P&gt;For single-precision invocations, this happens precisely when: num_rows &amp;gt;= 30 and num_cols &amp;gt;= 22 and num_cols % 16 == 14.&lt;/P&gt;

&lt;P&gt;For double-precision invocations, this happens precisely when: num_rows &amp;gt;= 17 and num_cols &amp;gt;= 22 and num_cols % 8 == 6.&lt;/P&gt;

&lt;P&gt;A program that demonstrates the issue is shown below.&lt;/P&gt;

&lt;P&gt;Kind regards, Sidney Cadot&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;//////////////////////////
// InvestigateMklBug.cc //
//////////////////////////

#include &amp;lt;iostream&amp;gt;
#include &amp;lt;complex&amp;gt;
#include &amp;lt;cassert&amp;gt;
#include &amp;lt;vector&amp;gt;

#include &amp;lt;mkl.h&amp;gt;

void testcase(unsigned  num_rows, unsigned num_cols)
{
    // We will do a complex-to-real 2D FFT.
    //
    // The real array is    (num_rows x nul_cols).
    // The complex_array is (num_rows x nul_cols_complex), where nul_cols_complex == num_cols / 2 + 1.
    //
    // It turns out that, for some values of num_rows/num_cols, the FFT writes beyond the last entry
    // of the 'real_array'.
    //
    // We investigate this by allocating 'real_array' with a few elements (EXTRA_ENTRIES) too many.
    //
    // Prior to the FFT, we initialize 'real_array' with a GUARD_VALUE.
    //
    // After the FFT, we check the number of GUARD_VALUEs still present.
    //
    // If this is less than the number of EXTRA_ENTRIES, elements were overwritten that shouldn't have.

    const unsigned num_cols_complex = num_cols / 2 + 1;

    // setup DFTI descriptor

    DFTI_DESCRIPTOR_HANDLE descriptor;

    const MKL_LONG dimensions[2] = {num_rows, num_cols};
    MKL_LONG status = DftiCreateDescriptor(&amp;amp;descriptor, DFTI_SINGLE, DFTI_REAL, 2, dimensions);
    assert(status == DFTI_NO_ERROR);

    status = DftiSetValue(descriptor, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
    assert(status == DFTI_NO_ERROR);

    // The manual recommends setting this to DFTI_COMPLEX_COMPLEX.
    status = DftiSetValue(descriptor, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_COMPLEX);
    assert(status == DFTI_NO_ERROR);

    const MKL_LONG input_strides[3] = {0, num_cols_complex, 1};
    status = DftiSetValue(descriptor, DFTI_INPUT_STRIDES, input_strides);
    assert(status == DFTI_NO_ERROR);

    const MKL_LONG output_strides[3] = {0, num_cols, 1};
    status = DftiSetValue(descriptor, DFTI_OUTPUT_STRIDES, output_strides);
    assert(status == DFTI_NO_ERROR);

    const MKL_LONG thread_limit = 1;
    status = DftiSetValue(descriptor, DFTI_THREAD_LIMIT, thread_limit);
    assert(status == DFTI_NO_ERROR);

    status = DftiCommitDescriptor(descriptor);
    assert(status == DFTI_NO_ERROR);

    // Do inverse-fft 

    const unsigned EXTRA_ENTRIES = 64;
    const float    GUARD_VALUE   = 999.0;

    std::vector&amp;lt;std::complex&amp;lt;float&amp;gt;&amp;gt; complex_array(num_rows * num_cols_complex);
    std::vector             &amp;lt;float &amp;gt; real_array   (num_rows * num_cols + EXTRA_ENTRIES, GUARD_VALUE);

    // Execute the FFT and free the descriptor.

    status = DftiComputeBackward(descriptor, complex_array.data(), real_array.data());
    assert(status == DFTI_NO_ERROR);

    status = DftiFreeDescriptor(&amp;amp;descriptor);
    assert(status == DFTI_NO_ERROR);

    // Investigate whether the FFT wrote more entries than expected.

    unsigned count_guards = 0;
    for (unsigned i = 0; i &amp;lt; real_array.size(); ++i)
    {
        if (real_array&lt;I&gt; == GUARD_VALUE)
        {
            ++count_guards;
        }
    }

    assert(count_guards &amp;lt;= EXTRA_ENTRIES);

    bool problem_detected = (count_guards != EXTRA_ENTRIES); // They should be the same.

    if (problem_detected)
    {
        std::cout &amp;lt;&amp;lt; "num_rows " &amp;lt;&amp;lt; num_rows &amp;lt;&amp;lt; " num_cols " &amp;lt;&amp;lt; num_cols &amp;lt;&amp;lt; " num_cols_complex " &amp;lt;&amp;lt; num_cols_complex &amp;lt;&amp;lt; " --&amp;gt; wrote " &amp;lt;&amp;lt; (EXTRA_ENTRIES - count_guards) &amp;lt;&amp;lt; " real_array entries too many." &amp;lt;&amp;lt; std::endl;
    }
}

int main()
{
    for (unsigned num_rows = 1; num_rows &amp;lt;= 200; ++num_rows)
    {
        for (unsigned num_cols = 1; num_cols &amp;lt;= 200; ++num_cols)
        {
            testcase(num_rows, num_cols);
        }
    }
    return 0;
}
&lt;/I&gt;&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Oct 2015 10:52:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Bug-in-MKL-11-3-0-109-FFT/m-p/1016105#M19491</guid>
      <dc:creator>Sidney_Cadot</dc:creator>
      <dc:date>2015-10-13T10:52:12Z</dc:date>
    </item>
    <item>
      <title>Hi Sidney,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Bug-in-MKL-11-3-0-109-FFT/m-p/1016106#M19492</link>
      <description>&lt;P&gt;Hi Sidney,&lt;/P&gt;

&lt;P&gt;This is a known issue in MKL 11.3.0.109.&lt;/P&gt;

&lt;P&gt;The following knowledge base article explains how to obtain Intel MKL package with this and other issues fixed.&lt;/P&gt;

&lt;P&gt;&lt;SPAN lang="EN-US" style="color: rgb(31, 73, 125); font-family: &amp;quot;Calibri&amp;quot;,&amp;quot;sans-serif&amp;quot;; font-size: 11pt; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-ansi-language: EN-US; mso-fareast-language: RU; mso-bidi-language: AR-SA;"&gt;&lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/595045"&gt;&lt;U&gt;&lt;FONT color="#0563c1"&gt;&lt;/FONT&gt;&lt;/U&gt;&lt;/A&gt;&lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/595045" target="_blank"&gt;https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/595045&lt;/A&gt; &lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN lang="EN-US" style="color: rgb(31, 73, 125); font-family: &amp;quot;Calibri&amp;quot;,&amp;quot;sans-serif&amp;quot;; font-size: 11pt; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-ansi-language: EN-US; mso-fareast-language: RU; mso-bidi-language: AR-SA;"&gt;Evgueni&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Oct 2015 10:59:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Bug-in-MKL-11-3-0-109-FFT/m-p/1016106#M19492</guid>
      <dc:creator>Evgueni_P_Intel</dc:creator>
      <dc:date>2015-10-13T10:59:06Z</dc:date>
    </item>
    <item>
      <title>Hi Evgueni</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Bug-in-MKL-11-3-0-109-FFT/m-p/1016107#M19493</link>
      <description>&lt;P&gt;Hi Evgueni&lt;/P&gt;

&lt;P&gt;Well that's pretty much my diagnosis right there.&lt;/P&gt;

&lt;P&gt;Thanks for the pointer, I will ask for the hot-fix.&lt;/P&gt;

&lt;P&gt;Regards, Sidney&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Oct 2015 11:14:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Bug-in-MKL-11-3-0-109-FFT/m-p/1016107#M19493</guid>
      <dc:creator>Sidney_Cadot</dc:creator>
      <dc:date>2015-10-13T11:14:13Z</dc:date>
    </item>
  </channel>
</rss>

