<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: MKL library produces different results depending on OpenMP thread count in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560025#M35691</link>
    <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/299466"&gt;@jethro_tull&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;oneMKL uses OpenMP for threading, and changing the number of threads changes how the work is partitioned and therefore the order of floating-point operations, so the results can differ in the low-order bits.&amp;nbsp;This is a well-known challenge in scientific computing. The oneMKL reference guide covers this topic and describes ways to control it (links below). Note that ensuring reproducible results may reduce performance:&lt;/P&gt;&lt;H2&gt;Reproducibility Conditions (&lt;A href="https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/current/reproducibility-conditions.html" target="_self"&gt;link&lt;/A&gt;)&lt;/H2&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV class=""&gt;&lt;P&gt;To get reproducible results from run to run, ensure that the number of threads is fixed and constant. Specifically:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;If you are running your program with OpenMP* parallelization on different processors, explicitly specify the number of threads.&lt;/LI&gt;&lt;LI&gt;To ensure that your application has deterministic behavior with OpenMP* parallelization and does not adjust the number of threads dynamically at run time, set&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;MKL_DYNAMIC&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;and&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;OMP_DYNAMIC&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;to FALSE. 
This is especially needed if you are running your program on different systems.&lt;/LI&gt;&lt;LI&gt;If you are running your program with the Intel® Threading Building Blocks parallelization, numerical reproducibility is not guaranteed.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;More information is available in the oneMKL Developer Reference guide:&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;A href="https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/current/conditional-numerical-reproducibility-control.html" target="_self"&gt;https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/current/conditional-numerical-reproducibility-control.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 04 Jan 2024 23:07:06 GMT</pubDate>
    <dc:creator>George_Silva_Intel</dc:creator>
    <dc:date>2024-01-04T23:07:06Z</dc:date>
    <item>
      <title>MKL library produces different results depending on OpenMP thread count</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1559629#M35689</link>
      <description>&lt;P&gt;On two different platforms, I have seen the MKL library produce different results depending on OpenMP thread count. The particular routine in question is the LAPACK CGETRS() routine. I am using Intel oneAPI 2021.4.0 with the accompanying MKL revision. The two platforms where I saw this problem were:&lt;BR /&gt;Intel E5-2699v4 Broadwell, SUSE Linux&lt;BR /&gt;Intel Xeon E5-4627, RHEL 7.9&lt;BR /&gt;Specifically, I input a matrix with dimensions of a few dozen rows and columns into CGETRS() with absolutely no explicit OpenMP in my source code. When I set OMP_NUM_THREADS to 1, I get a particular result. When I run again with OMP_NUM_THREADS set to 2 (or 3 or ...), I get another result. The results differ by a small amount (1e-6 or less), but still, shouldn't they be bitwise identical? I want to emphasize that there is no OpenMP in my actual source code.&lt;BR /&gt;Has anyone seen this before? Thanks.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Jan 2024 21:26:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1559629#M35689</guid>
      <dc:creator>jethro_tull</dc:creator>
      <dc:date>2024-01-03T21:26:49Z</dc:date>
    </item>
    <item>
      <title>Re: MKL library produces different results depending on OpenMP thread count</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560025#M35691</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/299466"&gt;@jethro_tull&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;oneMKL uses OpenMP for threading, and changing the number of threads changes how the work is partitioned and therefore the order of floating-point operations, so the results can differ in the low-order bits.&amp;nbsp;This is a well-known challenge in scientific computing. The oneMKL reference guide covers this topic and describes ways to control it (links below). Note that ensuring reproducible results may reduce performance:&lt;/P&gt;&lt;H2&gt;Reproducibility Conditions (&lt;A href="https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/current/reproducibility-conditions.html" target="_self"&gt;link&lt;/A&gt;)&lt;/H2&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV class=""&gt;&lt;P&gt;To get reproducible results from run to run, ensure that the number of threads is fixed and constant. Specifically:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;If you are running your program with OpenMP* parallelization on different processors, explicitly specify the number of threads.&lt;/LI&gt;&lt;LI&gt;To ensure that your application has deterministic behavior with OpenMP* parallelization and does not adjust the number of threads dynamically at run time, set&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;MKL_DYNAMIC&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;and&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class=""&gt;OMP_DYNAMIC&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;to FALSE. 
This is especially needed if you are running your program on different systems.&lt;/LI&gt;&lt;LI&gt;If you are running your program with the Intel® Threading Building Blocks parallelization, numerical reproducibility is not guaranteed.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;More information is available in the oneMKL Developer Reference guide:&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;A href="https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/current/conditional-numerical-reproducibility-control.html" target="_self"&gt;https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/current/conditional-numerical-reproducibility-control.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Jan 2024 23:07:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560025#M35691</guid>
      <dc:creator>George_Silva_Intel</dc:creator>
      <dc:date>2024-01-04T23:07:06Z</dc:date>
    </item>
    <item>
      <title>Re: MKL library produces different results depending on OpenMP thread count</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560428#M35692</link>
      <description>&lt;P&gt;OK.&amp;nbsp; Understood.&amp;nbsp; Thanks.&amp;nbsp; I'm a little surprised that cgetrs() is a routine whose results depend on thread count, but OK.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have tried to puzzle through the pages on CNR mode.&amp;nbsp; If I understand correctly, in C I could invoke strict CNR with the call&lt;/P&gt;&lt;P&gt;mkl_cbwr_set(MKL_CBWR_STRICT);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is that correct?&amp;nbsp; If so, though, what I really need is something in Fortran.&amp;nbsp; Is there such a call in Fortran?&amp;nbsp; Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 05 Jan 2024 23:55:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560428#M35692</guid>
      <dc:creator>jethro_tull</dc:creator>
      <dc:date>2024-01-05T23:55:01Z</dc:date>
    </item>
    <item>
      <title>Re: MKL library produces different results depending on OpenMP thread count</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560432#M35693</link>
      <description>&lt;P&gt;Thank you for reaching out!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the Developer Reference Guide for Fortran on the CNR topic:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-fortran/2024-0/conditional-numerical-reproducibility-control.html" target="_blank"&gt;https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-fortran/2024-0/conditional-numerical-reproducibility-control.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Best Regards,&lt;/P&gt;&lt;P&gt;__&lt;/P&gt;&lt;P&gt;George Silva&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 06 Jan 2024 01:51:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560432#M35693</guid>
      <dc:creator>George_Silva_Intel</dc:creator>
      <dc:date>2024-01-06T01:51:53Z</dc:date>
    </item>
    <item>
      <title>Re: MKL library produces different results depending on OpenMP thread count</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560854#M35704</link>
      <description>&lt;P&gt;George,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for your help.&amp;nbsp; After looking through the site, I have tried adding the following to my code:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;integer :: mkl_status, cbwr_branch
cbwr_branch = mkl_cbwr_get_auto_branch()
mkl_status = mkl_cbwr_set(cbwr_branch)&lt;/LI-CODE&gt;&lt;P&gt;This code seemed to run without issue, but it made no difference to the output.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried adding a fourth line afterwards:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;mkl_status = mkl_cbwr_set(MKL_CBWR_STRICT)&lt;/LI-CODE&gt;&lt;P&gt;After this call, mkl_status was set to MKL_CBWR_ERR_INVALID_INPUT.&amp;nbsp; And the output was still the same.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any ideas what I did wrong?&amp;nbsp; Thanks again!&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jan 2024 14:50:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/MKL-library-produces-different-results-depending-on-OpenMP/m-p/1560854#M35704</guid>
      <dc:creator>jethro_tull</dc:creator>
      <dc:date>2024-01-08T14:50:16Z</dc:date>
    </item>
  </channel>
</rss>

