<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Lin, in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Why-different-thread-num-makes-no-different-in-performance/m-p/1179090#M29186</link>
    <description>&lt;P&gt;Lin,&lt;/P&gt;&lt;P&gt;MKL parallelizes internally using OpenMP. If you are also using a threading library of your own, you need to turn off MKL threading. See this article:&amp;nbsp;https://software.intel.com/en-us/articles/using-threaded-intel-mkl-in-multi-thread-application&lt;/P&gt;&lt;P&gt;If that does not help, please tell me which link and compile lines you are using.&lt;/P&gt;&lt;P&gt;By the way, Intel MKL does not yet support Visual Studio 2019, though I would not expect that to cause this kind of performance issue.&lt;/P&gt;&lt;P&gt;Pamela&lt;/P&gt;</description>
    <pubDate>Fri, 18 Oct 2019 18:59:51 GMT</pubDate>
    <dc:creator>Pamela_H_Intel</dc:creator>
    <dc:date>2019-10-18T18:59:51Z</dc:date>
    <item>
      <title>Why does a different thread count make no difference in performance?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Why-different-thread-num-makes-no-different-in-performance/m-p/1179089#M29185</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;I'm testing MKL using Visual Studio 2019 and MKL v2019.5 on an Intel i7-9750H CPU with 6 cores and 12 threads. I'm interested in the time consumed by the vector mathematics and FFT functions in MKL. As I understand it, for these two categories of functions the time consumed should decrease as the maximum number of threads increases. But that did not happen for the vector mathematics functions. I have tested the vcMul and vcAdd functions. The time consumed is almost the same whether the thread count is set to 1 or 6. It's weird to me and I can't figure out the reason. Can anyone help me with this? The code is attached below. Thanks very much!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;////////////////////////////////&lt;/P&gt;&lt;P&gt;#include &amp;lt;stdio.h&amp;gt;&lt;BR /&gt;#include "mkl.h"&lt;/P&gt;&lt;P&gt;int N = 16384;&lt;BR /&gt;int M = 2000;&lt;/P&gt;&lt;P&gt;//#define FFTTEST&lt;BR /&gt;#define CMULTEST&lt;BR /&gt;int main(void)&lt;BR /&gt;{&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;/* CPU clock frequency in GHz, used to convert clock ticks to time */&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;double clkfreq = mkl_get_clocks_frequency();&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;unsigned MKL_INT64 startclk, endclk;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;double time;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;double time2[16384];&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;int kk = 0;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;/* Execution status */&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_LONG status = 0;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;DFTI_DESCRIPTOR_HANDLE hand = 0;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;//mkl_set_dynamic(0);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;//mkl_set_num_threads(1);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;int threadnum = mkl_get_max_threads();&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf("Number of threads set: %d\n", threadnum);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf("FFT points: %d, FFT count: %d\n", N, M);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;/* Pointers to input/output data */&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_Complex8* x = 0;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_Complex8* y = 0;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;x = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;y = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_Complex8* x2 = 0;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_Complex8* y2 = 0;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;x2 = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;y2 = (MKL_Complex8*)mkl_malloc(N * M * sizeof(MKL_Complex8), 64);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;/* Bail out if any of the four allocations failed */&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;if (x == NULL || y == NULL || x2 == NULL || y2 == NULL) goto failed;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;/* init2() is defined elsewhere (not shown here) */&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;init2(x, x2);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;/* VML_EP selects the enhanced-performance accuracy mode */&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;vmlSetMode(VML_EP);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mkl_get_cpu_clocks(&amp;amp;startclk);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;for (kk = 0; kk &amp;lt; M; kk++)&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;{&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;vcAdd(N, &amp;amp;x[N * kk], &amp;amp;x2[N * kk], &amp;amp;y[N * kk]);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;}&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mkl_get_cpu_clocks(&amp;amp;endclk);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;/* Average time per call, in microseconds */&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;time = (double)(endclk - startclk) / (clkfreq * 1e9) * 1e6 / M;&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf("Complex multiply: %f us\n", time);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mkl_free(x);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mkl_free(y);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mkl_free(x2);&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mkl_free(y2);&lt;/P&gt;&lt;P&gt;failed:&lt;BR /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;return 0;&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Oct 2019 10:02:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Why-different-thread-num-makes-no-different-in-performance/m-p/1179089#M29185</guid>
      <dc:creator>Yan__Lin</dc:creator>
      <dc:date>2019-10-14T10:02:11Z</dc:date>
    </item>
    <item>
      <title>Lin,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Why-different-thread-num-makes-no-different-in-performance/m-p/1179090#M29186</link>
      <description>&lt;P&gt;Lin,&lt;/P&gt;&lt;P&gt;MKL parallelizes internally using OpenMP. If you are also using a threading library of your own, you need to turn off MKL threading. See this article:&amp;nbsp;https://software.intel.com/en-us/articles/using-threaded-intel-mkl-in-multi-thread-application&lt;/P&gt;&lt;P&gt;If that does not help, please tell me which link and compile lines you are using.&lt;/P&gt;&lt;P&gt;By the way, Intel MKL does not yet support Visual Studio 2019, though I would not expect that to cause this kind of performance issue.&lt;/P&gt;&lt;P&gt;Pamela&lt;/P&gt;</description>
      <pubDate>Fri, 18 Oct 2019 18:59:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Why-different-thread-num-makes-no-different-in-performance/m-p/1179090#M29186</guid>
      <dc:creator>Pamela_H_Intel</dc:creator>
      <dc:date>2019-10-18T18:59:51Z</dc:date>
    </item>
  </channel>
</rss>

