<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hello,  in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037068#M20472</link>
    <description>&lt;P&gt;Hello,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Could you provide some details on this? How many threads are you using to run MKL functions? What is the hardware platform? Also, how is dgemm called there? Is it a single dgemm call, or is it called in a loop?&lt;BR /&gt;
	When I run some simple code here, I do not see this problem.&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Chao&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 17 Jun 2014 05:38:33 GMT</pubDate>
    <dc:creator>Chao_Y_Intel</dc:creator>
    <dc:date>2014-06-17T05:38:33Z</dc:date>
    <item>
      <title>libiomp5md.dll location (release build)</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037065#M20469</link>
      <description>&lt;P&gt;Hello, where would I find the dll without tracing code? Currently I use one from MKL 11.1.2 (64-bit, version 5.0.2013.1126, file size 1043kB, modified 2014-01-31 12:23) and the profiler shows this (judging by OpenMP source I found, I assume that __kmp_print_storage_map_gtid is some printfoid tracing function, which eats 69.31% of the time):&lt;/P&gt;

&lt;P&gt;Inclusive Samples % | Function Name&lt;BR /&gt;
	100.00 cs.exe&lt;BR /&gt;
	99.25 - RtlUserThreadStart&lt;BR /&gt;
	99.25 -- BaseThreadInitThunk&lt;BR /&gt;
	80.50 --- __kmp_launch_worker(void *)&lt;BR /&gt;
	80.50 ---- __kmp_launch_thread&lt;BR /&gt;
	69.32 ----- __kmp_fork_barrier(int,int)&lt;BR /&gt;
	69.31 ------ __kmp_print_storage_map_gtid&lt;BR /&gt;
	11.16 ----- __kmp_invoke_task_func&lt;BR /&gt;
	11.15 ------ __kmp_invoke_microtask&lt;BR /&gt;
	10.85 ------- mkl_blas_dgemm&lt;BR /&gt;
	0.19 ------- mkl_lapack_dlasr3&lt;BR /&gt;
	0.05 ------- etc...&lt;BR /&gt;
	18.76 __tmainCRTStartup&lt;BR /&gt;
	18.76 - AfxWinMain(struct HINSTANCE__ *,struct HINSTANCE__ *,char *,int)&lt;/P&gt;

&lt;P&gt;I did find:&lt;BR /&gt;
	C:\Program Files (x86)\Common Files\Intel\Shared Libraries\redist\intel64\compiler&lt;BR /&gt;
	C:\Program Files (x86)\Intel\Composer XE 2013 SP1\redist\intel64\compiler&lt;BR /&gt;
	but they are the wrong ones.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 16:18:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037065#M20469</guid>
      <dc:creator>Torsten_H_</dc:creator>
      <dc:date>2014-06-10T16:18:20Z</dc:date>
    </item>
    <item>
      <title>Hello,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037066#M20470</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;The libiomp5md.dll file in the compiler redistribution folder is the right one to use. If you are seeing a performance issue, could you post a sample code that reproduces it so we can check further?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Chao&lt;/P&gt;</description>
      <pubDate>Fri, 13 Jun 2014 05:51:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037066#M20470</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2014-06-13T05:51:45Z</dc:date>
    </item>
    <item>
      <title>(Sorry about the disappearing</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037067#M20471</link>
      <description>&lt;P&gt;(Sorry about the disappearing newlines)&lt;/P&gt;

&lt;P&gt;I've diffed the files and they are identical:&lt;/P&gt;

&lt;P&gt;C:\Windows\System32&amp;gt;fc "C:\Program Files (x86)\Intel\Composer XE 2013 SP1\redist\intel64\compiler\libiomp5md.dll" "C:\Program Files (x86)\Common Files\Intel\Shared Libraries\redist\intel64\compiler\libiomp5md.dll"&lt;BR /&gt;
	Comparing files C:\PROGRAM FILES (X86)\INTEL\COMPOSER XE 2013 SP1\REDIST\INTEL64\COMPILER\libiomp5md.dll and C:\PROGRAM FILES (X86)\COMMON FILES\INTEL\SHARED LIBRARIES\REDIST\INTEL64\COMPILER\LIBIOMP5MD.DLL&lt;BR /&gt;
	FC: no differences encountered&lt;/P&gt;

&lt;P&gt;As for the test case: profile some largeish dgemm calls (1000 x 1000 matrices) and that should produce something like the above.&lt;/P&gt;</description>
      <pubDate>Sat, 14 Jun 2014 20:39:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037067#M20471</guid>
      <dc:creator>Torsten_H_</dc:creator>
      <dc:date>2014-06-14T20:39:31Z</dc:date>
    </item>
    <item>
      <title>Hello, </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037068#M20472</link>
      <description>&lt;P&gt;Hello,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Could you provide some details on this? How many threads are you using to run MKL functions? What is the hardware platform? Also, how is dgemm called there? Is it a single dgemm call, or is it called in a loop?&lt;BR /&gt;
	When I run some simple code here, I do not see this problem.&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	Chao&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 17 Jun 2014 05:38:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037068#M20472</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2014-06-17T05:38:33Z</dc:date>
    </item>
    <item>
      <title>Hello Chao,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037069#M20473</link>
      <description>&lt;P&gt;Hello Chao,&lt;/P&gt;

&lt;P&gt;I misread the test case before - here's what it really does (and it's 8 threads on an i7):&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;for (int k = 0; k &amp;lt; 5000; ++k)&lt;BR /&gt;
		{&lt;BR /&gt;
		&amp;nbsp;&amp;nbsp;&amp;nbsp; v = some 999-element row vector;&lt;BR /&gt;
		&amp;nbsp;&amp;nbsp;&amp;nbsp; compute v' * v (via dgemm, result is a 999 x 999 matrix)&lt;BR /&gt;
		&amp;nbsp;&amp;nbsp;&amp;nbsp; int g = f(k); // g = 1,2 or 3&lt;BR /&gt;
		&amp;nbsp;&amp;nbsp;&amp;nbsp; add the result to some matrix M[g];&lt;/P&gt;

	&lt;P&gt;}&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;I'll rewrite this to make the dgemm calls nontrivial, which should make the threading overhead disappear. However, I still think it's a problem that __kmp_print_storage_map_gtid appears at all.&lt;/P&gt;
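
&lt;P&gt;One way such a rewrite could look, as a minimal sketch only: since the sum of the products v' * v over all rows v in a group equals V' * V for the matrix V that stacks those rows, the many tiny products can be replaced by one large product per group. Plain Python stands in for the MKL calls so the sketch is self-contained; the function f, the sizes, and all names below are assumptions for illustration and are not taken from the actual test case.&lt;/P&gt;

&lt;PRE&gt;
```python
import random

def f(k):
    # Hypothetical stand-in for the thread's f(k); returns 1, 2 or 3.
    return k % 3 + 1

def accumulate_rank1(rows, n):
    # Original pattern: one tiny product v' * v per iteration, added to M[g].
    M = {g: [[0.0] * n for _ in range(n)] for g in (1, 2, 3)}
    for k, v in enumerate(rows):
        target = M[f(k)]
        for i in range(n):
            for j in range(n):
                target[i][j] += v[i] * v[j]
    return M

def accumulate_batched(rows, n):
    # Rewritten pattern: stack the rows of each group into V_g, then form
    # V_g' * V_g once per group (one large dgemm call per group in MKL terms).
    groups = {g: [] for g in (1, 2, 3)}
    for k, v in enumerate(rows):
        groups[f(k)].append(v)
    M = {}
    for g, V in groups.items():
        M[g] = [[sum(row[i] * row[j] for row in V) for j in range(n)]
                for i in range(n)]
    return M

def max_abs_diff(A, B):
    # Largest elementwise difference between two sets of group matrices.
    return max(abs(a - b) for g in A
               for ra, rb in zip(A[g], B[g]) for a, b in zip(ra, rb))

# Small sizes for illustration; the thread uses 999 elements and 5000 rows.
random.seed(0)
n = 5
rows = [[random.random() for _ in range(n)] for _ in range(40)]
diff = max_abs_diff(accumulate_rank1(rows, n), accumulate_batched(rows, n))
print("max abs difference:", diff)
```
&lt;/PRE&gt;

&lt;P&gt;Both functions should yield the same three matrices up to rounding. The batched form issues three large products instead of one small product per iteration, which should give a threaded BLAS more work per call relative to the fork/join overhead.&lt;/P&gt;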

&lt;P&gt;Thanks,&lt;BR /&gt;
	Torsten&lt;/P&gt;</description>
      <pubDate>Wed, 18 Jun 2014 07:44:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037069#M20473</guid>
      <dc:creator>Torsten_H_</dc:creator>
      <dc:date>2014-06-18T07:44:38Z</dc:date>
    </item>
    <item>
      <title>Now the test case went from</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037070#M20474</link>
      <description>&lt;P&gt;Now the test case went from 40 seconds down to 2 seconds (which is nice :-) but the profiler still shows 46.54% of the time being spent in __kmp_print_storage_map_gtid. With a different call stack though (it's doing some eigenvalue stuff now).&lt;/P&gt;

&lt;P&gt;Point being: A build of libiomp5md without tracing would still be nice.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Jun 2014 08:31:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/libiomp5md-dll-location-release-build/m-p/1037070#M20474</guid>
      <dc:creator>Torsten_H_</dc:creator>
      <dc:date>2014-06-18T08:31:08Z</dc:date>
    </item>
  </channel>
</rss>

