<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using OpenMP and DGEMM in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879977#M9420</link>
    <description>I guess you would set OMP_NESTED, or terminate your PARALLEL region before DGEMM, and link the mkl_thread library if you intend DGEMM to start its own threads. If the DGEMM invocations are independent and take about equal time, you could put them in separate OMP SECTIONs (a usage I haven't seen). If you don't need DGEMM to be inside your single parallel region, I doubt you would lose anything by using 2 separate parallel regions: DGEMM could then use the team of threads which persists from your first parallel loop, and your 2nd parallel region would take back the same thread team.&lt;BR /&gt;You could learn more, and help us give advice, if you would link the OpenMP profiling library and show the profiling result.</description>
    <pubDate>Wed, 02 Sep 2009 17:57:34 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2009-09-02T17:57:34Z</dc:date>
    <item>
      <title>Using OpenMP and DGEMM</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879976#M9419</link>
      <description>Hello,&lt;BR /&gt;&lt;BR /&gt;I am currently trying to parallelize a time-dependent Fortran code that basically consists of several loops and DGEMM calls, e.g.:&lt;BR /&gt;&lt;BR /&gt;DO time=1,endtime&lt;BR /&gt;&lt;BR /&gt;DO i=1,end&lt;BR /&gt; (calculations)&lt;BR /&gt;END DO&lt;BR /&gt;&lt;BR /&gt;CALL DGEMM ( )&lt;BR /&gt;CALL DGEMM ( )&lt;BR /&gt;&lt;BR /&gt;DO i=1,end&lt;BR /&gt; (calculations)&lt;BR /&gt;END DO&lt;BR /&gt;&lt;BR /&gt;END DO&lt;BR /&gt;&lt;BR /&gt;Can someone offer advice on how to parallelize this code so that it makes the most of the parallelization already built into the matrix multiply routine (DGEMM)? Essentially, I would like to do something like this:&lt;BR /&gt;&lt;BR /&gt;DO time=1,endtime&lt;BR /&gt;&lt;BR /&gt;!$OMP PARALLEL&lt;BR /&gt;&lt;BR /&gt;!$OMP DO&lt;BR /&gt;DO i=1,end&lt;BR /&gt; (calculations)&lt;BR /&gt;END DO&lt;BR /&gt;!$OMP END DO&lt;BR /&gt;&lt;BR /&gt;CALL DGEMM ( )&lt;BR /&gt;CALL DGEMM ( )&lt;BR /&gt;&lt;BR /&gt;!$OMP DO&lt;BR /&gt;DO i=1,end&lt;BR /&gt; (calculations)&lt;BR /&gt;END DO&lt;BR /&gt;!$OMP END DO&lt;BR /&gt;&lt;BR /&gt;!$OMP END PARALLEL&lt;BR /&gt;&lt;BR /&gt;END DO&lt;BR /&gt;&lt;BR /&gt;However, I am not sure what to do, in terms of OpenMP directives, with the section of code that contains the DGEMM calls. Should I just have one thread execute this section, or is there a better way to exploit the parallelism of the DGEMM routines within OpenMP? Does anyone have advice on this?&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;&lt;BR /&gt;Mandrew</description>
      <pubDate>Wed, 02 Sep 2009 17:06:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879976#M9419</guid>
      <dc:creator>mandrew</dc:creator>
      <dc:date>2009-09-02T17:06:53Z</dc:date>
    </item>
    <item>
      <title>Re: Using OpenMP and DGEMM</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879977#M9420</link>
      <description>I guess you would set OMP_NESTED, or terminate your PARALLEL region before DGEMM, and link the mkl_thread library if you intend DGEMM to start its own threads. If the DGEMM invocations are independent and take about equal time, you could put them in separate OMP SECTIONs (a usage I haven't seen). If you don't need DGEMM to be inside your single parallel region, I doubt you would lose anything by using 2 separate parallel regions: DGEMM could then use the team of threads which persists from your first parallel loop, and your 2nd parallel region would take back the same thread team.&lt;BR /&gt;You could learn more, and help us give advice, if you would link the OpenMP profiling library and show the profiling result.</description>
      <pubDate>Wed, 02 Sep 2009 17:57:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879977#M9420</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-09-02T17:57:34Z</dc:date>
    </item>
    <item>
      <title>Re: Using OpenMP and DGEMM</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879978#M9421</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/367365"&gt;tim18&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;I guess you would set OMP_NESTED, or terminate your PARALLEL region before DGEMM, and link the mkl_thread library if you intend DGEMM to start its own threads. If the DGEMM invocations are independent and take about equal time, you could put them in separate OMP SECTIONs (a usage I haven't seen). If you don't need DGEMM to be inside your single parallel region, I doubt you would lose anything by using 2 separate parallel regions: DGEMM could then use the team of threads which persists from your first parallel loop, and your 2nd parallel region would take back the same thread team.&lt;BR /&gt;You could learn more, and help us give advice, if you would link the OpenMP profiling library and show the profiling result.&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Thanks for the advice. I am not familiar with the OpenMP profiling library. Is it discussed in the Intel compiler documentation?</description>
      <pubDate>Wed, 02 Sep 2009 18:53:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879978#M9421</guid>
      <dc:creator>mandrew</dc:creator>
      <dc:date>2009-09-02T18:53:24Z</dc:date>
    </item>
    <item>
      <title>Re: Using OpenMP and DGEMM</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879979#M9422</link>
      <description>You should find the openmp-profile link option mentioned in the Intel compiler docs. Beyond that, I don't find the documentation adequate.&lt;BR /&gt;On Linux, if you linked dynamically (the default .so link) against the OpenMP library, you can use LD_PRELOAD to substitute the profiling library without re-linking.&lt;BR /&gt;If you simply run normally with the profiling library, it writes a file named guide.gvs, which you can read with a text viewer or plot in VTune on Windows. It shows performance statistics for each parallel region in code compiled by the Intel compiler or in MKL.&lt;BR /&gt;All this is said to be subject to change next year, which may explain the sketchy documentation.</description>
      <pubDate>Wed, 02 Sep 2009 22:20:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Using-OpenMP-and-DGEMM/m-p/879979#M9422</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-09-02T22:20:44Z</dc:date>
    </item>
  </channel>
</rss>

