<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic In the usual sense, time in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Calling-LAPACK-MKL-from-parallel-OpenMP-region/m-p/1010272#M19210</link>
    <description>&lt;P&gt;In the usual sense, time stepping is inherently sequential and would sit outside parallel regions. Of course, we don't know whether that is a red herring, but it also looks like you are setting up each thread to evaluate the termination condition in a racy manner.&lt;/P&gt;</description>
    <pubDate>Mon, 20 Jul 2015 12:07:47 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2015-07-20T12:07:47Z</dc:date>
    <item>
      <title>Calling LAPACK/MKL from parallel OpenMP region</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Calling-LAPACK-MKL-from-parallel-OpenMP-region/m-p/1010270#M19208</link>
      <description>&lt;P&gt;Dear All,&lt;/P&gt;

&lt;P&gt;I often call BLAS and LAPACK (MKL) routines from my Fortran programs. Typically, I place these calls outside of any parallel OpenMP region so that I can make use of the parallelism inside the routines.&lt;/P&gt;

&lt;P&gt;In a new part of the code, however, I need to call DGESV from inside an "omp parallel" region (see the dummy code below). The code below crashes because every thread calls GESV individually. Putting the GESV call into an "omp single" section works but limits performance, as the solve then runs in one thread only.&lt;/P&gt;

&lt;P&gt;I am aware that I could "easily" end the parallel region, call LAPACK/MKL and start a new parallel region again. Still, this doesn't feel "right" to me, particularly as I pass this part of the code many, many times as I am solving a physical problem iteratively.&lt;/P&gt;

&lt;P&gt;Chances are that there is a smarter way of doing the OpenMP part anyway, but a very similar code using BLAS's GEMV instead of LAPACK's GESV works beautifully and scales perfectly. (Note that in that case I call GEMV from an "omp single" region.)&lt;/P&gt;

&lt;P&gt;Any comments are appreciated, including suggestions for other ways of structuring the OpenMP part of the code below and more general hints on how to call BLAS/LAPACK/MKL properly from OpenMP programs.&lt;/P&gt;

&lt;P&gt;Best,&lt;/P&gt;

&lt;P&gt;Pelle&lt;/P&gt;

&lt;P&gt;Simplified code:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;!$omp parallel
timestepping: do while (t &amp;lt; tmax .and. .not. error)
     !$omp do
     do i=1, imax
          ! do stuff, prepare system of equations
     end do
     !$omp end do
     
     ! Need to improve here
     call gesv(lhs, rhs, info=ierror)
     
     ! Other operations, check for convergence/errors etc
     !$omp barrier
end do timestepping
! Do some other stuff in parallel
!$omp end parallel&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 14 Jul 2015 18:30:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Calling-LAPACK-MKL-from-parallel-OpenMP-region/m-p/1010270#M19208</guid>
      <dc:creator>Pelle_R_</dc:creator>
      <dc:date>2015-07-14T18:30:02Z</dc:date>
    </item>
    <item>
      <title>Hi, </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Calling-LAPACK-MKL-from-parallel-OpenMP-region/m-p/1010271#M19209</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I recall a similar discussion, linked here for your reference:&lt;/P&gt;

&lt;P&gt;&lt;A href="https://software.intel.com/en-us/forums/topic/550681" target="_blank"&gt;https://software.intel.com/en-us/forums/topic/550681&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;You can control how many threads MKL uses for gesv inside an OpenMP parallel region by setting&lt;/P&gt;

&lt;P&gt;export KMP_AFFINITY="verbose,compact"&lt;BR /&gt;
	export OMP_ACTIVE_LEVELS="2"&lt;BR /&gt;
	export OMP_NESTED="true"&lt;BR /&gt;
	export MKL_DYNAMIC="false"&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;call mkl_set_num_threads(nt)
call gesv(lhs, rhs, info=ierror)&lt;/PRE&gt;
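
&lt;P&gt;A minimal sketch of how these settings might be combined with the time-stepping loop from the question. It assumes the MKL Fortran 95 interface module lapack95 is available and linked, sets the nesting and MKL options from code instead of the environment, and lets a single thread both call gesv and evaluate the termination condition. The program name, the problem size n, the thread count nt, the time step dt and the placeholder matrix are all hypothetical and only serve the illustration.&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program gesv_in_parallel_sketch
  use omp_lib
  use lapack95, only: gesv                 ! MKL Fortran 95 LAPACK interface (assumed linked)
  implicit none
  external :: mkl_set_num_threads, mkl_set_dynamic   ! MKL service routines
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 500            ! hypothetical system size
  integer, parameter :: nt = 4             ! hypothetical MKL thread count
  real(dp), parameter :: tmax = 1.0_dp, dt = 0.1_dp   ! hypothetical time range and step
  real(dp), allocatable :: lhs(:,:), rhs(:,:)
  real(dp) :: t
  integer :: i, ierror
  logical :: done

  allocate(lhs(n,n), rhs(n,1))
  call omp_set_max_active_levels(2)        ! same intent as OMP_ACTIVE_LEVELS="2"
  call mkl_set_dynamic(0)                  ! same intent as MKL_DYNAMIC="false"
  call mkl_set_num_threads(nt)

  t = 0.0_dp
  done = .false.
  !$omp parallel default(shared) private(i)
  timestepping: do while (.not. done)
     !$omp do
     do i = 1, n
        ! placeholder for "prepare system of equations":
        ! a diagonally dominant matrix and a unit right-hand side
        lhs(:,i) = 1.0_dp
        lhs(i,i) = real(n, dp) + 1.0_dp
        rhs(i,1) = 1.0_dp
     end do
     !$omp end do
     ! implicit barrier above: the system is fully assembled here

     !$omp single
     call gesv(lhs, rhs, info=ierror)      ! one caller; MKL may thread internally
     t = t + dt
     done = (t &amp;gt;= tmax) .or. (ierror /= 0)   ! evaluated by one thread only
     !$omp end single
     ! implicit barrier above: every thread now sees the same value of "done"
  end do timestepping
  !$omp end parallel
end program gesv_in_parallel_sketch&lt;/PRE&gt;

&lt;P&gt;Because only one thread updates t and done, and the implicit barrier at the end of the single construct makes the update visible to the whole team, the loop condition is no longer evaluated in a racy way. Whether MKL actually uses nt threads inside the single region depends on the OpenMP runtime and the MKL threading layer, and the waiting outer threads may compete with MKL's threads for cores, so it is worth timing this against simply ending the parallel region around the solve.&lt;/P&gt;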

&lt;P&gt;Please let us know the results of any experiments.&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jul 2015 03:34:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Calling-LAPACK-MKL-from-parallel-OpenMP-region/m-p/1010271#M19209</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2015-07-20T03:34:54Z</dc:date>
    </item>
    <item>
      <title>In the usual sense, time</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Calling-LAPACK-MKL-from-parallel-OpenMP-region/m-p/1010272#M19210</link>
      <description>&lt;P&gt;In the usual sense, time stepping is inherently sequential and would sit outside parallel regions. Of course, we don't know whether that is a red herring, but it also looks like you are setting up each thread to evaluate the termination condition in a racy manner.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jul 2015 12:07:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Calling-LAPACK-MKL-from-parallel-OpenMP-region/m-p/1010272#M19210</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2015-07-20T12:07:47Z</dc:date>
    </item>
  </channel>
</rss>

