<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic QR algorithm scalability problems in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/QR-algorithm-scalability-problems/m-p/1066788#M21960</link>
    <description>&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;Dear support team!&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;We’ve faced some problems with QR algorithm scalability, implemented using MKL functions &lt;STRONG&gt;LAPACKE_sgehrd&lt;/STRONG&gt; to reduce our matrix to Hessenberg form and &lt;STRONG&gt;LAPACKE_shseqr&lt;/STRONG&gt; to perform iterations of QR algorithm itself.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;Here is the code we launched on Xeon E5 v3 processor with 14 cores:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp;&lt;EM&gt; omp_set_num_threads(threads_count);&lt;/EM&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; cout &amp;lt;&amp;lt; "threads count: " &amp;lt;&amp;lt; omp_get_max_threads() &amp;lt;&amp;lt; endl;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; double t1 = omp_get_wtime();&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; LAPACKE_sgehrd(LAPACK_ROW_MAJOR, size, 1, size, A, size, tau);&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; double t2 = omp_get_wtime();&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; cout &amp;lt;&amp;lt; "LAPACKE_sgehrd time: " &amp;lt;&amp;lt; t2 - t1 &amp;lt;&amp;lt; " sec" &amp;lt;&amp;lt; endl;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; float *re = new float[size];&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; float *im = new float[size];&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; float *z;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; t1 = omp_get_wtime();&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; LAPACKE_shseqr(LAPACK_ROW_MAJOR, 'E', 'N', size, 1, size, A, size, re, im, z, size);&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; t2 = omp_get_wtime();&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; cout &amp;lt;&amp;lt; "LAPACKE_shseqr time: " &amp;lt;&amp;lt; t2 - t1 &amp;lt;&amp;lt; " sec" &amp;lt;&amp;lt; endl;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;The compiler we used is icc (ICC) 15.0.3 20150407. Here are the results of launches on 1, 2, 3, 4 and 14 cores:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 1&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 84.4017 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 30.4593 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 2&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 45.2026 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 27.8578 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 3&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 35.0818 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 25.2905 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 4&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 27.3022 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 28.1272 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 14&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 19.8118 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 27.1131 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;As it is clear, &lt;STRONG&gt;LAPACKE_sgehrd&lt;/STRONG&gt; has poor scalability, while &lt;STRONG&gt;LAPACKE_shseqr&lt;/STRONG&gt; has no scalability at all. The question is if there is any way we can improve the scalability of both this routines, or it its working as intended?&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;Sincerely,&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;Vladislav Shishvatov&lt;/SPAN&gt;&lt;/P&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;</description>
    <pubDate>Wed, 26 Oct 2016 20:46:21 GMT</pubDate>
    <dc:creator>Vladislav_S_</dc:creator>
    <dc:date>2016-10-26T20:46:21Z</dc:date>
    <item>
      <title>QR algorithm scalability problems</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/QR-algorithm-scalability-problems/m-p/1066788#M21960</link>
      <description>&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;Dear support team!&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;We’ve faced some problems with QR algorithm scalability, implemented using MKL functions &lt;STRONG&gt;LAPACKE_sgehrd&lt;/STRONG&gt; to reduce our matrix to Hessenberg form and &lt;STRONG&gt;LAPACKE_shseqr&lt;/STRONG&gt; to perform iterations of QR algorithm itself.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;Here is the code we launched on Xeon E5 v3 processor with 14 cores:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp;&lt;EM&gt; omp_set_num_threads(threads_count);&lt;/EM&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; cout &amp;lt;&amp;lt; "threads count: " &amp;lt;&amp;lt; omp_get_max_threads() &amp;lt;&amp;lt; endl;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; double t1 = omp_get_wtime();&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; LAPACKE_sgehrd(LAPACK_ROW_MAJOR, size, 1, size, A, size, tau);&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; double t2 = omp_get_wtime();&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; cout &amp;lt;&amp;lt; "LAPACKE_sgehrd time: " &amp;lt;&amp;lt; t2 - t1 &amp;lt;&amp;lt; " sec" &amp;lt;&amp;lt; endl;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; float *re = new float[size];&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; float *im = new float[size];&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; float *z;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; t1 = omp_get_wtime();&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; LAPACKE_shseqr(LAPACK_ROW_MAJOR, 'E', 'N', size, 1, size, A, size, re, im, z, size);&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; t2 = omp_get_wtime();&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;&amp;nbsp; &amp;nbsp; cout &amp;lt;&amp;lt; "LAPACKE_shseqr time: " &amp;lt;&amp;lt; t2 - t1 &amp;lt;&amp;lt; " sec" &amp;lt;&amp;lt; endl;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;The compiler we used is icc (ICC) 15.0.3 20150407. Here are the results of launches on 1, 2, 3, 4 and 14 cores:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: Helvetica; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 13px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 1&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 84.4017 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 30.4593 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 2&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 45.2026 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 27.8578 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 3&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 35.0818 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 25.2905 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 4&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 27.3022 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 28.1272 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;threads count: 14&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_sgehrd time: 19.8118 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;EM&gt;&lt;SPAN style="font-kerning: none"&gt;LAPACKE_shseqr time: 27.1131 sec&lt;/SPAN&gt;&lt;/EM&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial; min-height: 12px;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;As it is clear, &lt;STRONG&gt;LAPACKE_sgehrd&lt;/STRONG&gt; has poor scalability, while &lt;STRONG&gt;LAPACKE_shseqr&lt;/STRONG&gt; has no scalability at all. The question is if there is any way we can improve the scalability of both this routines, or it its working as intended?&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;Sincerely,&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-size: 11px; line-height: normal; font-family: 'Helvetica Neue'; color: rgb(0, 0, 0); -webkit-text-stroke-color: rgb(0, 0, 0); -webkit-text-stroke-width: initial;"&gt;&lt;SPAN style="font-kerning: none"&gt;Vladislav Shishvatov&lt;/SPAN&gt;&lt;/P&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;</description>
      <pubDate>Wed, 26 Oct 2016 20:46:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/QR-algorithm-scalability-problems/m-p/1066788#M21960</guid>
      <dc:creator>Vladislav_S_</dc:creator>
      <dc:date>2016-10-26T20:46:21Z</dc:date>
    </item>
    <item>
      <title>Hi Vladislav,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/QR-algorithm-scalability-problems/m-p/1066789#M21961</link>
      <description>&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;Hi&amp;nbsp;Vladislav,&lt;/P&gt;

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;I think something may be wrong in your code. The input general matrix A is reduced to&amp;nbsp;upper Hessenberg form in place, so A is overwritten. That means that from the second run onward you are no longer processing the same matrix as in the first run, and timings taken on different input data are not comparable. I wrote a test case that fixes this problem; you could try it.&lt;/P&gt;
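
&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;The copy-before-each-run pattern described above can be sketched as follows. This is only an illustration in Python, not the test case itself: overwrite_in_place is a hypothetical stand-in that merely mimics a routine overwriting its input (it is not a real Hessenberg reduction), and time_on_fresh_copy is an assumed helper name.&lt;/P&gt;

&lt;PRE&gt;
```python
import copy
import time


def overwrite_in_place(a):
    """Hypothetical stand-in for a routine that overwrites its input
    (as ?gehrd does with A): zero everything below the first subdiagonal."""
    for i, row in enumerate(a):
        for j in range(i - 1):
            row[j] = 0.0


def time_on_fresh_copy(routine, original, repeats=3):
    """Run routine on a fresh copy of original each time; return seconds per run."""
    times = []
    for _ in range(repeats):
        work = copy.deepcopy(original)  # copy is made outside the timed region
        start = time.perf_counter()
        routine(work)                   # only the copy is overwritten
        times.append(time.perf_counter() - start)
    return times


n = 200
original = [[float(i * n + j + 1) for j in range(n)] for i in range(n)]
snapshot = copy.deepcopy(original)
times = time_on_fresh_copy(overwrite_in_place, original)
print("per-run seconds:", times)
print("input unchanged:", original == snapshot)
```
&lt;/PRE&gt;

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;Because every run starts from the same data, timings taken at different thread counts can be compared directly. In C the same idea would be to restore A from a saved copy before each timed call.&lt;/P&gt;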

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;The structure and size of the input matrix both affect performance. It is also quite common for 16 threads to perform no better than 10: for some computations, peak performance is not reached at the maximum number of threads. Normally multiple threads are faster than a single thread, and in some cases 4 threads are faster than 2. But above 6 or 8 threads, unless the amount of computation is very large, performance may drop. Distributing work among threads has its own cost, and if the computation costs less than thread scheduling &amp;amp; initialization, you probably need to reduce the thread count &amp;amp; choose a number of threads that suits your code.&lt;/P&gt;

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;I used icc 2017 &amp;amp; mkl 2017 on a Xeon(R) CPU E5-2680 running&amp;nbsp;Ubuntu Linux&amp;nbsp;with 8 cpus (16 cores), and a general matrix A (2000*2000) with random values:&lt;/P&gt;

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;Requesting Intel(R) MKL to use 1 thread(s) --millisecond&lt;BR /&gt;
	&lt;SPAN style="font-size: 13.008px;"&gt;The 1 thread(s) performance of ?gehrd is: 1.19741&lt;/SPAN&gt;&lt;BR /&gt;
	&lt;SPAN style="font-size: 13.008px;"&gt;The 1 thread(s) performance of ?hesqr is: 0.01532&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;Requesting Intel(R) MKL to use 2 thread(s)&lt;BR /&gt;
	&lt;SPAN style="font-size: 13.008px;"&gt;The 2 thread(s) performance of ?gehrd is: 0.71143&lt;/SPAN&gt;&lt;BR /&gt;
	&lt;SPAN style="font-size: 13.008px;"&gt;The 2 thread(s) performance of ?hesqr is: 0.00992&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;Requesting Intel(R) MKL to use 4 thread(s)&lt;BR /&gt;
	&lt;SPAN style="font-size: 13.008px;"&gt;The 4 thread(s) performance of ?gehrd is: 0.53160&lt;/SPAN&gt;&lt;BR /&gt;
	&lt;SPAN style="font-size: 13.008px;"&gt;The 4 thread(s) performance of ?hesqr is: 0.00657&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;Requesting Intel(R) MKL to use 10 thread(s) (peak of performance)&lt;BR /&gt;
	&lt;SPAN style="font-size: 13.008px;"&gt;The 10 thread(s) performance of ?gehrd is: 0.34154&lt;/SPAN&gt;&lt;BR /&gt;
	&lt;SPAN style="font-size: 13.008px;"&gt;The 10 thread(s) performance of ?hesqr is: 0.00393&lt;/SPAN&gt;&lt;/P&gt;
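
&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;To make the scaling in these numbers easier to read, here is a small sketch that derives speedup and parallel efficiency from the timings reported above. The function names scaling and report are illustrative, and the ratios do not depend on the time unit.&lt;/P&gt;

&lt;PRE&gt;
```python
# Timings copied from the runs reported above (2000*2000 random matrix).
# Keys are MKL thread counts; values are the reported times.
GEHRD = {1: 1.19741, 2: 0.71143, 4: 0.53160, 10: 0.34154}
HSEQR = {1: 0.01532, 2: 0.00992, 4: 0.00657, 10: 0.00393}


def scaling(timings):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run."""
    base = timings[1]
    return {n: (base / t, base / t / n) for n, t in sorted(timings.items())}


def report(name, timings):
    """Print one line per thread count with speedup and efficiency."""
    for n, (speedup, eff) in scaling(timings).items():
        print(f"{name} {n:2d} threads: speedup {speedup:.2f}, efficiency {eff:.0%}")


report("?gehrd", GEHRD)
report("?hseqr", HSEQR)
```
&lt;/PRE&gt;

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;Efficiency here is speedup divided by thread count, so a value well below 100% means the added threads are mostly paying for overhead rather than doing useful work.&lt;/P&gt;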

&lt;P style="word-wrap: break-word; margin-right: 10px; font-size: 13.008px;"&gt;With more than 10 threads, performance decreases.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2016 08:07:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/QR-algorithm-scalability-problems/m-p/1066789#M21961</guid>
      <dc:creator>Zhen_Z_Intel</dc:creator>
      <dc:date>2016-10-27T08:07:03Z</dc:date>
    </item>
  </channel>
</rss>

