<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Possible Issue with xORMLQ function in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Possible-Issue-with-xORMLQ-function/m-p/1137767#M26149</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;When I use xORMLQ as part of an LQ solution of an underdetermined system, the function appears to access parts of the right-hand-side matrix that I do not think it should. I've attached a small test case (double precision, real) that exhibits the behavior.&lt;/P&gt;&lt;P&gt;The test case solves the system A*x = b, where A is underdetermined (30 by 36 in the example). After the LQ factorization, which seems correct, the solution is computed as x = Q^T * inv(L) * b, where b has length 30 and x has length 36. In the test case, x and b share the same array of total length 36. The step inv(L)*b is performed with a call to TRSM, and the result (of length 30) is correct. Q^T is then applied with a call to ORMLQ. Here the input vector (called 'C' in the function) has length 30 and the output has length 36. However, the ORMLQ result actually depends on the input values of C(31:36). If you do not pre-zero these values, the result is incorrect; if you do, the result is correct. For the non-blocked code path, the issue arises in a GEMV called by DLARF (which is called by ORMLQ): this GEMV multiply includes the extra values of the C array. In lines 100-101 of the test code, you can toggle between zeroing the extra entries and filling them with garbage values.&lt;/P&gt;&lt;P&gt;Nothing in the documentation indicates that these unused values must be zeroed on input to ORMLQ. Requiring the user to pre-zero them seems like undesirable behavior to me. I feel that LAPACK should zero them if necessary in preparation for the GEMV calls (which might actually need to access them as C is computed). Otherwise, the documentation needs to state that the user is required to pre-zero these extra array entries.&lt;/P&gt;&lt;P&gt;I ran the test case on Windows 7 with MKL 2019.0.2, compiled with 'ifort /Qmkl lqbug.f90'. Note that this issue also exists in the stock version of LAPACK available from netlib.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;John&lt;/P&gt;</description>
    <pubDate>Wed, 01 May 2019 16:16:02 GMT</pubDate>
    <dc:creator>John_Young</dc:creator>
    <dc:date>2019-05-01T16:16:02Z</dc:date>
    <item>
      <title>Possible Issue with xORMLQ function</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Possible-Issue-with-xORMLQ-function/m-p/1137767#M26149</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;When I use xORMLQ as part of an LQ solution of an underdetermined system, the function appears to access parts of the right-hand-side matrix that I do not think it should. I've attached a small test case (double precision, real) that exhibits the behavior.&lt;/P&gt;&lt;P&gt;The test case solves the system A*x = b, where A is underdetermined (30 by 36 in the example). After the LQ factorization, which seems correct, the solution is computed as x = Q^T * inv(L) * b, where b has length 30 and x has length 36. In the test case, x and b share the same array of total length 36. The step inv(L)*b is performed with a call to TRSM, and the result (of length 30) is correct. Q^T is then applied with a call to ORMLQ. Here the input vector (called 'C' in the function) has length 30 and the output has length 36. However, the ORMLQ result actually depends on the input values of C(31:36). If you do not pre-zero these values, the result is incorrect; if you do, the result is correct. For the non-blocked code path, the issue arises in a GEMV called by DLARF (which is called by ORMLQ): this GEMV multiply includes the extra values of the C array. In lines 100-101 of the test code, you can toggle between zeroing the extra entries and filling them with garbage values.&lt;/P&gt;&lt;P&gt;Nothing in the documentation indicates that these unused values must be zeroed on input to ORMLQ. Requiring the user to pre-zero them seems like undesirable behavior to me. I feel that LAPACK should zero them if necessary in preparation for the GEMV calls (which might actually need to access them as C is computed). Otherwise, the documentation needs to state that the user is required to pre-zero these extra array entries.&lt;/P&gt;&lt;P&gt;I ran the test case on Windows 7 with MKL 2019.0.2, compiled with 'ifort /Qmkl lqbug.f90'. Note that this issue also exists in the stock version of LAPACK available from netlib.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;John&lt;/P&gt;</description>
      <pubDate>Wed, 01 May 2019 16:16:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Possible-Issue-with-xORMLQ-function/m-p/1137767#M26149</guid>
      <dc:creator>John_Young</dc:creator>
      <dc:date>2019-05-01T16:16:02Z</dc:date>
    </item>
  </channel>
</rss>

