<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi Vineet, in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917262#M12701</link>
    <description>Hi Vineet,

&amp;gt;&amp;gt;...The answers received by using Intel Xeon Processors and Intel Composer XE 2011 Machine are more accurate [ &lt;STRONG&gt;SK:&lt;/STRONG&gt; On Linux ]
&amp;gt;&amp;gt;than what is obtained by using the same code on windows machine...

Please post command lines for both cases ( sorry, I don't want to make any suggestions before I see all used options ). Next, I'll be able to verify calculations &lt;STRONG&gt;only&lt;/STRONG&gt; on Windows 7 Professional with Intel Parallel Studio XE 2013 Update 2.

Also, would you be able to execute a couple of simple C/C++ tests ( I'll provide portable C/C++ codes ) to verify precision control functionality on both systems?</description>
    <pubDate>Tue, 02 Apr 2013 00:15:00 GMT</pubDate>
    <dc:creator>SergeyKostrov</dc:creator>
    <dc:date>2013-04-02T00:15:00Z</dc:date>
    <item>
      <title>matrix inversion precision issues on two different processors</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917261#M12700</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;
&lt;P&gt;I am having serious precision issues using Intel MKL LAPACK for matrix inversion:&lt;/P&gt;
&lt;P&gt;Steps:&lt;/P&gt;
&lt;P&gt;(1) I inverted a matrix using Matlab/Octave&lt;/P&gt;
&lt;P&gt;(2) I used dgetrf and dgetri to invert the same matrix on two processors: (a) an Intel(R) Core(TM) i7-2600 CPU 3.40GHz for the test code, with 16GB of RAM, on a Windows machine using Intel Parallel Composer XE 2013, and (b) an Intel(R) Xeon(R) CPU X5660 2.80GHz on a Linux machine using Composer XE 2011&lt;/P&gt;
&lt;P&gt;(3) The problem is that the inverse obtained with dgetrf and dgetri differs from the one obtained with Matlab/Octave. That there are differences &lt;STRONG&gt;is not an issue&lt;/STRONG&gt;, but the fact that the differences depend on the processor is creating problems in large simulations. The answers received by using Intel Xeon Processors and Intel Composer XE 2011 Machine are more accurate than what is obtained by using the same code on windows machine.&lt;/P&gt;
&lt;P&gt;At this point I think I am overlooking something, i.e. making a big mistake. Any advice on solving this issue would be greatly appreciated. I have attached sample code to highlight the issue, but I was not able to upload the input binary files to the forum (the upload was taking a very long time).&lt;/P&gt;
&lt;P&gt;Many thanks&lt;/P&gt;
&lt;P&gt;Vineet&lt;/P&gt;</description>
      <pubDate>Mon, 01 Apr 2013 20:36:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917261#M12700</guid>
      <dc:creator>Vineet_Y_</dc:creator>
      <dc:date>2013-04-01T20:36:23Z</dc:date>
    </item>
    <item>
      <title>Hi Vineet,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917262#M12701</link>
      <description>Hi Vineet,

&amp;gt;&amp;gt;...The answers received by using Intel Xeon Processors and Intel Composer XE 2011 Machine are more accurate [ &lt;STRONG&gt;SK:&lt;/STRONG&gt; On Linux ]
&amp;gt;&amp;gt;than what is obtained by using the same code on windows machine...

Please post command lines for both cases ( sorry, I don't want to make any suggestions before I see all used options ). Next, I'll be able to verify calculations &lt;STRONG&gt;only&lt;/STRONG&gt; on Windows 7 Professional with Intel Parallel Studio XE 2013 Update 2.

Also, would you be able to execute a couple of simple C/C++ tests ( I'll provide portable C/C++ codes ) to verify precision control functionality on both systems?</description>
      <pubDate>Tue, 02 Apr 2013 00:15:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917262#M12701</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-04-02T00:15:00Z</dc:date>
    </item>
    <item>
      <title>Here are the command lines</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917263#M12702</link>
      <description>&lt;P&gt;Here are the command lines you requested. Please send me the C/C++ files and I will run them to verify precision control.&lt;/P&gt;
&lt;P&gt;For Linux (Intel Xeon processor):&lt;/P&gt;
&lt;P&gt;ifort source1.f90 -heap-arrays -openmp -L /share/apps/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64/ -I /share/apps/intel/composer_xe_2011_sp1.7.256/mkl/include/ -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -o source1.exe&lt;/P&gt;
&lt;P&gt;For Windows:&lt;/P&gt;
&lt;P&gt;Compiling with Intel(R) Visual Fortran Compiler XE 13.1.0.149 [Intel(R) 64]...&lt;/P&gt;
&lt;P&gt;ifort /nologo /debug:full /O2 /I"C:\Program Files (x86)\Intel\Composer XE 2013\mkl\include" /warn:interfaces /module:"x64\Debug\\" /object:"x64\Debug\\" /Fd"x64\Debug\vc100.pdb" /traceback /check:none /libs:static /threads /dbglibs /Qmkl:parallel /c -heap-arrays /Qvc10 /Qlocation,link,"C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\\bin\amd64" "C:\Users\Vineet Work\Documents\Visual Studio 2010\Projects\source1.f90\source1.f90\Source1.f90"&lt;/P&gt;
&lt;P&gt;Linking...&lt;/P&gt;
&lt;P&gt;Link /OUT:"x64\Debug\source1.f90.exe" /INCREMENTAL:NO /NOLOGO /LIBPATH:"C:\Program Files (x86)\Intel\Composer XE 2013\mkl\lib\intel64" /MANIFEST /MANIFESTFILE:"C:\Users\Vineet Work\Documents\Visual Studio 2010\Projects\source1.f90\source1.f90\x64\Debug\source1.f90.exe.intermediate.manifest" /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /DEBUG /PDB:"C:\Users\Vineet Work\Documents\Visual Studio 2010\Projects\source1.f90\source1.f90\x64\Debug\source1.f90.pdb" /SUBSYSTEM:CONSOLE /IMPLIB:"C:\Users\Vineet Work\Documents\Visual Studio 2010\Projects\source1.f90\source1.f90\x64\Debug\source1.f90.lib" mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib mkl_lapack95_lp64.lib "x64\Debug\Source1.obj" "x64\Debug\Source2.obj"&lt;/P&gt;</description>
      <pubDate>Tue, 02 Apr 2013 03:05:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917263#M12702</guid>
      <dc:creator>Vineet_Y_</dc:creator>
      <dc:date>2013-04-02T03:05:18Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt;...Send me the C/C++ files</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917264#M12703</link>
      <description>&amp;gt;&amp;gt;...Send me the C/C++ files and I will execute them to verify precision control...

Here they are: there are two solutions ( VS 2008 ), one for the Intel C++ compiler and one for the Microsoft C++ compiler.

Note: the /Qlong-double and /Qpc80 options are used for the Intel C++ compiler</description>
      <pubDate>Tue, 02 Apr 2013 05:04:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917264#M12703</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-04-02T05:04:36Z</dc:date>
    </item>
    <item>
      <title>Outputs for Reference:</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917265#M12704</link>
      <description>Outputs for Reference:

&lt;STRONG&gt;[ Intel C++ compiler ( 16-byte long double data type is used (!) ) ]&lt;/STRONG&gt;

32-bit Windows platform - Configuration: RELEASE
Test-Case 1
&lt;STRONG&gt;Size of [ long double ] is: 16&lt;/STRONG&gt;
Test-Case 2
_CW_DEFAULT &amp;amp; ALLBITSON: 0x9001F
_PC_24 &amp;amp; _MCW_PC       : 0xA001F
_PC_53 &amp;amp; _MCW_PC       : 0x9001F
_PC_64 &amp;amp; _MCW_PC       : 0x8001F
Test-Case 3.1
Accuracy _CW_DEFAULT  - long double - Result: 1.0000000000079181
Sub-Test 3.2
Accuracy _PC_24       - long double - Result: 1.0090389251708984
Test-Case 3.3
Accuracy _PC_53       - long double - Result: 1.0000000000079181
&lt;STRONG&gt;Test-Case 3.4
Accuracy _PC_64       - long double - Result: 1.0000000000000109&lt;/STRONG&gt;
Test-Case 4

Matrix A
         101.0  201.0  301.0  401.0  501.0  601.0  701.0  801.0
         901.0 1001.0 1101.0 1201.0 1301.0 1401.0 1501.0 1601.0
        1701.0 1801.0 1901.0 2001.0 2101.0 2201.0 2301.0 2401.0
        2501.0 2601.0 2701.0 2801.0 2901.0 3001.0 3101.0 3201.0
        3301.0 3401.0 3501.0 3601.0 3701.0 3801.0 3901.0 4001.0
        4101.0 4201.0 4301.0 4401.0 4501.0 4601.0 4701.0 4801.0
        4901.0 5001.0 5101.0 5201.0 5301.0 5401.0 5501.0 5601.0
        5701.0 5801.0 5901.0 6001.0 6101.0 6201.0 6301.0 6401.0

Matrix B
         101.0  201.0  301.0  401.0  501.0  601.0  701.0  801.0
         901.0 1001.0 1101.0 1201.0 1301.0 1401.0 1501.0 1601.0
        1701.0 1801.0 1901.0 2001.0 2101.0 2201.0 2301.0 2401.0
        2501.0 2601.0 2701.0 2801.0 2901.0 3001.0 3101.0 3201.0
        3301.0 3401.0 3501.0 3601.0 3701.0 3801.0 3901.0 4001.0
        4101.0 4201.0 4301.0 4401.0 4501.0 4601.0 4701.0 4801.0
        4901.0 5001.0 5101.0 5201.0 5301.0 5401.0 5501.0 5601.0
        5701.0 5801.0 5901.0 6001.0 6101.0 6201.0 6301.0 6401.0

MFPT Used

Matrix C - Result
         13826808.0  14187608.0  14548408.0  14909208.0  15270008.0  15630808.0  15991608.0  16352408.0
         32393208.0  33394008.0  34394808.0  35395608.0  36396408.0  37397208.0  38398008.0  39398808.0
         50959608.0  52600408.0  54241208.0  55882008.0  57522808.0  59163608.0  60804408.0  62445208.0
         69526008.0  71806808.0  74087608.0  76368408.0  78649208.0  80930008.0  83210808.0  85491608.0
         88092408.0  91013208.0  93934008.0  96854808.0  99775608.0 102696408.0 105617208.0 108538008.0
        106658808.0 110219608.0 113780408.0 117341208.0 120902008.0 124462800.0 128023600.0 131584400.0
        125225200.0 129426000.0 133626800.0 137827600.0 142028400.0 146229200.0 150430000.0 154630800.0
        143791600.0 148632400.0 153473200.0 158314000.0 163154800.0 167995600.0 172836400.0 177677200.0

Press ESC to Exit...
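
A rough way to read Test-Case 3 above: the fewer significand bits each intermediate result keeps, the further the final value drifts from 1. The test source itself is not shown in this post, so the Python sketch below uses a hypothetical stand-in computation ( adding 0.1 ten times ) and a hypothetical helper, round_to_bits, that rounds every intermediate result to 24, 53 or 64 significand bits, the widths selected by _PC_24, _PC_53 and _PC_64. It only emulates the rounding with exact fractions and does not touch the FPU control word.

```python
from fractions import Fraction


def round_to_bits(x, bits):
    """Round Fraction x to the nearest value with a `bits`-bit significand (ties to even)."""
    if x == 0:
        return x
    n, d = abs(x.numerator), x.denominator
    # Pick a power-of-two scale so that floor(|x| * scale) has exactly `bits` bits.
    shift = bits - (n.bit_length() - d.bit_length())
    if (n * Fraction(2) ** shift // d).bit_length() != bits:
        shift -= 1
    scale = Fraction(2) ** shift
    # round() on a Fraction returns an int and rounds halves to even.
    return round(x * scale) / scale


def sum_tenths(bits):
    """Add 0.1 ten times, rounding the constant and every partial sum to `bits` bits."""
    tenth = round_to_bits(Fraction(1, 10), bits)
    total = Fraction(0)
    for _ in range(10):
        total = round_to_bits(total + tenth, bits)
    return total


if __name__ == "__main__":
    for bits in (24, 53, 64):
        error = abs(sum_tenths(bits) - 1)
        print(f"{bits}-bit significand: |sum - 1| = {float(error):.3e}")
```

The printed errors shrink as the significand gets wider, which is the same trend as in Test-Cases 3.2 to 3.4 for the Intel build.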

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

&lt;STRONG&gt;[ Microsoft C++ compiler ]&lt;/STRONG&gt;

32-bit Windows platform - Configuration: RELEASE
Test-Case 1
&lt;STRONG&gt;Size of [ long double ] is: 8&lt;/STRONG&gt;
Test-Case 2
_CW_DEFAULT &amp;amp; ALLBITSON: 0x9001F
_PC_24 &amp;amp; _MCW_PC       : 0xA001F
_PC_53 &amp;amp; _MCW_PC       : 0x9001F
_PC_64 &amp;amp; _MCW_PC       : 0x8001F
Test-Case 3.1
Accuracy _CW_DEFAULT  - long double - Result: 1.0000000000079181
Sub-Test 3.2
Accuracy _PC_24       - long double - Result: 1.0090389251708984
Test-Case 3.3
Accuracy _PC_53       - long double - Result: 1.0000000000079181
&lt;STRONG&gt;Test-Case 3.4
Accuracy _PC_64       - long double - Result: 1.0000000000079181&lt;/STRONG&gt;
Test-Case 4

Matrix A
         101.0  201.0  301.0  401.0  501.0  601.0  701.0  801.0
         901.0 1001.0 1101.0 1201.0 1301.0 1401.0 1501.0 1601.0
        1701.0 1801.0 1901.0 2001.0 2101.0 2201.0 2301.0 2401.0
        2501.0 2601.0 2701.0 2801.0 2901.0 3001.0 3101.0 3201.0
        3301.0 3401.0 3501.0 3601.0 3701.0 3801.0 3901.0 4001.0
        4101.0 4201.0 4301.0 4401.0 4501.0 4601.0 4701.0 4801.0
        4901.0 5001.0 5101.0 5201.0 5301.0 5401.0 5501.0 5601.0
        5701.0 5801.0 5901.0 6001.0 6101.0 6201.0 6301.0 6401.0

Matrix B
         101.0  201.0  301.0  401.0  501.0  601.0  701.0  801.0
         901.0 1001.0 1101.0 1201.0 1301.0 1401.0 1501.0 1601.0
        1701.0 1801.0 1901.0 2001.0 2101.0 2201.0 2301.0 2401.0
        2501.0 2601.0 2701.0 2801.0 2901.0 3001.0 3101.0 3201.0
        3301.0 3401.0 3501.0 3601.0 3701.0 3801.0 3901.0 4001.0
        4101.0 4201.0 4301.0 4401.0 4501.0 4601.0 4701.0 4801.0
        4901.0 5001.0 5101.0 5201.0 5301.0 5401.0 5501.0 5601.0
        5701.0 5801.0 5901.0 6001.0 6101.0 6201.0 6301.0 6401.0

MFPT Used

Matrix C - Result
         13826808.0  14187608.0  14548408.0  14909208.0  15270008.0  15630808.0  15991608.0  16352408.0
         32393208.0  33394008.0  34394808.0  35395608.0  36396408.0  37397208.0  38398008.0  39398808.0
         50959608.0  52600408.0  54241208.0  55882008.0  57522808.0  59163608.0  60804408.0  62445208.0
         69526008.0  71806808.0  74087608.0  76368408.0  78649208.0  80930008.0  83210808.0  85491608.0
         88092408.0  91013208.0  93934008.0  96854808.0  99775608.0 102696408.0 105617208.0 108538008.0
        106658808.0 110219608.0 113780408.0 117341208.0 120902008.0 124462808.0 128023608.0 131584408.0
        125225208.0 129426008.0 133626808.0 137827616.0 142028416.0 146229216.0 150430016.0 154630816.0
        143791616.0 148632416.0 153473216.0 158314016.0 163154816.0 167995616.0 172836416.0 177677216.0

Press ESC to Exit...
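
A possible reading of the differences in the last three rows of Matrix C, assuming the results are stored or accumulated as 32-bit floats ( an assumption, the test source is not shown in this post ): the exact products are integers, but a 32-bit float has a 24-bit significand, so above 2^27 = 134217728 only multiples of 16 are representable. The Python sketch below rebuilds Matrix A from the printout ( Matrix B is identical ), computes the product exactly with integers and lists the entries that do not survive a round trip through single precision. The helper name to_float32 is hypothetical.

```python
import struct


def to_float32(x):
    """Round a number to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]


N = 8
# Matrix A as printed above: 101, 201, ..., 6401 in row-major order (B is the same).
A = [[101 + 100 * (N * i + j) for j in range(N)] for i in range(N)]

# Exact integer product C = A * A, with no floating-point rounding involved.
exact = [[sum(A[i][k] * A[k][j] for k in range(N)) for j in range(N)]
         for i in range(N)]

if __name__ == "__main__":
    for i in range(N):
        for j in range(N):
            rounded = int(to_float32(exact[i][j]))
            if rounded != exact[i][j]:
                print(f"C[{i + 1}][{j + 1}]: exact {exact[i][j]}, "
                      f"nearest float32 {rounded}")
```

For example, the exact value of C[7][4] is 137827608, which lies halfway between the neighbouring floats 137827600 and 137827616, the two values printed by the Intel and Microsoft builds. Entries below 2^27, such as C[6][6] = 124462808, are exactly representable, so the Intel value 124462800 there would have to come from rounding of intermediate sums rather than from the final store.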

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////</description>
      <pubDate>Tue, 02 Apr 2013 05:08:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917265#M12704</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-04-02T05:08:38Z</dc:date>
    </item>
    <item>
      <title>Hi Vineet,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917266#M12705</link>
      <description>Hi Vineet,

Here are a couple of notes. Overall, try to use the same set of command line options on both platforms ( the options below are for Windows ):

- Use the same instruction set, for example SSE2 ( &lt;STRONG&gt;/QxSSE2&lt;/STRONG&gt; ), or SSE4.2 ( &lt;STRONG&gt;/QxSSE4.2&lt;/STRONG&gt; )

- Use &lt;STRONG&gt;/fp:precise&lt;/STRONG&gt;, &lt;STRONG&gt;/Qprec&lt;/STRONG&gt;, &lt;STRONG&gt;/Qpc64&lt;/STRONG&gt; or &lt;STRONG&gt;/Qpc80&lt;/STRONG&gt; with &lt;STRONG&gt;/Qlong-double&lt;/STRONG&gt; ( it enables the &lt;STRONG&gt;80-bit&lt;/STRONG&gt; 'long double' data type when the Intel C++ compiler is used )

- OpenMP is used on the Linux platform, but I don't see the &lt;STRONG&gt;/Qopenmp&lt;/STRONG&gt; switch on the Windows platform

- Check the OpenMP report with &lt;STRONG&gt;/Qopenmp-report{0|1|2}&lt;/STRONG&gt; ( it controls the OpenMP parallelizer diagnostic level )</description>
      <pubDate>Tue, 02 Apr 2013 23:24:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inversion-precision-issues-on-two-different-processors/m-p/917266#M12705</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-04-02T23:24:00Z</dc:date>
    </item>
  </channel>
</rss>

