There are many reasons why you can't reproduce identical results, such as:
1. Single-precision vs. double-precision data types;
2. Different implementations of the same algorithm (rolled loops vs. unrolled loops, FP emulator vs. SSE2, possible vectorization);
3. Use of a GPU (NVIDIA clearly states that results could be different);
4. Or anything else, such as an error in the calculations (as already suggested)...
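To illustrate item 2, here is a minimal sketch of how a rolled loop and a 2-way unrolled loop can produce different single-precision sums from the same data. It is only an illustration: the helper names `f32`, `sum_rolled` and `sum_unrolled2` are made up for this example, and single precision is emulated by rounding every intermediate result through a 4-byte float.

```python
import struct

def f32(x):
    # Round a Python float (a double) to the nearest single-precision value.
    return struct.unpack("<f", struct.pack("<f", x))[0]

def sum_rolled(values):
    # Plain loop with one accumulator, rounded to single precision after each add.
    acc = 0.0
    for v in values:
        acc = f32(acc + v)
    return acc

def sum_unrolled2(values):
    # 2-way unrolled loop with two independent accumulators that are
    # combined at the end, as an unrolling or vectorizing compiler might do.
    acc0 = 0.0
    acc1 = 0.0
    for i in range(0, len(values) - 1, 2):
        acc0 = f32(acc0 + values[i])
        acc1 = f32(acc1 + values[i + 1])
    if len(values) % 2:
        acc0 = f32(acc0 + values[-1])
    return f32(acc0 + acc1)

# 16777216 is 2**24; above it, single precision can no longer hold every integer.
data = [16777216.0, 1.0, 1.0, 1.0]

rolled = sum_rolled(data)
unrolled = sum_unrolled2(data)
print(rolled, unrolled)  # 16777216.0 16777218.0 (the exact sum is 16777219)
```

Both loops add the same four numbers, but the order of the additions differs, and neither result matches the exact sum.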
You could run into trouble even with smaller matrices because of the limitations of the IEEE 754 standard (especially for the single-precision data type, whose 24-bit significand cannot represent every integer above 2^24 = 16777216). Here is an example with 8x8 matrices:
// Matrix A - 8x8 - 'float' type:
101.0 201.0 301.0 401.0 501.0 601.0 701.0 801.0
901.0 1001.0 1101.0 1201.0 1301.0 1401.0 1501.0 1601.0
1701.0 1801.0 1901.0 2001.0 2101.0 2201.0 2301.0 2401.0
2501.0 2601.0 2701.0 2801.0 2901.0 3001.0 3101.0 3201.0
3301.0 3401.0 3501.0 3601.0 3701.0 3801.0 3901.0 4001.0
4101.0 4201.0 4301.0 4401.0 4501.0 4601.0 4701.0 4801.0
4901.0 5001.0 5101.0 5201.0 5301.0 5401.0 5501.0 5601.0
5701.0 5801.0 5901.0 6001.0 6101.0 6201.0 6301.0 6401.0
// Matrix B - 8x8 - 'float' type:
101.0 201.0 301.0 401.0 501.0 601.0 701.0 801.0
901.0 1001.0 1101.0 1201.0 1301.0 1401.0 1501.0 1601.0
1701.0 1801.0 1901.0 2001.0 2101.0 2201.0 2301.0 2401.0
2501.0 2601.0 2701.0 2801.0 2901.0 3001.0 3101.0 3201.0
3301.0 3401.0 3501.0 3601.0 3701.0 3801.0 3901.0 4001.0
4101.0 4201.0 4301.0 4401.0 4501.0 4601.0 4701.0 4801.0
4901.0 5001.0 5101.0 5201.0 5301.0 5401.0 5501.0 5601.0
5701.0 5801.0 5901.0 6001.0 6101.0 6201.0 6301.0 6401.0
// Matrix C = Matrix A * Matrix B - 8x8 - 'float' type:
13826808.0 14187608.0 14548408.0 14909208.0 15270008.0 15630808.0 15991608.0 16352408.0
32393208.0 33394008.0 34394808.0 35395608.0 36396408.0 37397208.0 38398008.0 39398808.0
50959604.0 52600404.0 54241204.0 55882004.0 57522804.0 59163604.0 60804404.0 62445204.0
69526008.0 71806808.0 74087608.0 76368408.0 78649208.0 80930008.0 83210808.0 85491608.0
88092408.0 91013208.0 93934008.0 96854808.0 99775608.0 102696408.0 105617208.0 108538008.0
106658808.0 110219608.0 113780408.0 117341208.0 120902008.0 124462808.0 128023608.0 131584408.0
125225208.0 129426008.0 133626808.0 137827616.0 142028400.0 146229216.0 150430000.0 154630816.0
143791600.0 148632416.0 153473200.0 158314016.0 163154800.0 167995616.0 172836416.0 177677200.0
I've underlined all inexact values.
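One way to check which entries of C are inexact is to compare a single-precision product against exact integer arithmetic. The following is only a sketch under stated assumptions: the matrices are generated from the pattern 101 + 100*(8*i + j) visible in the listing above, single precision is emulated by rounding each product and each partial sum through a 4-byte float, and accumulation runs left to right over k. The accumulation order used for the posted results is not known, so individual digits may differ from the listing. The names `f32`, `matmul_exact` and `matmul_f32` are illustrative.

```python
import struct

def f32(x):
    # Round a Python float (a double) to the nearest single-precision value.
    return struct.unpack("<f", struct.pack("<f", x))[0]

N = 8

# Assumed pattern of the posted matrices: 101, 201, ..., 6401 in row-major order.
A = [[101 + 100 * (N * i + j) for j in range(N)] for i in range(N)]

def matmul_exact(a, b):
    # Python integers have arbitrary precision, so this product is exact.
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def matmul_f32(a, b):
    # Emulated single-precision product with left-to-right accumulation
    # (an assumption, not necessarily the order used in the post).
    c = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            acc = 0.0
            for k in range(N):
                prod = f32(float(a[i][k]) * float(b[k][j]))
                acc = f32(acc + prod)
            c[i][j] = acc
    return c

exact = matmul_exact(A, A)
single = matmul_f32(A, A)

# Collect the positions where the single-precision entry is not the exact value.
inexact = [(i, j) for i in range(N) for j in range(N)
           if single[i][j] != exact[i][j]]

print(f"{len(inexact)} of {N * N} entries are inexact")
for i, j in inexact:
    print(f"C[{i}][{j}]: single = {single[i][j]:.1f}, exact = {exact[i][j]}")
```

For instance, the exact value of the last entry is 177677208, which is not a multiple of 16 and therefore cannot be stored in single precision at that magnitude, whatever the accumulation order.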
Sorry that I couldn't answer your question completely.
Best regards,
Sergey
No, the result provided by PARDISO (a relative residual of 1e-15) is the most accurate that can be achieved in double-precision arithmetic, even theoretically.
iparm(12)
This parameter is reserved for future use. Its value must be set to 0.
I did not say anything about iparm(12) :) I referred to iparm[12] (zero-based C indexing), which is iparm(13) in Fortran.
Moreover, iparm(12) is also used in the latest version of MKL for a nice new feature:
iparm(12) - solving with a transposed or conjugate-transposed matrix.
Regards,
Konstantin