Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

MKL DSS results change in 17.0.1

AndrewC
New Contributor III
1,154 Views

I have upgraded from Intel Parallel Studio 15.0 to 17.0 Update 1 and am noticing that MKL DSS (aka PARDISO) is giving somewhat different results when solving large real, symmetric sparse matrices. Is this a known issue? I am seeing the differences on both Windows and Linux. For example, has the reordering strategy changed or something like that?

0 Kudos
31 Replies
Gennady_F_Intel
Moderator
872 Views

Andrew, strategy is the same. what do you mean by "different results"?  how big this difference is? is that on the same system with the same environment?

 

0 Kudos
AndrewC
New Contributor III
872 Views

On the identical matrix I get different answers for the number of eigenvalues returned by dss_statistics. Same system, same matrix, but different MKL versions.  I also get some different eigenvalues - maybe by 1%-5%. Enough to fail our many of our QA tests, I can see that I am going to have to put together a reproducer. It will take a few days as it has to include matrix I/O.

FYI, this code has produced identical results for all "recent" MKL versions (that included a working DSS), until MKL that is part of Composer XE 2017.

 

 

	int error = dss_factor_real( dss_handle, type, KMinusSigmaM->x );
	if ( error == MKL_DSS_SUCCESS ){
		_CHARACTER_t statIn[] = "Inertia,FactorTime,ReorderTime,Flops";
		_DOUBLE_PRECISION_t statOut[10];
		error = dss_statistics(dss_handle, opt, statIn, statOut);
		if ( error == MKL_DSS_SUCCESS ){
			int nevpos=statOut[0];
			int nevneg=statOut[1];
			int nzero=statOut[2];

 

0 Kudos
mecej4
Honored Contributor III
872 Views

Depending on the properties of the matrix, it may not be meaningful to ask for and report its inertia. The documentation at https://software.intel.com/en-us/node/470322 says:

Computing inertia can lead to incorrect results for matrices with a cluster of eigenvalues which are near 0.

Inertia of a k-by-k real symmetric positive definite matrix is always (k, 0, 0). Therefore Inertia is returned only in cases of real symmetric indefinite matrices. For all other matrix types, an error message is returned.

I tried the DSS solver on several symmetric matrices from the Matrix Market, namely, BCSSTK01, BCSSTK02, BCSSTK13 and BCSSTK14. The solutions agreed with the expected solutions to ten significant digits, whether I used MKL 10.2.7, 11.3.4 or 2017.0.1.

0 Kudos
AndrewC
New Contributor III
872 Views

I can output the matrix in MM format, do any of the MKL DSS examples come with MM I/O - or can you share your test code that reads MM and runs the MKL DSS.

 

 

0 Kudos
mecej4
Honored Contributor III
872 Views

I used the Harwell-Boeing format files from the Matrix Market. The program is in Fortran, and sets the r.h.s. vector so as to generate the solution x = 1 + (i-1.0)/(n-1.0), i=1,..,n. It ignores any r.h.s. data in the input file.

If that seems useful to you, I can provide it.

0 Kudos
AndrewC
New Contributor III
872 Views

It turns out we are both right....

I create a test program and was able to confirm that , in fact, the latest MKL DSS produced the same results as before.... EXCEPT when the calling sequence is as per my snippet below where I call dss_factor_real twice after changing the "values" of the matrix, not the structure.

This used to work perfectly fine with prior versions of MKL. WIth MKL 2017.1, the second dss_factor_real gives different results than prior versions.

If I delete the complete dss_ data structures, and start again for the second dss_factor_real then results agree across MKL versions.

 

		dss_create(dss_handle, MKL_DSS_MSG_LVL_WARNING + MKL_DSS_ZERO_BASED_INDEXING);
		dss_define_structure(dss_handle,  MKL_DSS_SYMMETRIC, m3->p, m3->m, m3->n, m3->i, m3->nzmax);
		dss_reorder(dss_handle, MKL_DSS_AUTO_ORDER, 0);
		dss_factor_real(dss_handle, MKL_DSS_INDEFINITE, m3->x);
			
			// modify the  m3->x values ( NOT STRUCTURE!)  of matrix and call dss_factor_real again.
			
		dss_factor_real(dss_handle, MKL_DSS_INDEFINITE, m3->x);

 

0 Kudos
AndrewC
New Contributor III
872 Views

For example, these are the results I am getting for a typical problem making repeated calls to dss_factor_real after only changing the 'values' vector , not the structure of the matrix. To get the correct values I have to call dss_delete and dss_create again.

nevpos:79
nevneg:0
nzero:0
factor of second matrix ( these values are wrong!)
nevpos:65
nevneg:14
nzero:0
factor of second matrix, again! after full initialization( these values are correct!)
nevpos:73
nevneg:6
nzero:0

 

 

0 Kudos
mecej4
Honored Contributor III
872 Views

vasci_ wrote:
 For example, these are the results I am getting for a typical problem making repeated calls to dss_factor_real after only changing the 'values' vector , not the structure of the matrix. To get the correct values I have to call dss_delete and dss_create again.

If that happens, it does not happen always, so you need to provide details (preferably complete example code) to reproduce the problem.

Apart from the issues of reproducing the problem, there is one question that occurs to me: what useful end does it serve to factorize a matrix and, before using the computed factors, alter some of the entries and redo the factorization? All but the last factorization are nearly useless and wasted, unless the estimate of the inertia is, in itself, a desired result. I do not know the internal details of Pardiso/DSS, but I do not think that any actual eigenvalue computations are carried out (doing so would be quite costly) -- the inertia estimate and determinant are computed as by-products of the factorization. Changing some matrix entries can change the eigenvalues, so why are you surprised when, for your matrix, the inertia changed? Unless you did an independent calculation of the eigenvalues or the inertia, how do you claim that the results reported by DSS are wrong?

I took the example code pardiso_sym_c.c that is provided (in zipped form, in examples_core_c.zip) and modified it as follows:

/* -------------------------------------------------------------------- */
/* .. Numerical factorization. */
/* -------------------------------------------------------------------- */
    phase = 22;
    a[0]=a[0]+10.0; a[17]=a[17]+5.0;
    for(int irep=0; irep <5; irep++){
       a[0]-=2.0; a[17]-=1.0;
       PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, 
             &ddum, &ddum, &error);
       if ( error != 0 ){
          ...
          }
       printf ("\nFactorization completed ... Inertia = %d %d",iparm[21],iparm[22]);
    }

The modified example ran and printed the following pertinent line five times:

Factorization completed ... Inertia = 5 3

This is a case that fits your description. 

0 Kudos
AndrewC
New Contributor III
872 Views

I I think you are missing the point of what my test shows.

 

  • ​​I initialize DSS and compute the inertia of a matrix 
    • The steps are dss_create, dss_define_structure, dss_order, dss_factor, dss_statistics
  • I change the values (not structure) of the matrix and re-factor, and get the inertia ( the 65,14 values). These values are incorrect
    • The steps are dss_factor, dss_statistics
  • To show they are incorrect , I re-initialize DSS and get the inertia again ( 73,6). These values are correct.
    • The steps are dss_create, dss_define_structure, dss_order, dss_factor, dss_statistics

As to "why" I do this? This is a pretty effective way to find out the # of eigenvalues below a certain frequency (sigma) in the classic vibration problem. Often it is critical to find the # of eigenvalues  below multiple frequencies without actually computing them. As the matrix structure does not change, there is no need to re-analyze the matrix.

I have a fully working example, and am going to submit a Premier Support Ticket on this. 

As I said this is a regression failure in MKL. MKL versions part of Composer XE 2015 and lower do not have this issue.

 

 

 

 

 

0 Kudos
AndrewC
New Contributor III
872 Views

Sample project attached for Visual Studio 2015. I use CSparse to read and manipulate the matrices. I also check the structure of each matrix is identical.

DSSTest Matijv_1.txt Matijv_2.txt

0 Kudos
AndrewC
New Contributor III
872 Views

Just FYI, Intel Premier Support is completely broken. It is impossible to submit an issue.
 

0 Kudos
AndrewC
New Contributor III
872 Views

Update: Premier support came back to life, I was able to submit the issue

0 Kudos
Alexander_K_Intel2
872 Views

Hi All,

Can I ask to attach reproducer to check it quickly on my side? It can simplify process of finding issue

Thanks,

Alex

0 Kudos
AndrewC
New Contributor III
872 Views

Reproducer as a Visual Studio 2105 project is attached  in this thread couple of posts up.

0 Kudos
AndrewC
New Contributor III
872 Views

FYI Premier support ticket #6000164385 

0 Kudos
Gennady_F_Intel
Moderator
872 Views

I see no issues with MKL 2017 u1 and version 11.3.4 as well. I used your project and only things i did, changed the working directory. 

MKL Version:Intel(R) Math Kernel Library Version 2017.0.1 Product Build 20161005 for Intel(R) 64 architecture applications
nevpos:79
nevneg:0
nzero:0
Calling factor of second matrix
nevpos:73
nevneg:6
nzero:0
Calling factor of second matrix, again! after full initialization
nevpos:73
nevneg:6
nzero:0
Press any key to continue . . .

MKL Version:Intel(R) Math Kernel Library Version 11.3.4 Product Build 20160823 for Intel(R) 64 architecture applications
nevpos:79
nevneg:0
nzero:0
Calling factor of second matrix
nevpos:73
nevneg:6
nzero:0
Calling factor of second matrix, again! after full initialization
nevpos:73
nevneg:6
nzero:0
Press any key to continue . . .

 

0 Kudos
Gennady_F_Intel
Moderator
872 Views

Andrew, I also checked the problem you reported on different CPU types ( sse2, sse4.2, AVX and AVX2). resutls are the similars:

>>>DSSTest.exe Matijv_1.txt Matijv_2.txt
Major version:           2017
Minor version:           0
Update version:          1
Product status:          Product
Build:                   20161005
Platform:                Intel(R) 64 architecture
Processor optimization:  Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled processors
================================================================

nevpos:79
nevneg:0
nzero:0
Calling factor of second matrix
nevpos:73
nevneg:6
nzero:0
Calling factor of second matrix, again! after full initialization
nevpos:73
nevneg:6
nzero:0


>>>DSSTest.exe Matijv_1.txt Matijv_2.txt
Major version:           2017
Minor version:           0
Update version:          1
Product status:          Product
Build:                   20161005
Platform:                Intel(R) 64 architecture
Processor optimization:  Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled processors
================================================================

nevpos:79
nevneg:0
nzero:0
Calling factor of second matrix
nevpos:73
nevneg:6
nzero:0
Calling factor of second matrix, again! after full initialization
nevpos:73
nevneg:6
nzero:0


>>>DSSTest.exe Matijv_1.txt Matijv_2.txt
Major version:           2017
Minor version:           0
Update version:          1
Product status:          Product
Build:                   20161005
Platform:                Intel(R) 64 architecture
Processor optimization:  Intel(R) Streaming SIMD Extensions 2 (Intel(R) SSE2) enabled processors
================================================================
nevpos:79
nevneg:0
nzero:0
Calling factor of second matrix
nevpos:73
nevneg:6
nzero:0
Calling factor of second matrix, again! after full initialization
nevpos:73
nevneg:6
nzero:0

0 Kudos
mecej4
Honored Contributor III
872 Views

Vascii_ and Gennady: this is a tricky case! I am able to reproduce Vascii_'s results of #8 on one PC, but not on another, even though I used the same versions of MKL on both machines. I need more time to investigate.

0 Kudos
AndrewC
New Contributor III
872 Views

I am glad someone can reproduce! Hopefully this one will be an interesting one to track down.

I see the problem on multiple machines using MKL 2017.0.1

Machine #1 Windows 7 Intel(R) Xeon(R) CPU E5-1620 v2 @ 3.70GHz, 3701 Mhz, 4 Core(s), 8 Logical Processor(s)

Machine #2 Centos 6.7 Xeon E31245 @ 3.3Hhz, 8 cores.

Machine#3 Windows 7 Intel(R) Xeon(R) CPU E5-1650 0 @ 3.20GHz, 3201 Mhz, 6 Core(s), 6 Logical Processor(s)

Machine#4 Windows 7 Intel(R) Xeon(R) CPU E3-1290 V2 @ 3.70GHz, 3701 Mhz, 4 Core(s), 8 Logical Processor(s)

 

 

 

0 Kudos
mecej4
Honored Contributor III
839 Views

I am not much closer to an explanation, but here is a question that came up.

The two matrix data files that you attached, Matijv_1.txt and Matijv_2.txt, contain no entries in their last rows. In other words, these matrices are 79 X 79, but row 79 is empty. Another interpretation is that the matrices are 78 X 79, but that would violate the DSS requirement for the matrices to be square.

Please clarify.

0 Kudos
Reply