Different results from getrf/getrs, DSS, and Intel PARDISO

Horst · 01-30-2019

Dear all,

I have a small nonsymmetric linear system whose matrix is stored in CSR format (file fort.106), and I need to solve it. I tried three different approaches. First, I converted the three CSR vectors to a dense matrix with mkl_ddnscsr; solving with getrf/getrs (methbutton=6) produces reasonable results. Second, to exploit the sparsity of the system, I applied Intel DSS (methbutton=7). However, the results obtained with this method differ from the getrf/getrs results by far more than machine precision. Third, going one step further to Intel PARDISO (methbutton=8) produces:

forrtl: severe (174): SIGSEGV, segmentation fault occurred

Things I have tried in order to resolve the problems:

- followed https://software.intel.com/en-us/articles/determining-root-cause-of-sigsegv-or-sigbus-errors

- checked the sparse matrix with the sparse matrix checker routines (no error)

At first I suspected that my matrix was the problem. In that case, however, I would expect getrf/getrs to fail as well. Since getrf/getrs works, I suspect that the issue lies in how the sparse solvers are being used.
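One way to judge which solver's result to trust, independent of any solver, is to compute the relative residual ||b - A*x|| / ||b|| directly from the CSR arrays. The following is only a minimal sketch: the module and function names are hypothetical and not part of 2Modes.f90, and it assumes a square real matrix with one-based ia/ja. A plain CSR matrix-vector product like this one does not depend on the order of the column indices within a row.

```fortran
module csr_residual_mod
   implicit none
   private
   public :: csr_rel_residual
   integer, parameter :: dp = kind(1.0d0)
contains
   ! Return ||b - A*x||_2 / ||b||_2 for a square CSR matrix (one-based ia/ja).
   ! If b is the zero vector, the absolute residual norm is returned instead.
   function csr_rel_residual(n, ia, ja, a, x, b) result(rel)
      integer,  intent(in) :: n            ! number of rows (= columns)
      integer,  intent(in) :: ia(n+1)      ! row pointers
      integer,  intent(in) :: ja(*)        ! column indices
      real(dp), intent(in) :: a(*)         ! nonzero values
      real(dp), intent(in) :: x(n), b(n)   ! candidate solution, right-hand side
      real(dp) :: rel, ri, rnorm2, bnorm2
      integer  :: i, k

      rnorm2 = 0.0_dp
      bnorm2 = 0.0_dp
      do i = 1, n
         ! Residual of row i: b(i) minus the dot product of row i with x
         ri = b(i)
         do k = ia(i), ia(i+1) - 1
            ri = ri - a(k) * x(ja(k))
         end do
         rnorm2 = rnorm2 + ri*ri
         bnorm2 = bnorm2 + b(i)*b(i)
      end do

      if (bnorm2 > 0.0_dp) then
         rel = sqrt(rnorm2 / bnorm2)
      else
         rel = sqrt(rnorm2)
      end if
   end function csr_rel_residual
end module csr_residual_mod
```

Calling this once with the getrf/getrs solution and once with the DSS solution shows which of the two actually satisfies the system. A value many orders of magnitude above machine precision points to a wrong solution rather than to rounding.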

You can find my code attached (2Modes.f90). The vectors representing the matrix are in fort.106. The program reads the vectors automatically, so you can just compile and run it. Compiling works fine with:

ifort -o 2Modes.out 2Modes.f90 ${MKLROOT}/lib/intel64/libmkl_blas95_ilp64.a ${MKLROOT}/lib/intel64/libmkl_lapack95_ilp64.a -Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_intel_ilp64.a ${MKLROOT}/lib/intel64/libmkl_intel_thread.a ${MKLROOT}/lib/intel64/libmkl_core.a -Wl,--end-group -liomp5 -lpthread -lm -ldl -i8 -I${MKLROOT}/include/intel64/ilp64 -I${MKLROOT}/include

The methods can be switched with the methbutton variable in line 41. I tried to keep the code as simple as possible; it is essentially extracted from the examples that Intel provides. It would be nice if someone could take a look at this. Thank you in advance.

Best,

Horst K.

mecej4 · 01-30-2019

The source code contains a number of INCLUDE directives, but those included files are not present in the zip file, so it is impossible to build and test the program.

Gennady_F_Intel · 01-30-2019

Please also try the latest version of MKL (2019.1) and let us know whether PARDISO still produces the SIGSEGV.

Horst · 01-31-2019

Thank you for your reply. I have attached a new zip file that contains all the include files you need. Sorry about that. I have just run the code with the latest version, and the SIGSEGV still occurs.

mecej4 · 01-31-2019

When I added

iparm(27) = 1

after the call to pardisoinit() (in order to have Pardiso check the matrix data) and ran your program with Ifort 18.0.5 on Windows, the output was:

*** Error in PARDISO  (incorrect input matrix  ) error_num= 24
*** Input check: j=2928, ja(j)=83, ja(j+1)=79 are incompatible
 Reordering and Symbolic Factorization wrong:           -1

See lines 15924 and 15925 of your data file.

The CSR representation used by Pardiso requires that, for any row, the entries must be sorted so that the column numbers increase. Please investigate and correct this problem, and retry.

Since the MKL implementation of DSS is simply a wrapper for Pardiso, one should suspect that the same error may have been responsible for the incorrect results from DSS.
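If the data file cannot easily be regenerated in sorted order, the column indices can instead be sorted row by row after reading, moving each value along with its column index. The following is only a minimal sketch: the module and subroutine names are hypothetical, one-based ia/ja and real double-precision values are assumed, and duplicate column entries within a row are not merged.

```fortran
module csr_sort_mod
   implicit none
   private
   public :: csr_sort_rows
   integer, parameter :: dp = kind(1.0d0)
contains
   ! Sort the column indices of every CSR row into increasing order,
   ! keeping each nonzero value paired with its column index.
   subroutine csr_sort_rows(n, ia, ja, a)
      integer,  intent(in)    :: n         ! number of rows
      integer,  intent(in)    :: ia(n+1)   ! row pointers (one-based)
      integer,  intent(inout) :: ja(*)     ! column indices
      real(dp), intent(inout) :: a(*)      ! nonzero values
      integer  :: i, k, m, jtmp
      real(dp) :: atmp

      do i = 1, n
         ! Insertion sort on the slice ia(i) .. ia(i+1)-1 of row i
         do k = ia(i) + 1, ia(i+1) - 1
            jtmp = ja(k)
            atmp = a(k)
            m = k - 1
            do while (m >= ia(i))
               ! Tested separately so ja(m) is never read outside the row
               if (ja(m) <= jtmp) exit
               ja(m+1) = ja(m)
               a(m+1)  = a(m)
               m = m - 1
            end do
            ja(m+1) = jtmp
            a(m+1)  = atmp
         end do
      end do
   end subroutine csr_sort_rows
end module csr_sort_mod
```

The subroutine would be called once, after the data has been read and before the first PARDISO or DSS call, with the row count, the ia and ja arrays, and the array of nonzero values. Insertion sort is used because the rows are short and mostly ordered already; for very long rows a different sort may be preferable.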

Horst · 01-31-2019

Thank you for your reply. I am surprised by this error, because I checked the matrix after reading it with the MKL sparse matrix checker routine (as you can see in the code I have now attached). The routine returned MKL_SPARSE_CHECKER_SUCCESS, so I continued with the calculation.

mecej4 · 01-31-2019

I have never used sparse_matrix_checker() before (I did not know that it existed; it has the flavor of a C-oriented design).

I added the following lines to your program, after the lines that read the data from the file:

! Report every pair of adjacent column indices in a row that is not strictly increasing
do k1=1, DimensionL
   do l1=ia(k1)+1,ia(k1+1)-1
      if(ja(l1).le.ja(l1-1))then
         print 11,k1,l1-1,ja(l1-1),l1,ja(l1)
      end if
   end do
end do

11 format(' Row ',i4,' has ja(',i4,') = ',i4,' and ja(',i4,') = ',i4)

and found that there are more errors in the data (if I am not mistaken):

Row   25 has ja( 724) =   34 and ja( 725) =   28
Row   26 has ja( 763) =   33 and ja( 764) =   27
Row   29 has ja( 879) =   34 and ja( 880) =   28
Row   30 has ja( 918) =   33 and ja( 919) =   27
Row   31 has ja( 957) =   32 and ja( 958) =   28
Row   32 has ja( 996) =   31 and ja( 997) =   27
Row   77 has ja(2656) =   86 and ja(2657) =   80
Row   78 has ja(2695) =   85 and ja(2696) =   79
Row   81 has ja(2811) =   86 and ja(2812) =   80
Row   82 has ja(2850) =   85 and ja(2851) =   79
Row   83 has ja(2889) =   84 and ja(2890) =   80
Row   84 has ja(2928) =   83 and ja(2929) =   79

You would know more about how the data should be, when correct, and may need to spend some time checking things out in this regard.

I could, of course, be mistaken, but since I added a small number of lines of code for doing the checking, it should be easy to verify.

Horst · 01-31-2019

Thank you for your reply. Since the matrix passed the checker without any error, I did not consider that the matrix might be wrong. Thank you for your help!