Solved: Sparse PARDISO_64 solver and weird solutions

Jason_d_ · ‎01-23-2017

Hi there,

We are encountering inconsistent behaviour with the sparse PARDISO_64 solver.

We are trying to solve a linear system, which corresponds to the stationary distribution of a discrete-time Markov chain. The transition matrix can be generated for small or large state spaces. The application requires a solution for a large state space (very sparse transition matrix); I include code for a small state space (not very sparse) that reproduces the problem we're encountering. Oddly, when the transition matrix is < 101 in size (ie., 101 x 101), this code produces correct results. However, with matrix size >=101, it gives non-sense results including negative values.

I've run this same matrix in R, using its default dense linear solver, and it solves the system just fine and returns a vector that sums to one with all entries positive.

Can someone let us know if this is a bug? Any suggestions?

Thanks very much,

- Jason

To compile the example:

icc -o test ./tester.c -DMKL_ILP64 -std=c99 -g -m64 -fopenmp -lmkl_intel_ilp64 -lmkl_core -liomp5 -lmkl_intel_thread -lpthread -lmkl_avx -lmkl_vml_avx -lm -liomp5 -ldl -lrt

Output:

$ ./test

=== PARDISO: solving a real nonsymmetric system ===

0-based array is turned ON

PARDISO double precision computation is turned ON

Parallel METIS algorithm at reorder step is turned ON

Scaling is turned ON

Matching is turned ON

Summary: ( reordering phase )

================

Times:

======

Time spent in calculations of symmetric matrix portrait (fulladj): 0.000282 s

Time spent in reordering of the initial matrix (reorder) : 0.002224 s

Time spent in symbolic factorization (symbfct) : 0.000373 s

Time spent in data preparations for factorization (parlist) : 0.000009 s

Time spent in allocation of internal data structures (malloc) : 0.001029 s

Time spent in additional calculations : 0.000441 s

Total time spent : 0.004357 s

Statistics:

===========

Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >

number of equations: 101

number of non-zeros in A: 7649

number of non-zeros in A (%): 74.982845

number of right-hand sides: 1

< Factors L and U >

number of columns for each panel: 128

number of independent subgraphs: 0

number of supernodes: 9

size of largest supernode: 73

number of non-zeros in L: 7376

number of non-zeros in U: 1599

number of non-zeros in L+U: 8975

ooc_max_core_size got from config file=100000

ooc_max_swap_size got from config file=10000

ooc_path got from config file=/tmp

ooc_keep_file got from config file=1

=== PARDISO is running in In-Core mode, because iparam(60)=1 and there is enough RAM for In-Core ===

Percentage of computed non-zeros for LL^T factorization

1 % 2 % 3 % 4 % 5 % 27 % 100 %

=== PARDISO: solving a real nonsymmetric system ===

Single-level factorization algorithm is turned ON

Summary: ( factorization phase )

================

Times:

======

Time spent in copying matrix to internal data structure (A to LU): 0.000000 s

Time spent in factorization step (numfct) : 0.002051 s

Time spent in allocation of internal data structures (malloc) : 0.000871 s

Time spent in additional calculations : 0.000008 s

Total time spent : 0.002930 s

Statistics:

===========

Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b >

number of equations: 101

number of non-zeros in A: 7649

number of non-zeros in A (%): 74.982845

number of right-hand sides: 1

< Factors L and U >

number of columns for each panel: 128

number of independent subgraphs: 0

number of supernodes: 9

size of largest supernode: 73

number of non-zeros in L: 7376

number of non-zeros in U: 1599

number of non-zeros in L+U: 8975

gflop for the numerical factorization: 0.000497

gflop/s for the numerical factorization: 0.242160

Percentage of computed non-zeros for LL^T factorization

1 % 2 % 3 % 4 % 5 % 27 % 100 %

=== PARDISO: solving a real nonsymmetric system ===

Single-level factorization algorithm is turned ON

Summary: ( starting phase is factorization, ending phase is solution )

================

Times:

======

Time spent in copying matrix to internal data structure (A to LU): 0.000000 s

Time spent in factorization step (numfct) : 0.000978 s

Time spent in direct solver at solve step (solve) : 0.000152 s

Time spent in allocation of internal data structures (malloc) : 0.000225 s

Time spent in additional calculations : 0.000010 s

Total time spent : 0.001365 s

Statistics:

===========

Parallel Direct Factorization is running on 1 OpenMP

< Linear system Ax = b > < transpose >

number of equations: 101

number of non-zeros in A: 7649

number of non-zeros in A (%): 74.982845

number of right-hand sides: 1

< Factors L and U >

number of columns for each panel: 128

number of independent subgraphs: 0

number of supernodes: 9

size of largest supernode: 73

number of non-zeros in L: 7376

number of non-zeros in U: 1599

number of non-zeros in L+U: 8975

gflop for the numerical factorization: 0.000497

gflop/s for the numerical factorization: 0.507876

Solution complete

0 0.999995193211361765861511230469

1 0.000032578641480540682096034288

2 -0.001657514998954390250673895935

3 0.033771150451229914324358105659

4 -0.461805824206749093718826770782

5 4.749899498253853380447253584862

6 -38.771420601724457810632884502411

7 259.248515126662823604419827461243

8 -1449.667206926258586463518440723419

9 6874.129777303653099806979298591614

10 -27898.278384158944390946999192237854

11 97440.861928413447458297014236450195

12 -293504.794066837057471275329589843750

13 761189.228222305420786142349243164062

14 -1688760.591960122343152761459350585938

15 3160374.134989297483116388320922851562

16 -4847723.865611193701624870300292968750

17 5708551.324229651130735874176025390625

18 -4176587.558768271468579769134521484375

19 -647335.411046688444912433624267578125

20 7123983.262770898640155792236328125000

21 -10372849.399064248427748680114746093750

22 5097756.375352034345269203186035156250

23 8368298.486144787631928920745849609375

24 -20094661.334219779819250106811523437500

25 15530267.332568019628524780273437500000

26 10869635.995811181142926216125488281250

27 -45071205.791867606341838836669921875000

28 59302210.260278038680553436279296875000

29 -32906353.391984485089778900146484375000

30 -29776674.515638992190361022949218750000

31 100146018.075395077466964721679687500000

32 -142138732.192305266857147216796875000000

33 132670794.247007787227630615234375000000

34 -77887861.591384351253509521484375000000

35 16047679.964387292042374610900878906250

36 8486771.549898184835910797119140625000

37 8432116.450060281902551651000976562500

38 -19044070.376644946634769439697265625000

39 -14876318.802595624700188636779785156250

40 53651210.152546934783458709716796875000

41 -18599756.443840526044368743896484375000

42 -83578205.019382655620574951171875000000

43 132441139.780909150838851928710937500000

44 -44723190.570743650197982788085937500000

45 -90119098.758960306644439697265625000000

46 112308026.259973496198654174804687500000

47 -15883799.202348483726382255554199218750

48 -37476735.814083844423294067382812500000

49 -42720512.310896500945091247558593750000

50 136331725.799720644950866699218750000000

51 -94694026.197748795151710510253906250000

52 -24969756.509624961763620376586914062500

53 36123495.129054218530654907226562500000

54 95371312.336257725954055786132812500000

55 -187649724.744519829750061035156250000000

56 99580546.284361138939857482910156250000

57 61281447.953638210892677307128906250000

58 -75682784.307738944888114929199218750000

59 -94990590.347380638122558593750000000000

60 249469815.636476099491119384765625000000

61 -185170645.040623068809509277343750000000

62 -69561311.618086069822311401367187500000

63 291469651.987545907497406005859375000000

64 -296767916.465866982936859130859375000000

65 107424872.212943851947784423828125000000

66 100278309.586056604981422424316406250000

67 -174662996.371612936258316040039062500000

68 105230287.342239096760749816894531250000

69 10815686.119901048019528388977050781250

70 -81002433.999335929751396179199218750000

71 86416125.554535329341888427734375000000

72 -62056087.853065535426139831542968750000

73 40542349.147022008895874023437500000000

74 -26163544.408309187740087509155273437500

75 10472964.159998646005988121032714843750

76 5665027.077539607882499694824218750000

77 -11247268.616148518398404121398925781250

78 -535759.321254136040806770324707031250

79 23669191.133846260607242584228515625000

80 -44661930.400523439049720764160156250000

81 53449912.937561854720115661621093750000

82 -48916840.807791873812675476074218750000

83 36568077.467748269438743591308593750000

84 -23051005.968265801668167114257812500000

85 12468528.466729646548628807067871093750

86 -5845339.524531649425625801086425781250

87 2387800.565653024241328239440917968750

88 -851709.236374233034439384937286376953

89 265137.756044438341632485389709472656

90 -71829.217305896498146466910839080811

91 16846.539484900742536410689353942871

92 -3393.713058615743648260831832885742

93 580.705784613051491760415956377983

94 -83.109681426136376103386282920837

95 9.736892611661156493596536165569

96 -0.905525373004574585245052276150

97 0.063834158911049598827958106995

98 -0.003165205991651388536811673191

99 0.000096134640968314257489582553

100 -0.000008968170732259750366210938

mecej4 · ‎01-24-2017

I think that you have to debug your code and establish (i) if the matrix and right-hand-side vector are being formed with correct values and (ii) if the code calls Pardiso correctly.

I conducted an independent check using code that I had on hand. The attached zip contains three files -- (i) cnvmat.c, which outputs your matrix data in the form expected by (ii) the Fortran program unsymcsr.f90, and (iii) the converted file csv.mat.

The solution checks out. I suspect that your matrix is not correct in at least the following aspect: you make each of the elements in the last column of the matrix equal 1.0. If you want the sum of the solution values to equal 1.0, as you said, it is the elements of the last row of the matrix that should each equal 1.0. (Please check these statements. I have not studied your program in depth).

View solution in original post

Zhen_Z_Intel · ‎01-24-2017

Dear customer,

I checked with your code and data file, seems no problem with your setting of pardiso. I tried to use csrgemv to calculate y=Ax to verify your result. Seems there's indeed a problem. We will investigate on it. Thanks.

Best regards,
Fiona

mecej4 · ‎01-24-2017

I think that you have to debug your code and establish (i) if the matrix and right-hand-side vector are being formed with correct values and (ii) if the code calls Pardiso correctly.

I conducted an independent check using code that I had on hand. The attached zip contains three files -- (i) cnvmat.c, which outputs your matrix data in the form expected by (ii) the Fortran program unsymcsr.f90, and (iii) the converted file csv.mat.

The solution checks out. I suspect that your matrix is not correct in at least the following aspect: you make each of the elements in the last column of the matrix equal 1.0. If you want the sum of the solution values to equal 1.0, as you said, it is the elements of the last row of the matrix that should each equal 1.0. (Please check these statements. I have not studied your program in depth).

Jason_d_ · ‎01-24-2017

Thanks mecej4. You were right. The matrix was malformed. Once fixed, the code works

mecej4 · ‎01-24-2017

It is nice to hear that you have fixed the issue, but I still have one doubt that I do not see the answer to: what is special about a square matrix of size 101, and how could you have obtained correct results with matrices smaller than that despite having 1-s on the last column instead of the last row?