Does PARDISO have internal limits?

Nan_Deng · ‎05-12-2011

We built PARDISO into a program and use it to solve large problems. First we solved a problem with a global matrix of N=464,352 DOFs and 255,008,082 non-zero entries. PARDISO solved it in a little more than 14,000 seconds of wall-clock time; the CPU time would, of course, be much more. We use an HP workstation with a 12-core Pentium and 96 GB of memory. The maximum memory usage is about 21 GB during the solve. The physical problem was then modified, and we now have N=475,152 and 550,890,282 non-zero entries. Now PARDISO stops at Phase=11, the symbolic factorization stage, without printing any error messages to the output or the screen.

My questions are: (1) Does PARDISO have any internal limit? If yes, how can we change it? (2) Is there any way to make PARDISO or the system print out what is wrong? (3) Can any guru give me a hint on what could be going wrong? Singular matrix? Wrong index array? Wrong memory setup? Stack? Heap? Virtual memory?

Thank you very much in advance.

Gennady_F_Intel · ‎05-12-2011

I think this task is swapping. Please try the OOC (out-of-core) version of PARDISO (and, of course, the ILP64 binaries have to be linked in such cases). It should help you solve this task.

- It would also be helpful to see the message-level information: set the option msglvl = 1 and PARDISO will print statistical information on the screen. This is not the debug information you asked for, though.

- You can also try setting iparm(27)=1; PARDISO then checks the integer arrays ia and ja. In particular, PARDISO checks whether column indices are sorted in increasing order within each row.
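For readers who want to run a similar check on their own data before calling the solver, here is a minimal sketch in C. It is not what PARDISO does internally; the function name check_csr and its return codes are hypothetical, and it assumes one-based (Fortran-style) ia and ja arrays for an n-by-n matrix.

```c
#include <stddef.h>

/* Returns 0 if the CSR index arrays look consistent, otherwise a
 * positive code identifying the first problem found.
 * Assumes one-based (Fortran-style) indexing and an n-by-n matrix. */
int check_csr(long long n, const long long *ia, const long long *ja)
{
    if (n <= 0 || ia == NULL || ja == NULL) return 1;
    if (ia[0] != 1) return 2;                    /* first row must start at 1 */
    for (long long i = 0; i < n; ++i) {
        if (ia[i + 1] < ia[i]) return 3;         /* row pointers must not decrease */
        for (long long k = ia[i]; k < ia[i + 1]; ++k) {
            long long col = ja[k - 1];
            if (col < 1 || col > n) return 4;    /* column index out of range */
            if (k > ia[i] && ja[k - 2] >= col)
                return 5;                        /* not strictly increasing in row */
        }
    }
    return 0;
}
```

A nonzero return points at the kind of index problem that the iparm(27) check is meant to catch; it says nothing about the numerical values of the matrix.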

Nan_Deng · ‎05-13-2011

I tried to use the OOC version. Following the PARDISO manual that comes with the compiler's MKL, I specified iparm(60)=2, my link settings include mkl_intel_lp64.lib (are these the ILP64 binaries mentioned in your message?), I made the three-line setup, and I specified a max core size of 90000 (MB, I suppose). I set msglvl = 1, iparm(27)=1, etc. I put printout messages before and after the calls to PARDISO so I would know exactly where the program stops.

The PARDISO routine stopped INSIDE the call for symbolic factorization without any message from the system. It looks just like a normal program exit, except that no solution is generated and none of the subsequent routines are executed. (It never printed the "exit" message after the call.) According to the Task Manager, memory usage briefly peaked at a little over 20 GB before the stop, but I should have plenty of system resources, as my RAM is 96 GB.

It looks like the routine somehow gets interrupted before any real work inside PARDISO is done, so no diagnostic message was generated. Is this a bug or an internal limit? How can I debug it? Any suggestions are greatly appreciated. BTW, my compiler version is 11.1.051, on a 64-bit machine, of course.

Sergey_Solovev__Inte · ‎05-15-2011

Please link the ILP64 library (mkl_intel_ilp64.lib). It should help you. If you use the LP64 library (mkl_intel_lp64.lib), you can use the pardiso_64 interface instead. It is an alternative ILP64 (64-bit integer) version of the PARDISO routine.

Nan_Deng · ‎05-15-2011

First I tried revising my physical problem so I could test the effect of different matrix sizes. It appears that if the total number of non-zero entries in the matrix is less than 2^29 (half a giga), the PARDISO routine runs smoothly (my matrix has the complex*16 data type). However, if the total number of non-zero entries is greater than 2^29, PARDISO stops without generating any messages. In all the runs, the program was linked with mkl_intel_lp64.lib listed under the linker option Input / Additional Dependencies (is this the right place?).
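The arithmetic behind the 2^29 observation can be sketched in C. The non-zero count itself still fits in a signed 32-bit integer, but any 32-bit quantity that scales as four times that count (for example, a byte offset into an array of 4-byte integers) would overflow exactly at 2^29. This is only a guess that is consistent with the observation, not documented PARDISO behavior, and the helper names below are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if value fits in a signed 32-bit integer, 0 otherwise. */
int fits_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* Bytes needed to store nnz complex*16 values (16 bytes each). */
int64_t complex16_bytes(int64_t nnz)
{
    return nnz * 16;
}
```

With the numbers from this thread, fits_int32(550890282) is true, while fits_int32(4 * 255008082) is true and fits_int32(4 * 550890282) is false, which matches the run that worked and the run that stopped.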

Following your suggestion, I changed mkl_intel_lp64.lib to mkl_intel_ilp64.lib and recompiled and relinked the program. This time, no matter what the matrix size is, PARDISO always gives error code -1, "Input inconsistent". It prints "Error_num 7" but gives no explanation. I have tried several small problems of different natures (~200K to 5MB matrix size) and got the error message from all of them. Given that these problems have been QA'ed physically by other means, it is unlikely that the program has coding errors. However, the program is modular, and the module that contains PARDISO needs to read binary data from other modules before it executes. I suspect the "inconsistency" could come from this data exchange. So, in order to build PARDISO with the ILP64 libs, do I need to specify the ILP64 libs for all modules preceding PARDISO? Please confirm whether this is the case.

I must say I am not clear about the difference between the ILP64 and LP64 libs, or the exact meaning of the PARDISO messages, since none of this is documented in the Intel Fortran documentation. Where can I find an explanation so I can understand this better? Sorry if these are simple questions; I am an engineer and only use Fortran occasionally, so I may not know things that are basic to professional programmers.

Konstantin_A_Intel · ‎05-15-2011

Hi, the difference between LP64 and ILP64 is as follows: the ILP64 interface accepts ALL integer data as 8-byte (64-bit). This means that in order to use the ILP64 interface of MKL (or just PARDISO) correctly, you should do the following:

1a) Replace mkl_intel_lp64.lib with mkl_intel_ilp64.lib (seems already done)

1b) Link with usual mkl_intel_lp64.lib, but call pardiso_64 function instead of pardiso.

You can choose between 1a and 1b, it's up to you.

2) Make all scalar and array data that you pass to pardiso (1a case) or pardiso_64 (1b case) 64-bit. For example, your mtype, ia, ja, iparm (and any other integer data) should be declared as:

integer*8, allocatable :: ia(:), ja(:)..... ! Fortran, or

long long int *ia, *ja...... ; // C/C++

Regards,

Konstantin
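To illustrate step 2 above, here is a minimal C sketch of why 32-bit index data handed to a 64-bit integer interface looks "inconsistent", and how existing 32-bit arrays could be widened before the call. The helper names widen_indices and misread_first are hypothetical, the sketch assumes a platform where int is 32 bits and long long is 64 bits, and the actual pardiso or pardiso_64 call is left out.

```c
#include <stdlib.h>
#include <string.h>

/* Copies a 32-bit index array into a newly allocated 64-bit array.
 * Returns NULL on allocation failure; the caller frees the result. */
long long *widen_indices(const int *src, size_t count)
{
    long long *dst = malloc(count * sizeof *dst);
    if (dst == NULL) return NULL;
    for (size_t i = 0; i < count; ++i)
        dst[i] = src[i];                 /* value-preserving conversion */
    return dst;
}

/* Reads the first 8 bytes of a 32-bit array as one 64-bit integer,
 * which is what a 64-bit interface would see if it were handed
 * 32-bit data. src must hold at least two elements. */
long long misread_first(const int *src)
{
    long long v;
    memcpy(&v, src, sizeof v);           /* two 32-bit values fused into one */
    return v;
}
```

For an ia array that starts with 1, 4, the misread first entry is no longer 1, so the row pointers fail any sanity check. Arrays produced by widen_indices keep their values and have the width that the ILP64 library or pardiso_64 expects; the same applies to scalars such as mtype and to iparm.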