"Note that if iparm(60) is equal to 1 or 2, and the total peak
memory needed for strong local arrays is less than
MKL_PARDISO_OOC_MAX_CORE_SIZE, the program stops
with error -9. In this case, increase of
MKL_PARDISO_OOC_MAX_CORE_SIZE is recommended."
Let me clarify this instruction. PARDISO has two modes of execution: in-core, when all LU factors fit in RAM, and out-of-core (OOC) otherwise. In OOC mode most of the LU factors are stored on the hard disk. OOC mode (iparm(60) = 1 or 2) needs to know how much memory PARDISO may use; this is set by the MKL_PARDISO_OOC_MAX_CORE_SIZE parameter. PARDISO stops with error = -9 when the specified amount of memory is not enough to solve the task. In that case, the amount of memory should be increased.
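To make the direction of the inequality explicit, here is a minimal sketch of the condition as this reply describes it. It is only a model of the rule, not PARDISO's actual code; the function name, the error macro, and the use of megabytes for both sizes are assumptions made for illustration.

```c
/* Hypothetical model of the OOC memory check described in the reply above.
   This is NOT PARDISO source code; names and units (MB) are assumptions. */
#define OOC_NOT_ENOUGH_MEMORY (-9)

/* peak_mb:     total peak memory needed for the local arrays
   max_core_mb: value of MKL_PARDISO_OOC_MAX_CORE_SIZE
   Returns -9 when the need exceeds the limit, 0 otherwise. */
int ooc_memory_status(long peak_mb, long max_core_mb)
{
    return (peak_mb > max_core_mb) ? OOC_NOT_ENOUGH_MEMORY : 0;
}
```

Under this reading, error -9 appears only when the peak requirement is larger than the configured limit, so raising the limit is the fix.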
If total peak memory > MKL_PARDISO_OOC_MAX_CORE_SIZE, PARDISO stops with error=-9. That makes perfect sense.
However, the manual seems to indicate that if
total peak memory < MKL_PARDISO_OOC_MAX_CORE_SIZE, PARDISO stops with error=-9.
That's where my confusion is.
1. In-core vs. OOC: it seems the most flexible setting is iparm(60)=1 (letting PARDISO decide). My question is: is iparm(60)=1 as efficient as an explicit setting? For example, in the article Tips using PARDISO (http://software.intel.com/en-us/articles/pardiso-tips/), I found the following:
"To achieve the best performance, we do not recommend the use of the out-of-core (OOC) PARDISO for small matrices. We recommend using the in-core PARDISO for all cases where the memory required for storing PARDISO factors exceeds the RAM by less than 30%. The size of the factors in kbytes can be obtained with the help of iparm(17) after phase 11 (see Intel MKL reference manual)."
This seems to indicate that I should use the phase 11 result to decide between in-core and OOC, and set iparm(60)=0 or iparm(60)=2 myself.
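A minimal sketch of that decision, assuming the 30% rule from the quoted article is applied literally. The helper name and the way the RAM size is obtained are hypothetical; the only PARDISO-specific input is the factor size in kbytes read from iparm(17) after phase 11 (that is iparm[16] in a zero-based C array).

```c
/* Hypothetical helper, not part of MKL: picks a value for iparm(60).
   factors_kb: size of the factors in kbytes, from iparm(17) after phase 11
   ram_kb:     physical RAM in kbytes, supplied by the caller (assumption)
   Returns 0 (in-core) when the factors exceed RAM by less than 30%,
   otherwise 2 (OOC). */
int choose_iparm60(long long factors_kb, long long ram_kb)
{
    /* factors < 1.3 * RAM, written in integer arithmetic */
    if (10 * factors_kb < 13 * ram_kb) {
        return 0;
    }
    return 2;
}
```

For example, with 1,000,000 kbytes of RAM, factors of 1,200,000 kbytes would stay in-core, while factors of 2,000,000 kbytes would select OOC.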
2. OOC not threaded: I found the following in the article How to use OOC PARDISO (http://software.intel.com/en-us/articles/how-to-use-ooc-pardiso):
"The current OOC version does not use threading, thus both the parameter iparm(3) and the variable MKL_NUM_THREADS must be set to 1 when iparm(60) is equal to 1 or 2."
Does this mean OOC is running in sequential mode (instead of parallel shared memory mode)?
Answer to the first question: both variants are possible. The sentence from the manual that you mention means that sometimes the in-core version of PARDISO, even when it uses the HDD as swap, can provide better performance than OOC PARDISO. As for the second question: it depends on the version of MKL you use. If you use 10.2 update 2 or a later version, then OOC PARDISO is parallelized.
With best regards,