During the solve step of Pardiso for the system A*X = B, where B contains multiple right-hand sides, we see that the memory consumed by Pardiso depends strongly on the number of cores. What we observe is roughly mem(total) = mem(A) + mem(X) + #cores * mem(B). The documentation mentions this indirectly, but we would like to have tighter control over Pardiso's memory use.
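To make the scaling concrete, here is a small sketch of the relation we observe. It is only our empirical model, not documented Pardiso behavior; the function name, the parameters, and the assumption of dense double-precision storage for X and B (8 bytes per entry) are ours.

```python
def estimated_solve_memory(mem_a, n, nrhs, n_threads, bytes_per_entry=8):
    """Estimate total bytes during the solve step using the observed model
    mem(total) = mem(A) + mem(X) + #cores * mem(B).

    mem_a is the memory held for A in bytes (taken as given here); X and B
    are assumed to be dense n-by-nrhs arrays.
    """
    mem_x = n * nrhs * bytes_per_entry
    mem_b = n * nrhs * bytes_per_entry
    return mem_a + mem_x + n_threads * mem_b


if __name__ == "__main__":
    # Same problem, different core counts: only the #cores * mem(B) term grows.
    for threads in (1, 4, 16):
        print(threads, estimated_solve_memory(0, 1_000_000, 100, threads))
```

With these made-up sizes, mem(B) is 800 MB, so going from 4 to 16 threads adds about 9.6 GB in this model.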
One approach we can use is to split the right-hand sides B according to the number of cores, so that the same amount of memory is used in the end regardless of the system.
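A minimal sketch of that splitting, assuming the workspace really scales as #cores * mem(B) for the block of right-hand sides passed to a single solve call. The helper name and the `budget_columns` parameter (how many columns of B we are willing to pay for in workspace) are hypothetical and not part of MKL.

```python
def rhs_batches(nrhs, n_threads, budget_columns):
    """Split nrhs right-hand-side columns into (start, stop) ranges.

    The batch width is chosen so that n_threads * width <= budget_columns,
    which keeps the assumed workspace roughly constant across machines.
    At least one column is always processed per batch.
    """
    width = max(1, budget_columns // n_threads)
    return [(start, min(start + width, nrhs)) for start in range(0, nrhs, width)]


if __name__ == "__main__":
    # With a budget of 16 columns, a 4-core machine solves 4 columns at a
    # time and an 8-core machine solves 2 columns at a time.
    print(rhs_batches(10, 4, 16))
    print(rhs_batches(10, 8, 16))
```

Each range would then be passed to its own solve call with nrhs = stop - start, reusing the existing factorization. The obvious drawback is that smaller batches give Pardiso less work per call, so some of the parallel efficiency is likely lost.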
Is there another way to ensure that the workspace Pardiso allocates is limited to a given multiple of mem(B), independent of the number of cores used for solving? Ideally we want to benefit from the parallelism without paying the full memory cost, or at least find a better balance where we give up some performance, rather than simply reducing the number of OMP threads. I did not find any setting that gives that level of control other than reducing the number of threads.
We did indeed get this from iparm, and we also trapped malloc to confirm. It is the threaded 2018 version of MKL. We saw the same behavior with 2017 but, we believe, not with earlier versions (no hard confirmation of that, though).