I have an application which uses Pardiso (pardiso_64) for large sparse matrix solves.
While running a series of tests for my application in parallel processes with different input, I have discovered that sometimes (but not always), when the matrix is singular (which in this application can happen and is OK), the application freezes. I have isolated the problem to the call to pardiso_64 by adding print statements before and after the call. If I rerun a failed test afterwards on its own (not in parallel with other tests), my application runs to completion, and I can see the Pardiso error code -4 (which I propagate to a text file through my application).
This issue is not isolated to a single input file, but happens randomly for some of the tests in every batch run. When I run each of the frozen tests one by one afterwards, they complete successfully and I see the Pardiso error code -4 in my application's output. So it only happens when the matrix is singular and my application is running in multiple processes at the same time, and even then only occasionally.
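For anyone trying to reproduce this, one way to keep a batch from stalling on such a hang is to give every test process a timeout and rerun the timed-out ones serially afterwards. The following is only a minimal Python sketch under the assumption that each test is a separate command line; `run_one`, `run_batch`, the timeout, and the worker count are hypothetical names and values, not part of the application described above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd, timeout):
    """Run one test command; return its exit code, or None if it timed out."""
    try:
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so a hung
        # process does not linger after the timeout.
        return None


def run_batch(cmds, timeout, workers=4):
    """Run all commands in parallel, then rerun the timed-out ones one by one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(lambda c: run_one(c, timeout), cmds))
    # Indices of the commands that did not finish within the timeout.
    hung = [i for i, code in enumerate(codes) if code is None]
    for i in hung:
        # Serial rerun: no other test process competes for the cores.
        codes[i] = run_one(cmds[i], timeout)
    return codes, hung
```

The returned `hung` list records which tests froze in the parallel pass, so they can be reported separately even if the serial rerun completes.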
I'm running on a 2-core CPU (Intel i7). Pardiso uses 4 threads (up to a full core, I think). I have MKL 11.3 Update 3.
Also, note that if I use mkl_set_num_threads(1) prior to the call to pardiso_64, there are no issues.
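The same kind of thread cap can presumably be applied from outside the application through the `MKL_NUM_THREADS` environment variable. This assumes the application does not call mkl_set_num_threads itself, since the function call takes precedence over the environment. A small Python sketch that divides the logical CPUs among the parallel test processes; `mkl_env` is a hypothetical helper name.

```python
import os


def mkl_env(n_processes, n_cpus=None, base=None):
    """Return a copy of the environment with MKL_NUM_THREADS capped per process."""
    n_cpus = n_cpus or os.cpu_count() or 1
    env = dict(os.environ if base is None else base)
    # Integer share of the logical CPUs, but never fewer than one thread.
    env["MKL_NUM_THREADS"] = str(max(1, n_cpus // n_processes))
    return env
```

The result can be passed as the `env` argument when launching each test process, for example with `subprocess.run(cmd, env=mkl_env(4))`. With 4 logical CPUs and 4 parallel processes this gives one MKL thread per process.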
In addition, I use the parallel METIS nested dissection option, iparm(2) = 3. If I set iparm(2) = 0 (the minimum degree algorithm), there are no issues.
Are there any known issues related to what I observe?
Jens, could you please try the latest Update 4 (11.3.4), where we fixed a regression problem with METIS reordering, and keep us updated on the status? Best, Gennady
Sure, I will do that. However, short term I am happy with using the minimum degree algorithm, so I might not get around to it until next week.
Also, note that both METIS implementations (iparm(2) = 3 and iparm(2) = 2) cause the same behavior. So it's not exclusive to the parallel implementation.