Intel® oneAPI Math Kernel Library

PARDISO crashing 'sometimes' on identical input (reproducible)

thomas_w_5
Hi. I have a project that worked with VS 2013 and an earlier version
of PARDISO. I just upgraded to the VC++ from VS 2015 and the current version of MKL
(already at update 2, released a few days ago).
 
Now, PARDISO crashes - sometimes - using identical input data.
 
That is, using the same parameters and input values, it will sometimes
complete without error, sometimes crash on the 2nd attempt, sometimes on
the 110th, and so on. Given enough attempts, it always crashes using
that data.
 
i have extracted a small example program that shows the error.
Basically, it's a loop that reads the same data and calls PARDISO. i have
tested this on various machines and the crash always happens: not
always after the same loop count, but it does happen.
 
 
The resulting binary is missing the
solver DLLs that i use, but those are in the precompiled binaries zip.
 
 
To test this, the program needs the path to the data files as its
1st parameter, e.g.: "c:>mkltest d:\temp"
 
 
Any ideas what might be causing this and what i can do about this?
 
WM_THX
-thomas woelfer
 
Gennady_F_Intel

Thomas, thanks for reporting the issue; we will have a look at this case. Have you identified in which PARDISO phase the crash happens?

And what was the previous version of MKL?

thomas_w_5
Gennady,
 
>> which phase
 
sorry, i have not. As you can see in the sample code, there is only one call to pardiso (and a final call for the cleanup), as i'm not calling it multiple times for the various phases.
 
>> previous version
 
i really don't know how to find this out. We did buy a full license at the time vc++ 2010 was current and used that together with the vc++ 2010 rtl when we changed to vc++ 2013. The corresponding version of libiompmd.dll was signed on March 10, 2011. So it must have been a version that was current in 2011...
 
WM_THX
-thomas woelfer

Gennady_F_Intel

Thomas,

Sorry, I hadn't looked at the code you provided before I sent the message. I meant the reordering, factorization, or solution phases. We will try to reproduce the problem on our side and keep you updated on the status.

Regarding the MKL version: you can use the mkl_get_version() routine to print all the information about which version of MKL is being used. The MKL Reference Manual also has an example that shows how to call this routine.

--Gennady   

thomas_w_5

Gennady,

ok. i'm waiting for your update.

WM_THX
-thomas woelfer

thomas_w_5

hi Gennady.

Can you tell me how long things like that (reproducing...) normally take?

WM_THX

-thomas woelfer

Gennady_F_Intel

I checked the problem with the latest version of MKL, 11.3 update 2, and could not reproduce it on my side.

Here is the output:

 ...... Intel(R) Math Kernel Library Version 11.3.2 Product Build 20160120 for Intel(R) 64 architecture applications 

Test 0
Test 1
Test 2

........

........

Test 4997
Test 4998
Test 4999

--- 

I used the same project but added the MKL version call:

int len = 198;
char buf[198];
mkl_get_version_string(buf, len);
printf(" \n ...... %s \n\n", buf);

 

thomas_w_5
hi Gennady.
 
Thank you. However, now i'm out of ideas. I added the call to mkl_get_version/printf just like you did.
Rebuilt (release, x64, VS compiler) and restarted.
 
I see the same MKL version string that you posted.
 
However, it still crashes for me: on every single run, though not on every single attempt within a run.
 
I assume you didn't change anything besides adding the call to mkl_get_version. So what am i looking at here? Any ideas of what i should check? What could be causing this?
 
WM_THX
-thomas woelfer

Gennady_F_Intel

I modified your code only by adding the MKL version call before the main loop:

for (int i = 0; i < 5000; i++)
{
    Test(i, argv[1]);
}

Could you check the problem on your side with minimum degree reordering?     pardisoControl[1] = 0;

 

 

thomas_w_5
Hi Gennady.
 
>> with minimum degree reordering?     pardisoControl[1] = 0; 
 
Well, it no longer crashes with that setting.
 
I'm not sure about the consequences of that. Isn't it strange that it crashes
(with = 2) for me (on several machines) but not on your side?
 
Now what should i do?
 
WM_THX
-thomas woelfer

 

Gennady_F_Intel

Hi Thomas,

At least we now have some sort of workaround.

Is it possible to split the PARDISO call into the three phases (phase == 11, 22, and then 33) to understand where the crash happens?

And is there any specific CPU type on which the problem has happened?

-- Gennady 

thomas_w_5
hi Gennady.
 
so i went back to
 
pardisoControl[1] = 2;
 
and split the phases like this:
 
int solvePhase = 11;  // analysis (reordering and symbolic factorization)
PARDISO(pardisoInternalMemory, &maxfct, &mnum, &mtype, &solvePhase, &dimension, values, rowStartIndices, columIndices, NULL, &nrhs, pardisoControl, &msglvl, rightHandSide, solution, &error);
 
solvePhase = 22;  // numerical factorization
PARDISO(pardisoInternalMemory, &maxfct, &mnum, &mtype, &solvePhase, &dimension, values, rowStartIndices, columIndices, NULL, &nrhs, pardisoControl, &msglvl, rightHandSide, solution, &error);
 
solvePhase = 33;  // solve and iterative refinement
PARDISO(pardisoInternalMemory, &maxfct, &mnum, &mtype, &solvePhase, &dimension, values, rowStartIndices, columIndices, NULL, &nrhs, pardisoControl, &msglvl, rightHandSide, solution, &error);
 
i also added some printf() in order to know which of the phases were running.
 
It crashed reliably when solvePhase == 33.
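
A minimal sketch of this kind of phase-by-phase logging. It assumes a hypothetical helper `run_phases` and a caller-supplied `run_phase` callback, neither of which is part of MKL. In the real program the callback would set solvePhase, make the PARDISO call shown above, and return the resulting error value. Flushing stdout before each call means the last line printed names the phase that was running if the process dies inside it.

```cpp
#include <cstdio>
#include <functional>

// Hypothetical helper (not an MKL routine): runs the three PARDISO phases in
// order through a caller-supplied callback. Returns the first phase whose
// callback reports a nonzero error code, or 0 if all three phases succeed.
int run_phases(const std::function<int(int)>& run_phase)
{
    const int phases[] = {11, 22, 33};  // analysis, factorization, solve
    for (int phase : phases) {
        // Print and flush before the call so the log survives a hard crash.
        std::printf("entering phase %d\n", phase);
        std::fflush(stdout);

        const int error = run_phase(phase);
        if (error != 0) {
            std::printf("phase %d returned error %d\n", phase, error);
            return phase;
        }
    }
    return 0;
}
```

This only separates a reported error from a crash: a nonzero return code shows up in the log, while a crash leaves "entering phase N" as the last line.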
 
Concerning the CPUs:
- My main machine (where i do most of the testing) has an
   Intel Core i5-4690 @ 3.5 GHz
  
- Two other test machines (that i only used to repro the crash) have an
   Intel Core i5-3570 @ 3.4 GHz

(i can get hold of other test machines if you want me to. i _did_ test on some other machines, but did not look up their CPU types. Do you want me to do this?)
  

WM_THX
-thomas woelfer

schulzey

I have the same issue (see my post today entitled "Flakey Pardiso since MKL 11.3"). Makes me nervous about Pardiso now!

It worked fine with MKL 11.2 and earlier versions and just started having the problem with MKL 11.3. I have found that if I put the MKL 11.2 runtime DLLs with my application compiled for MKL 11.3 it seems to work OK, although that's not an ideal solution.

It would be good to get it fixed properly in MKL 11.3.

Regards,

Peter

schulzey

Just a further thought - we have only had the problem on various laptops (with i7 processors) and haven't been able to reproduce it on a desktop so far. All our computers have Intel processors.

Gennady, could this be why you can't reproduce it?

Cheers,

Peter

Gennady_F_Intel

I still couldn't see the problem on my side.

Here is the environment I used for reproducing the issue : Win 8.1; MKL 11.3 update 2; static and dynamic linking; LP64.

CPU: Intel(R) Core(TM) i5-4300U CPU.

The log file for all 5000 iterations is attached.

I will check the case on an i7 a little later, and I will also ask the PARDISO developers to help reproduce the problem.

schulzey

I think you might need to try it on an i7 laptop. Our i7 desktops don't seem to have the issue. We are running Windows 10 64-bit.

As mentioned, if we just put the MKL 11.2 runtime DLLs with our MKL 11.3 application it fixes the problem. Does this provide a clue? Are we likely to have other problems if we do this? Note that the only MKL routines we are using are Pardiso and FEAST.

Gennady_F_Intel

Thomas, I was able to reproduce the issue on my side. We will investigate the problem and update this thread. 

schulzey

Hi, are there any developments on this issue yet?

schulzey

I noticed that MKL 11.3 update 3 has just been released. Has this issue been fixed in it? I can't see anything about it in the release notes.

Gennady_F_Intel

The fix for this problem will be available in the next update of MKL 11.3 (update 4).

schulzey

Is there any estimate of when MKL 11.3 (update 4) will be coming out? It has been over 6 months now since the problem was reported.
