- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I recently downloaded the latest MKL library to fix another unsymmetric solver failure. While the new version fixes the previous failure in my application the latest release fails on even a small test problem. The failure occurs if you trap for FPEs. Since our main code traps for all exceptions we cannot use Pardiso unless the source of this FPE is found and removed. It is easy to recreate the problem using the attached tar file. I used the GNU compiler and I created a simple test case by modifying the test program supplied with the MKL library solvers so that it would read in a small test matrix which I am trying to solve. The matrix is unsymettric and other linear solvers - a dense solver, and SuperLU solve the exact same linear system with no errors. Pardiso also seems to solve the system as long as you ignore the FPE. We cannot do this in our production code for quality reasons and I am sure the MKL group doesn't intend to have the FPEs occuring. I will paste in the traceback that appears to show that this error occurs in the log function called from within the factorization routine in Pardiso..
error: java.lang.Exception: error: Macro run-time error: SIGFPE: floating point exception
Command: InitializeSolution
CompletedCommand: InitializeSolution
In: []
Recoverability: Non-recoverable
ServerStack: [
libStarNeo.so: SignalHandler::signalHandlerFunction(int, siginfo*, void*),
libpthread.so.0,
libm.so.6,
libm.so.6(log+0x14),
libmkl_core.so(mkl_pds_lp64_mps_pardiso+0x360),
libmkl_core.so(mkl_pds_lp64_pardiso_c+0x238d),
libmkl_core.so(mkl_pds_lp64_pardiso+0x454),
libmkl_intel_lp64.so(PARDISO+0x86),
libSundials.so(callPARDISO+0xd7),
The attached tar file contains the input matrix and uses the rows sums as the right hand side. The expected solution then is all 1's. If you comment out the call to feenableexception() the test case will obtain the correct solution. With the call in the test program the code halts on the FPE.
The following code is added to trap FPEs..
#define _GNU_SOURCE
#include
....
feenableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW);
The Makefile is set up to use gcc and also is set up to point to the MKL library version. You also must set LD_LIBRARY_PATH on your system to point to the location of the MKL library files.
I am running this case on a Linux Red Hat Dell workstation. The MKL verion I am running is 10.3.7.256
Gene Poole
Cd-adapco
error: java.lang.Exception: error: Macro run-time error: SIGFPE: floating point exception
Command: InitializeSolution
CompletedCommand: InitializeSolution
In: []
Recoverability: Non-recoverable
ServerStack: [
libStarNeo.so: SignalHandler::signalHandlerFunction(int, siginfo*, void*),
libpthread.so.0,
libm.so.6,
libm.so.6(log+0x14),
libmkl_core.so(mkl_pds_lp64_mps_pardiso+0x360),
libmkl_core.so(mkl_pds_lp64_pardiso_c+0x238d),
libmkl_core.so(mkl_pds_lp64_pardiso+0x454),
libmkl_intel_lp64.so(PARDISO+0x86),
libSundials.so(callPARDISO+0xd7),
The attached tar file contains the input matrix and uses the rows sums as the right hand side. The expected solution then is all 1's. If you comment out the call to feenableexception() the test case will obtain the correct solution. With the call in the test program the code halts on the FPE.
The following code is added to trap FPEs..
#define _GNU_SOURCE
#include
....
feenableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW);
The Makefile is set up to use gcc and also is set up to point to the MKL library version. You also must set LD_LIBRARY_PATH on your system to point to the location of the MKL library files.
I am running this case on a Linux Red Hat Dell workstation. The MKL verion I am running is 10.3.7.256
Gene Poole
Cd-adapco
Link Copied
12 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Gene, the behaviour is reproduced. we will investigate this case.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
So, your policyis to not haveFPEs: FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW
or just to analyse reasons why they occur.
In my opinion,MKL should not produce internally:
NANs if they are absent in input data
DIV by zero or OVERFLOWs
But UNDERFLOWs are possible when there are denormals.
So could you please try to use flush-to-zero mode in your application when denormals becomeas zeros.
So, your policyis to not haveFPEs: FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW
or just to analyse reasons why they occur.
In my opinion,MKL should not produce internally:
NANs if they are absent in input data
DIV by zero or OVERFLOWs
But UNDERFLOWs are possible when there are denormals.
So could you please try to use flush-to-zero mode in your application when denormals becomeas zeros.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Gene, the dividing by zero happened on the reordering stage, some investigations the cause of the problem needed to be done.
--Gennady
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Curiously, when the extra parameter DPARM is added to the Pardiso calls and the test program is compiled with GCC, linked with Pardiso 4.1.2/4.1.3 (32 or 64 bit) and run, the program runs to completion, giving a solution vector of ones.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Victor,
I will forward your questions to the development group at StarCCM+ to get an answer on your question about underflow. I did not isolate which type exeception was causing these failures. It does appear to be an execption which we can ignore but I cannot ignore it in our code - onlt the small test program.
Gene
I will forward your questions to the development group at StarCCM+ to get an answer on your question about underflow. I did not isolate which type exeception was causing these failures. It does appear to be an execption which we can ignore but I cannot ignore it in our code - onlt the small test program.
Gene
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Can you be a bit more specific on this extra parameter added to Pardiso calls? Also is Pardiso 4.1.2/4.1.3 a newer development version that what I am using?
Gene Poole
Gene Poole
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
So I wonder if there is another reordering that I can use to bypass the issue? The systems we are solving are very small by sparse solver standards - but they get factored/solved many thousands of time potentially - so while sparse solver technology is a huge benefit it the reordering phase is not as critical as it might be for very large sparse systems.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Versions 4.1.2 (stable) and 4.1.3 (development) are not available in the Pardiso bundled into MKL. They are available from the Pardiso developer at the project page, http://www.pardiso-project.org/ .
The extra parameter is extra in the sense that it is not present in Pardiso 3.x, which is the version included in MKL. Your code does not use the contents of the extra array, but this array needs to be provided as a dummy when calling Pardiso 4.x.
The extra parameter is extra in the sense that it is not present in Pardiso 3.x, which is the version included in MKL. Your code does not use the contents of the extra array, but this array needs to be provided as a dummy when calling Pardiso 4.x.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
> ... the reordering phase is not as critical as it might be for very large sparse systems
I would want to run tests to verify that expectation.
If reordering could reduce the bandwidth by a factor of 2, assuming that your system is banded, the solve phase would run at only 50 percent of the possible speed if reordering were to be bypassed.
I would want to run tests to verify that expectation.
If reordering could reduce the bandwidth by a factor of 2, assuming that your system is banded, the solve phase would run at only 50 percent of the possible speed if reordering were to be bypassed.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Gene,
First of all, let me notice that the issue with division by zero was isolated and fixed internally. It will be hopefully fixed in the next update of MKL.
Then, I would recommend you to try any of these workarounds:
1) Switch-off matching parameter (iparm(13)=0)
2) Switch-off reordering (iparm(5)=1 and fill perm array with the sequence of values={1,2,..n}). As you noted correctly, if your matrix is very small, reordering hardly improve the performance (and may even slow-down computations). But please check whether it's true for your specific testcase.
Best regards,
Konstantin
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Konstantin,
Thanks for the fast response! took a look at your suggested work arounds so here is an update from my test runs.
You proposed 2 possible work arounds; I tried 3::
1 ) Disable ordering only and use a perm array of int values 1,2, 3...
2) Disable the matching parameter option - iparm[12]=0 ( 0 based index for iparm )
3) Both 1 & 2
It seems that the matching parameter option is the source of the FPE rather than reordering. If fact, If Option 1 above fails with FPE on all of 3 test cases I have. Option 2 always gets me a factorization but there is some issue with the row Sum test for the smallest test case. I think an absolute value of 1e-4 is a bit too high for an error but I also did no rigorous analysis of condition number, etc. I do know that with your reordering enabled the error is much smaller when I do the row sum test. I also have no accuracy issues using SuperLU for this problem so it is unique to Pardiso at this point.
Option 3 always works just like 2 and also the Rows sum tests are good for all three test cases. Again the row sum test is "failing" only on the smallest test case - 242 equations.
I am going to attach an updated test program which accepts as input the filename of the test matrix. I have included all three test matrices. Our typical size is NOT 242 equations. That was just the smallest case I could get easily and it produced the error. More typical sizes are a few thousand equations with a fill ratio of around 5 ( these are basically poisson like equations ).
I looked at the cost of leaving out reordeing. It approximately doubles the size of L+U compared to the L+U when using the internal reordering option. The flops go up more than the size of L+U. It appears to be 3X or a little more, But, in these small problems it takes about as long to reorder as to factor so it wouldn't be too much slower as long as we don't use a much finer mesh in these systems that can lead to 40k or 50k equations.
So to summarize, it appears that the matching algorithm is teh real culprit here, not reordering. Turning off the matching algorithm alone fixes the FPE issue in my examples but it does introduce a noticeable error in the solution in one of my test cases.
Gene
Thanks for the fast response! took a look at your suggested work arounds so here is an update from my test runs.
You proposed 2 possible work arounds; I tried 3::
1 ) Disable ordering only and use a perm array of int values 1,2, 3...
2) Disable the matching parameter option - iparm[12]=0 ( 0 based index for iparm )
3) Both 1 & 2
It seems that the matching parameter option is the source of the FPE rather than reordering. If fact, If Option 1 above fails with FPE on all of 3 test cases I have. Option 2 always gets me a factorization but there is some issue with the row Sum test for the smallest test case. I think an absolute value of 1e-4 is a bit too high for an error but I also did no rigorous analysis of condition number, etc. I do know that with your reordering enabled the error is much smaller when I do the row sum test. I also have no accuracy issues using SuperLU for this problem so it is unique to Pardiso at this point.
Option 3 always works just like 2 and also the Rows sum tests are good for all three test cases. Again the row sum test is "failing" only on the smallest test case - 242 equations.
I am going to attach an updated test program which accepts as input the filename of the test matrix. I have included all three test matrices. Our typical size is NOT 242 equations. That was just the smallest case I could get easily and it produced the error. More typical sizes are a few thousand equations with a fill ratio of around 5 ( these are basically poisson like equations ).
I looked at the cost of leaving out reordeing. It approximately doubles the size of L+U compared to the L+U when using the internal reordering option. The flops go up more than the size of L+U. It appears to be 3X or a little more, But, in these small problems it takes about as long to reorder as to factor so it wouldn't be too much slower as long as we don't use a much finer mesh in these systems that can lead to 40k or 50k equations.
So to summarize, it appears that the matching algorithm is teh real culprit here, not reordering. Turning off the matching algorithm alone fixes the FPE issue in my examples but it does introduce a noticeable error in the solution in one of my test cases.
Gene
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Gene,
Howthis issueis criticalfor you?
Whendo you expectthe solution to thisproblem?
Whendo you expectthe solution to thisproblem?
This is the private thread and you can share this info here.
--Gennady
Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page