Showing results for

- Intel Community
- Software Development SDKs and Libraries
- Intel® oneAPI Math Kernel Library & Intel® Math Kernel Library
- FPE when trying to use Pardiso Unsymettric Solver

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

Highlighted
##

I recently downloaded the latest MKL library to fix another unsymmetric solver failure. While the new version fixes the previous failure in my application the latest release fails on even a small test problem. The failure occurs if you trap for FPEs. Since our main code traps for all exceptions we cannot use Pardiso unless the source of this FPE is found and removed. It is easy to recreate the problem using the attached tar file. I used the GNU compiler and I created a simple test case by modifying the test program supplied with the MKL library solvers so that it would read in a small test matrix which I am trying to solve. The matrix is unsymettric and other linear solvers - a dense solver, and SuperLU solve the exact same linear system with no errors. Pardiso also seems to solve the system as long as you ignore the FPE. We cannot do this in our production code for quality reasons and I am sure the MKL group doesn't intend to have the FPEs occuring. I will paste in the traceback that appears to show that this error occurs in the log function called from within the factorization routine in Pardiso..

error: java.lang.Exception: error: Macro run-time error: SIGFPE: floating point exception

Command: InitializeSolution

CompletedCommand: InitializeSolution

In: []

Recoverability: Non-recoverable

ServerStack: [

libStarNeo.so: SignalHandler::signalHandlerFunction(int, siginfo*, void*),

libpthread.so.0,

libm.so.6,

libm.so.6(log+0x14),

libmkl_core.so(mkl_pds_lp64_mps_pardiso+0x360),

libmkl_core.so(mkl_pds_lp64_pardiso_c+0x238d),

libmkl_core.so(mkl_pds_lp64_pardiso+0x454),

libmkl_intel_lp64.so(PARDISO+0x86),

libSundials.so(callPARDISO+0xd7),

The attached tar file contains the input matrix and uses the rows sums as the right hand side. The expected solution then is all 1's. If you comment out the call to feenableexception() the test case will obtain the correct solution. With the call in the test program the code halts on the FPE.

The following code is added to trap FPEs..

#define _GNU_SOURCE

#include

....

feenableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW);

The Makefile is set up to use gcc and also is set up to point to the MKL library version. You also must set LD_LIBRARY_PATH on your system to point to the location of the MKL library files.

I am running this case on a Linux Red Hat Dell workstation. The MKL verion I am running is 10.3.7.256

Gene Poole

Cd-adapco

gene_poolecd-adapco_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-13-2012
06:20 PM

8 Views

FPE when trying to use Pardiso Unsymettric Solver

error: java.lang.Exception: error: Macro run-time error: SIGFPE: floating point exception

Command: InitializeSolution

CompletedCommand: InitializeSolution

In: []

Recoverability: Non-recoverable

ServerStack: [

libStarNeo.so: SignalHandler::signalHandlerFunction(int, siginfo*, void*),

libpthread.so.0,

libm.so.6,

libm.so.6(log+0x14),

libmkl_core.so(mkl_pds_lp64_mps_pardiso+0x360),

libmkl_core.so(mkl_pds_lp64_pardiso_c+0x238d),

libmkl_core.so(mkl_pds_lp64_pardiso+0x454),

libmkl_intel_lp64.so(PARDISO+0x86),

libSundials.so(callPARDISO+0xd7),

The attached tar file contains the input matrix and uses the rows sums as the right hand side. The expected solution then is all 1's. If you comment out the call to feenableexception() the test case will obtain the correct solution. With the call in the test program the code halts on the FPE.

The following code is added to trap FPEs..

#define _GNU_SOURCE

#include

....

feenableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW);

The Makefile is set up to use gcc and also is set up to point to the MKL library version. You also must set LD_LIBRARY_PATH on your system to point to the location of the MKL library files.

I am running this case on a Linux Red Hat Dell workstation. The MKL verion I am running is 10.3.7.256

Gene Poole

Cd-adapco

12 Replies

Highlighted
##

Gennady_F_Intel

Moderator

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-13-2012
10:17 PM

8 Views

Gene, the behaviour is reproduced. we will investigate this case.

Highlighted
##

Hi,

So, your policyis to not haveFPEs: FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW

or just to analyse reasons why they occur.

In my opinion,MKL should not produce internally:

NANs if they are absent in input data

DIV by zero or OVERFLOWs

But UNDERFLOWs are possible when there are denormals.

So could you please try to use flush-to-zero mode in your application when denormals becomeas zeros.

barragan_villanueva_

Valued Contributor I

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-13-2012
10:38 PM

8 Views

So, your policyis to not haveFPEs: FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW

or just to analyse reasons why they occur.

In my opinion,MKL should not produce internally:

NANs if they are absent in input data

DIV by zero or OVERFLOWs

But UNDERFLOWs are possible when there are denormals.

So could you please try to use flush-to-zero mode in your application when denormals becomeas zeros.

Highlighted
##

Gennady_F_Intel

Moderator

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-13-2012
10:56 PM

8 Views

Gene, the dividing by zero happened on the reordering stage, some investigations the cause of the problem needed to be done.

--Gennady

Highlighted
##

Curiously, when the extra parameter DPARM is added to the Pardiso calls and the test program is compiled with GCC, linked with Pardiso 4.1.2/4.1.3 (32 or 64 bit) and run, the program runs to completion, giving a solution vector of ones.

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-14-2012
05:41 AM

8 Views

Highlighted
##

Victor,

I will forward your questions to the development group at StarCCM+ to get an answer on your question about underflow. I did not isolate which type exeception was causing these failures. It does appear to be an execption which we can ignore but I cannot ignore it in our code - onlt the small test program.

Gene

gene_poolecd-adapco_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-15-2012
02:44 PM

8 Views

I will forward your questions to the development group at StarCCM+ to get an answer on your question about underflow. I did not isolate which type exeception was causing these failures. It does appear to be an execption which we can ignore but I cannot ignore it in our code - onlt the small test program.

Gene

Highlighted
##

Can you be a bit more specific on this extra parameter added to Pardiso calls? Also is Pardiso 4.1.2/4.1.3 a newer development version that what I am using?

Gene Poole

gene_poolecd-adapco_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-15-2012
02:46 PM

8 Views

Gene Poole

Highlighted
##

So I wonder if there is another reordering that I can use to bypass the issue? The systems we are solving are very small by sparse solver standards - but they get factored/solved many thousands of time potentially - so while sparse solver technology is a huge benefit it the reordering phase is not as critical as it might be for very large sparse systems.

gene_poolecd-adapco_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-15-2012
02:50 PM

8 Views

Highlighted
##

Versions 4.1.2 (stable) and 4.1.3 (development) are not available in the Pardiso bundled into MKL. They are available from the Pardiso developer at the project page, http://www.pardiso-project.org/ .

The extra parameter is extra in the sense that it is not present in Pardiso 3.x, which is the version included in MKL. Your code does not use the contents of the extra array, but this array needs to be provided as a dummy when calling Pardiso 4.x.

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-15-2012
08:35 PM

8 Views

The extra parameter is extra in the sense that it is not present in Pardiso 3.x, which is the version included in MKL. Your code does not use the contents of the extra array, but this array needs to be provided as a dummy when calling Pardiso 4.x.

Highlighted
##

*> ... the reordering phase is not as critical as it might be for very large sparse systems*

I would want to run tests to verify that expectation.

If reordering could reduce the bandwidth by a factor of 2, assuming that your system is banded, the solve phase would run at only 50 percent of the possible speed if reordering were to be bypassed.

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-15-2012
08:40 PM

8 Views

If reordering could reduce the bandwidth by a factor of 2, assuming that your system is banded, the solve phase would run at only 50 percent of the possible speed if reordering were to be bypassed.

Highlighted
##

Hi Gene,

Konstantin_A_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-20-2012
07:39 AM

8 Views

First of all, let me notice that the issue with division by zero was isolated and fixed internally. It will be hopefully fixed in the next update of MKL.

Then, I would recommend you to try any of these workarounds:

1) Switch-off matching parameter (iparm(13)=0)

2) Switch-off reordering (iparm(5)=1 and fill perm array with the sequence of values={1,2,..n}). As you noted correctly, if your matrix is very small, reordering hardly improve the performance (and may even slow-down computations). But please check whether it's true for your specific testcase.

Best regards,

Konstantin

Highlighted
##

Konstantin,

Thanks for the fast response!*took a look at your suggested work arounds *so here is an update from my test runs.

You proposed 2 possible work arounds; I tried 3::

1 ) Disable ordering only and use a perm array of int values 1,2, 3...

2) Disable the matching parameter option - iparm[12]=0 ( 0 based index for iparm )

3) Both 1 & 2

It seems that the matching parameter option is the source of the FPE rather than reordering. If fact, If Option 1 above fails with FPE on all of 3 test cases I have. Option 2 always gets me a factorization but there is some issue with the row Sum test for the smallest test case. I think an absolute value of 1e-4 is a bit too high for an error but I also did no rigorous analysis of condition number, etc. I do know that with your reordering enabled the error is much smaller when I do the row sum test. I also have no accuracy issues using SuperLU for this problem so it is unique to Pardiso at this point.

Option 3 always works just like 2 and also the Rows sum tests are good for all three test cases. Again the row sum test is "failing" only on the smallest test case - 242 equations.

I am going to attach an updated test program which accepts as input the filename of the test matrix. I have included all three test matrices. Our typical size is NOT 242 equations. That was just the smallest case I could get easily and it produced the error. More typical sizes are a few thousand equations with a fill ratio of around 5 ( these are basically poisson like equations ).

I looked at the cost of leaving out reordeing. It approximately doubles the size of L+U compared to the L+U when using the internal reordering option. The flops go up more than the size of L+U. It appears to be 3X or a little more, But, in these small problems it takes about as long to reorder as to factor so it wouldn't be too much slower as long as we don't use a much finer mesh in these systems that can lead to 40k or 50k equations.

So to summarize, it appears that the matching algorithm is teh real culprit here, not reordering. Turning off the matching algorithm alone fixes the FPE issue in my examples but it does introduce a noticeable error in the solution in one of my test cases.

Gene

gene_poolecd-adapco_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-20-2012
05:35 PM

8 Views

Thanks for the fast response!

You proposed 2 possible work arounds; I tried 3::

1 ) Disable ordering only and use a perm array of int values 1,2, 3...

2) Disable the matching parameter option - iparm[12]=0 ( 0 based index for iparm )

3) Both 1 & 2

It seems that the matching parameter option is the source of the FPE rather than reordering. If fact, If Option 1 above fails with FPE on all of 3 test cases I have. Option 2 always gets me a factorization but there is some issue with the row Sum test for the smallest test case. I think an absolute value of 1e-4 is a bit too high for an error but I also did no rigorous analysis of condition number, etc. I do know that with your reordering enabled the error is much smaller when I do the row sum test. I also have no accuracy issues using SuperLU for this problem so it is unique to Pardiso at this point.

Option 3 always works just like 2 and also the Rows sum tests are good for all three test cases. Again the row sum test is "failing" only on the smallest test case - 242 equations.

I am going to attach an updated test program which accepts as input the filename of the test matrix. I have included all three test matrices. Our typical size is NOT 242 equations. That was just the smallest case I could get easily and it produced the error. More typical sizes are a few thousand equations with a fill ratio of around 5 ( these are basically poisson like equations ).

I looked at the cost of leaving out reordeing. It approximately doubles the size of L+U compared to the L+U when using the internal reordering option. The flops go up more than the size of L+U. It appears to be 3X or a little more, But, in these small problems it takes about as long to reorder as to factor so it wouldn't be too much slower as long as we don't use a much finer mesh in these systems that can lead to 40k or 50k equations.

So to summarize, it appears that the matching algorithm is teh real culprit here, not reordering. Turning off the matching algorithm alone fixes the FPE issue in my examples but it does introduce a noticeable error in the solution in one of my test cases.

Gene

Highlighted
##

Gene,

Gennady_F_Intel

Moderator

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-20-2012
07:25 PM

8 Views

Howthis issueis criticalfor you?

Whendo you expectthe solution to thisproblem?

Whendo you expectthe solution to thisproblem?

This is the private thread and you can share this info here.

--Gennady

For more complete information about compiler optimizations, see our Optimization Notice.