I am using the PARDISO solver from MKL 10.0 with default parameters and mtype = -2. Each time I run my program, PARDISO gives slightly different results, although I believe the results should be identical. For example:
8.007956138556888E-4 vs.
8.007956138556876E-4
or
6.178758439680843E-4 vs.
6.178758439680833E-4
Could you tell me why this slight difference occurs?
Thank you in advance.
Thanh
In my application this makes PARDISO difficult to use. The changes in results are within the expected solution tolerance, but since PARDISO cannot tell us that tolerance, we have to use another solver to check! Hence PARDISO is redundant. I am still stuck using the old IMSL sparse solver.
We were burnt by this issue (not with PARDISO but with other solvers in MKL) and had to resort to making sure that the array partitions are aligned correctly.
We have found the MKL documentation not always helpful; however, the array-alignment issue in particular is discussed in chapter 6 of the current (MKL with compiler 11.1.065) user's guide.
On a different note, I think that a difference on the order of the absolute tolerance (by rule of thumb, 1e-14 when using REAL(8) numbers) should not affect the results from an engineering point of view. If it does, that indicates the system of equations is badly scaled, or that you need to rewrite the equations, or that you may want to introduce stricter flags, such as fp:precise, for floating-point operations. We did find such places in our programs and have been more watchful since. I suspect dynamic parallelization will place even more demands on how programs are written.
Abhi
How do we supply a 16-byte-aligned workspace to PARDISO?
We used a brute-force approach: we allocate N+1 elements and pass the section 1:N or 2:N+1 as required. We use the intrinsic function LOC together with MOD to check whether the 16-byte alignment is there or not.
Abhi
I think you meant "in Intel Fortran". We suspected/feared similar issues with libraries on the other "odd" platforms we support and hence did not use any directives.
Abhi
Then I tried creating local copies of the data using mkl_malloc, but the array descriptor of my local variable still said it had only one element, and assigning the copy failed. e.g.
integer p
real*8 d(1)
pointer (p, d)
p = mkl_malloc(1000, 16)
d = myComponent%d
! Now d is only assigned the first value
In Intel Fortran, one uses
!DEC$ ATTRIBUTES ALIGN:16 :: varname
to specify the base alignment of individual variables. These can be local variables or ALLOCATABLE. For example, to declare an array of REAL(8) so that its base is 16-byte aligned, one could use:
REAL(8), ALLOCATABLE :: X(:)
!DEC$ ATTRIBUTES ALIGN:16 :: X
ALLOCATE (X(10000))
This assumes you use Fortran ALLOCATE; memory obtained by other allocation methods, such as mkl_malloc, is not affected by the directive.
Andrew, in the source you provide, there is no way for the compiler to know how big d is. You declared it as one element (better would probably be to use (*)). Assuming you had done so, you could perhaps have written:
d(1:1000) = myComponent%d
but it would be far better to make d an ALLOCATABLE array and allocate it to the proper size. Then everything works out for size and, if you choose, alignment.
I did this for all real and integer variables. I used stack variables rather than allocatables, but I confirmed 16-byte alignment.
I did not align the control variable: type(MKL_PARDISO_HANDLE) :: mklMemoryPointers(64)
I still do not get the same answers run to run for PARDISO with multithreading.
I also copied mklMemoryPointers to a local aligned variable, and still no luck.
Has anybody else got this to work?
I could develop a test program with input data files if that would help.
The MKL PARDISO version is not able to compute the same answers with multithreading. Our new version, PARDISO 4.0 from www.pardiso-project.org, is the only parallel sparse direct solver that has this kind of functionality. Please take a look at the webpage. The option is currently only available for symmetric indefinite matrices, and we hope to add this feature for all matrix types.
Olaf Schenk
It is a complete redesign of the scheduling in the numerical factorization. The aim was to improve threading scalability compared to PARDISO 3.3 and, at the same time, to obtain identical results on multicores.
> And are you calling only sequential MKL BLAS from within your algorithm?
We use sequential MKL BLAS in each thread, because multithreaded MKL BLAS cannot compute identical results.
Here is a quote related to IEEE 754:
Floating-point arithmetic is not associative. This means that, in general, for floating-point numbers x, y, and z:
(x + y) + z != x + (y + z)
(x * y) * z != x * (y * z)
Floating-point arithmetic is also not distributive. This means that, in general:
x * (y + z) != (x * y) + (x * z)
In short, the order in which operations are carried out can change the output of a floating-point calculation. This is important in numerical analysis, since two mathematically equivalent formulas may not produce the same numerical output, and one may be substantially more accurate than the other.
Thus, bit-to-bit identical results can be computed if and only if all operations are executed in a predefined order. That means the compiler must not change this order through any optimizations.
It is also important to say that multithreaded calculations cannot guarantee the order of operations, because they are mostly dynamically scheduled, even if each thread performs its own calculations in order. By the way, OpenMP does not guarantee that a parallel section will use the same team of threads on a second run.
Lastly, I'd add the following statement from "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg, published in the March 1991 issue of Computing Surveys:
"Many programmers like to believe that they can understand the behavior of a program and prove that it will work correctly without reference to the compiler that compiles it or the computer that runs it. In many ways, supporting this belief is a worthwhile goal for the designers of computer systems and programming languages."
In my opinion, instead of requiring identical results from every calculation, each application should try to prove that the algorithm used is stable and that the calculated deviation is validated and estimated, to be sure the results are correct.