Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

ericcp_dias_ie

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-26-2014
03:35 AM

147 Views

MKL_PARDISO in os x

Hello, I am trying to solve a sparse linear system in fortran 90 using the mkl pardiso solver as in the following code:

do i = 1, 64 iparm(i) = 0 end do error = 0 ! initialize error flag msglvl = 1 ! print statistical information mtype = 13 ! complex unsymmetric !C.. Initiliaze the internal solver memory pointer. This is only !C necessary for the FIRST call of the PARDISO solver. do i = 1, 64 pt( i )%DUMMY = 0 end do maxfct = 1 mnum = 1 nrhs = 1 !C.. Reordering and Symbolic Factorization, This step also allocates !C all memory that is necessary for the factorization phase = 11 ! only reordering and symbolic factorization print*, ' calling sparse solver' CALL PARDISO (pt, maxfct, mnum, mtype, phase, nPardiso, values, rowIndex, columns, & perm, nrhs, iparm, msglvl, b, Ex, error) WRITE(*,*) 'Reordering completed ... '

My code compiles (intel compiler ifort (IFORT) 12.1.3 20120130) without problem but once reached the solver call the program stops and I have to kill it.

The output on my screen is, before the program behave strangely, just

calling sparse solver

I've also tried with different parameters in iparm, as per the examples folder, but without different results.

I also print out the matrix, to be sure is in the correct form.

double complex values:(-1.000000,0.0000000E+00) (1.000000,0.0000000E+00) (0.5000000,0.0000000E+00)

(-1.500000,0.0000000E+00) (1.000000,0.0000000E+00) (-0.2222222,0.0000000E+00)

(-0.1111111,0.0000000E+00) (0.3333333,0.0000000E+00) (1.000000,0.0000000E+00)

integer rowIndex: 1 3 6 9 10

integer columns: 1 2 1 2 3 2 3 4 4

b is imaginary and the system has a solution.

Any suggestion?

1 Solution

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-27-2014
09:05 AM

147 Views

I think that you have errors related to a misunderstanding of the nature of the intrinsic function CMPLX. Because kind(cmplx(1.0d0,1.0d0)) evaluates to 4, whereas kind(cmplx(1.0d0,1.0d0,kind=kind(1d0))) evaluates to 8, the latter of which you probably want, many errors arise. Note, as well, that Pardiso will not work correctly with *default precision* real and complex variables unless you set iparm(28) = 1. You need REAL(8) and COMPLEX(8) variables if you set the default value of zero in iparm(28).

Try the following program to see the implications:

[fortran]

program chkcmplx

implicit none

complex(8) :: i1

i1=cmplx(0d0,1d-1)

write(*,*)i1

i1=cmplx(0d0,1d-1,kind(0d0))

write(*,*)i1

write(*,*)kind(cmplx(1d0,1d0)),kind(cmplx(1.0d0,1.0d0,kind=kind(1d0)))

end [/fortran]

I have attached a zip file with corrected sources. The program built with these sources worked correctly with IFort 11.1.070 and 14.0.2, on Windows

Link Copied

8 Replies

Gennady_F_Intel

Moderator

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-26-2014
05:09 AM

147 Views

into the code you show - all iparm are zero. Would you please give the example which we can compile and check the problem on our side?

please use iparm(27) = 1 to check the input matrix format.

ericcp_dias_ie

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-26-2014
06:18 AM

147 Views

It is a bit long, I've put everything here include 'mkl_pardiso.f90' module d1isoModel use kinddef USE mkl_pardiso implicit none type :: d1isoGrid private real(rk), dimension(:), allocatable :: z real(rk), dimension(:), allocatable :: sigma integer(ik):: n contains procedure, public :: init => initd1isoModel procedure, public :: printOnScreen => printd1isoModelOnScreen procedure, public :: computeEx procedure, public :: deallocate => clean_d1isoGrid end type d1isoGrid contains subroutine initd1isoModel(this,n,z,sigma) class(d1isoGrid) :: this integer(ik) :: n real(rk), dimension(n) :: z real(rk), dimension(n-1) :: sigma allocate(this%z(n)) allocate(this%sigma(n-1)) this%z = z this%sigma = sigma this%n = n end subroutine initd1isoModel subroutine clean_d1isoGrid(this) class(d1isoGrid) :: this if (allocated(this%sigma))then deallocate(this%sigma) end if if (allocated(this%z))then deallocate(this%z) end if print*, 'memory used for d1isoGrid relased ...' end subroutine clean_d1isoGrid function computeEx(this,freq,Ex) result(info) use physicsparameters class(d1isoGrid), intent(in) :: this real(rk), intent(in) :: freq complex(ck), dimension(:), allocatable :: Ex integer :: info ! pardiso header file, parameters and variables, based on the intel example TYPE(MKL_PARDISO_HANDLE), dimension(64) :: pt INTEGER :: iparm(64) INTEGER :: i, idum complex(rk) :: ddum!, waltime1, waltime2 INTEGER :: maxfct, mnum, mtype, phase, nPardiso, nrhs, error, msglvl ! local arguments! real(rk) :: omega real(rk), dimension(this%n -3) :: znodes, C integer :: j ! integer (ik) :: icol, irow integer :: n integer :: elementIndex real(rk) :: zGhostNode complex (ck), dimension ((this%n -3)*3 + 2 + 1) :: values integer , dimension ((this%n -3)*3 + 2 + 1) :: columns integer , dimension (this%n) :: rowIndex complex (ck), dimension(this%n-1) :: b integer, dimension(this%n -1) :: perm n = this%n nPardiso = n-1 ! compute znodes coordiantes do i = 2,n znodes(i-1) = (this%z(i) + this%z(i-1))*0.5_rk end do omega = 2*pi*freq allocate (Ex(this%n -1)) ! columns store respectively Ex, Ey and Ez. b = cmplx(0._rk,0._rk) b(1) = omega*mu0*i1 C = computeC(n,this%z,znodes) !print*, znodes !print*, C zGhostNode = znodes(1) - (znodes(2) - znodes(1)) ! print*, zGhostNode, znodes print*, 'forming the finite-differences sparse matrix and r.h.s. ...' values(1) = -((znodes(1) - zGhostNode)**(-1)) columns(1) = 1 rowIndex(1) = 1 values(2) = ((znodes(1) - zGhostNode)**(-1)) columns(2) = 2 elementIndex = 2 !print*, 'n = ', n do i = 2,n-2 do j = -1,1 elementIndex = elementIndex+1 select case (j) case(-1) values(elementIndex) = C(i-1)*(znodes(i+1)-znodes(i)) !print*, c(i), values(elementIndex) columns(elementIndex) = i+j rowIndex(i) = elementIndex case(0) values(elementIndex) = -C(i-1)*(znodes(i+1)-znodes(i-1)) columns(elementIndex) = i+j case(1) values(elementIndex) = C(i-1)*(znodes(i)-znodes(i-1)) columns(elementIndex) = i+j end select end do end do elementIndex = elementIndex+1 values(elementIndex) = 1._rk columns(elementIndex) = n-1 rowIndex(n-1) = elementIndex rowIndex(n) = size(values)+1 print*, values print*, columns print*, rowIndex ! calling the sparse solver !C.. !C.. Set up PARDISO control parameter !C.. do i = 1, 64 iparm(i) = 0 end do !!$ iparm(1) = 0 ! no solver default !!$ iparm(2) = 2 ! uses openMP !!$ iparm(3) = 0 ! numbers of processors !!$ iparm(4) = 62 ! no iterative-direct algorithm !!$ iparm(5) = 0 ! no user fill-in reducing permutation !!$ iparm(6) = 0 ! =0 solution on the first n compoments of x !!$ iparm(7) = 0 ! not in use !!$ iparm(8) = 0 ! numbers of iterative refinement steps !!$ iparm(9) = 0 ! not in use !!$ iparm(10) = 13 ! perturbe the pivot elements with 1E-13 !!$ iparm(11) = 1 ! use nonsymmetric permutation and scaling MPS !!$ iparm(12) = 0 ! not in use !!$ iparm(13) = 1 ! not in use !!$ iparm(14) = 0 ! Output: number of perturbed pivots !!$ iparm(15) = 0 ! not in use !!$ iparm(16) = 0 ! not in use !!$ iparm(17) = 0 ! not in use !!$ iparm(18) = -1 ! Output: number of nonzeros in the factor LU !!$ iparm(19) = -1 ! Output: Mflops for LU factorization !!$ iparm(20) = 0 ! Output: Numbers of CG Iterations iparm(27) = 1 error = 0 ! initialize error flag msglvl = 1 ! print statistical information mtype = 13 ! complex unsymmetric !C.. Initiliaze the internal solver memory pointer. This is only !C necessary for the FIRST call of the PARDISO solver. do i = 1, 64 pt( i )%DUMMY = 0 end do maxfct = 1 mnum = 1 nrhs = 1 !C.. Reordering and Symbolic Factorization, This step also allocates !C all memory that is necessary for the factorization phase = 11 ! only reordering and symbolic factorization print*, ' calling sparse solver' CALL PARDISO (pt, maxfct, mnum, mtype, phase, nPardiso, values, rowIndex, columns, & perm, nrhs, iparm, msglvl, b, Ex, error) WRITE(*,*) 'Reordering completed ... ' IF (error .NE. 0) THEN WRITE(*,*) 'The following ERROR was detected: ', error STOP END IF WRITE(*,*) 'Number of nonzeros in factors = ',iparm(18) WRITE(*,*) 'Number of factorization MFLOPS = ',iparm(19) !C.. Factorization. phase = 22 ! only factorization CALL pardiso (pt, maxfct, mnum, mtype, phase, nPardiso, values, rowIndex, columns & , perm, nrhs, iparm, msglvl, b, Ex, error) WRITE(*,*) 'Factorization completed ... ' IF (error .NE. 0) THEN WRITE(*,*) 'The following ERROR was detected: ', error STOP ENDIF !C.. Back substitution and iterative refinement iparm(8) = 2 ! max numbers of iterative refinement steps phase = 33 ! only factorization CALL pardiso (pt, maxfct, mnum, mtype, phase, nPardiso, values, rowIndex, columns & , perm, nrhs, iparm, msglvl, b, Ex, error) WRITE(*,*) 'Solve completed ... ' WRITE(*,*) 'The solution of the system is ' DO i = 1, n WRITE(*,*) ' Ex(',i,') = ', Ex(i) END DO !C.. Termination and release of memory phase = -1 ! release internal memory CALL pardiso (pt, maxfct, mnum, mtype, phase, nPardiso, values, rowIndex, columns & , perm, nrhs, iparm, msglvl, b, Ex, error) info = error return end function computeEx function computeC(n,z,znodes) result(C) integer(ik), intent(in) :: n real(rk), dimension(n) :: z real(rk), dimension(n-1) :: znodes real(rk), dimension(2:n-2) :: C integer(ik) :: i !C = 0._rk !print*, 'c =' !print*, C do i = 2,n-2 C(i) = (z(i+1)-z(i))**(-1) * ((znodes(i+1)-znodes(i))*(znodes(i)-znodes(i-1)))**(-1) end do !print*, 'c =' !print*, C return end function computeC end module d1isoModel

and the supporting modules

module kinddef implicit none integer, parameter :: ik = kind(1) integer, parameter :: rk = kind(1.0d0) integer, parameter :: ck = kind(cmplx(1.0d0,1.0d0)) save end module

and

module physicsparameters use kinddef implicit none complex(ck), parameter :: i1 = cmplx(0._rk,1._rk) real(rk), parameter :: pi = acos(-1._rk) real(rk), parameter :: mu0 = 4._rk*pi*1.E-7 save end module physicsparameters

I call everything from main

program hello use kinddef use d1isoModel implicit none type(d1isoGrid) :: myGrid real(rk), dimension(4) :: s real(rk), dimension(5) :: z complex(ck), dimension(:), allocatable :: E integer(ik) :: info z = (/-1., 0., 1., 2., 4./) s = (/0., 0.001, .001, .1/) call myGrid%init(5,z,s) call myGrid%printOnScreen() info = computeEx(myGrid,3._rk,E) print*, 'info = ', info call myGrid%deallocate() deallocate(E) end program

I use the flags

ifort -o test *.o -i8 -openmp -I$MKLROOT/include $MKLROOT/lib/libmkl_intel_ilp64.a $MKLROOT/lib/libmkl_core.a $MKLROOT/lib/libmkl_intel_thread.a -lpthread -l

to link

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-26-2014
09:46 AM

147 Views

Incomplete, can't compile because source code of printd1isoModelOnScreen is not provided.

ericcp_dias_ie

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-26-2014
09:52 AM

147 Views

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-26-2014
06:34 PM

147 Views

I found a couple of array bounds errors in your program. In Function ComputeEx, you need dimension znodes(this%n-1) rather than (this%n-3).

Secondly, I am not familiar with the F95 interface to Pardiso, so I replaced the F95 style calls with F77 calls. The changes can be found in the attached file. With these changes, your program ran to completion, with the solution

[bash]

Ex( 1 ) = (0.000000000000000E+000,-8.290467793798273E-005)

Ex( 2 ) = (0.000000000000000E+000,-5.921762709855910E-005)

Ex( 3 ) = (0.000000000000000E+000,-3.553057625913546E-005)

Ex( 4 ) = (0.000000000000000E+000,0.000000000000000E+000)

[/bash]

I think that if my assessment is correct, you can simplify the program by avoiding complex numbers. The assessment is based on observing that the matrix is real, and the R.H.S. vector b is purely imaginary. Therefore, instead of solving A x = b = i b_{r}, you can solve A y = b_{r}, and then take the solution as x = i y. If you can take advantage of this observation, you may pick up considerable improvement in execution speed.

ericcp_dias_ie

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-27-2014
06:44 AM

147 Views

I had some improvements on my system using the provided file.

At the moment I can compile and run the code without interruption if I link the libraries dynamically.

Nevertheless I obtain a sad set of (0., NaN) result.

ericmacbook:d1iso eric$ ./test forming the finite-differences sparse matrix and r.h.s. ... (-1.000000,0.0000000E+00) (1.000000,0.0000000E+00) (1.000000,0.0000000E+00) (-2.000000,0.0000000E+00) (1.000000,0.0000000E+00) (1.000000,0.0000000E+00) (-1.666667,0.0000000E+00) (0.6666667,0.0000000E+00) (1.000000,0.0000000E+00) 1 2 1 2 3 2 3 4 4 1 3 6 9 10 calling sparse solver === PARDISO: solving a complex nonsymetric system === The local (internal) PARDISO version is : 103911000 1-based array indexing is turned ON PARDISO double precision computation is turned ON METIS algorithm at reorder step is turned ON Scaling is turned ON Matching is turned ON Summary: ( reordering phase ) ================ Times: ====== Time spent in calculations of symmetric matrix portrait (fulladj): 0.000010 s Time spent in reordering of the initial matrix (reorder) : 0.000123 s Time spent in symbolic factorization (symbfct) : 0.000026 s Time spent in allocation of internal data structures (malloc) : 0.000267 s Time spent in additional calculations : 0.000017 s Total time spent : 0.000443 s Statistics: =========== < Parallel Direct Factorization with number of processors: > 1 < Numerical Factorization with Level-3 BLAS performance > < Linear system Ax = b > number of equations: 4 number of non-zeros in A: 9 number of non-zeros in A (): 56.250000 number of right-hand sides: 1 < Factors L and U > number of columns for each panel: 128 number of independent subgraphs: 0 < Preprocessing with state of the art partitioning metis> number of supernodes: 2 size of largest supernode: 2 number of non-zeros in L: 10 number of non-zeros in U: 2 number of non-zeros in L+U: 12 Reordering completed ... Number of nonzeros in factors = 12 Number of factorization MFLOPS = 0 === PARDISO is running in In-Core mode, because iparam(60)=0 === Percentage of computed non-zeros for LL^T factorization 50 % 90 % 100 % === PARDISO: solving a complex nonsymetric system === Single-level factorization algorithm is turned ON Summary: ( factorization phase ) ================ Times: ====== Time spent in copying matrix to internal data structure (A to LU): 0.000000 s Time spent in factorization step (numfct) : 0.001583 s Time spent in allocation of internal data structures (malloc) : 0.000015 s Time spent in additional calculations : 0.000002 s Total time spent : 0.001600 s Statistics: =========== < Parallel Direct Factorization with number of processors: > 1 < Numerical Factorization with Level-3 BLAS performance > < Linear system Ax = b > number of equations: 4 number of non-zeros in A: 9 number of non-zeros in A (): 56.250000 number of right-hand sides: 1 < Factors L and U > number of columns for each panel: 128 number of independent subgraphs: 0 < Preprocessing with state of the art partitioning metis> number of supernodes: 2 size of largest supernode: 2 number of non-zeros in L: 10 number of non-zeros in U: 2 number of non-zeros in L+U: 12 gflop for the numerical factorization: 0.000000 gflop/s for the numerical factorization: 0.000051 Factorization completed ... === PARDISO: solving a complex nonsymetric system === Summary: ( solution phase ) ================ Times: ====== Time spent in direct solver at solve step (solve) : 0.000097 s Time spent in additional calculations : 0.000038 s Total time spent : 0.000135 s Statistics: =========== < Parallel Direct Factorization with number of processors: > 1 < Numerical Factorization with Level-3 BLAS performance > < Linear system Ax = b > number of equations: 4 number of non-zeros in A: 9 number of non-zeros in A (): 56.250000 number of right-hand sides: 1 < Factors L and U > number of columns for each panel: 128 number of independent subgraphs: 0 < Preprocessing with state of the art partitioning metis> number of supernodes: 2 size of largest supernode: 2 number of non-zeros in L: 10 number of non-zeros in U: 2 number of non-zeros in L+U: 12 gflop for the numerical factorization: 0.000000 gflop/s for the numerical factorization: 0.000051 Solve completed ... The solution of the system is Ex( 1 ) = (0.0000000E+00,NaN) Ex( 2 ) = (0.0000000E+00,NaN) Ex( 3 ) = (0.0000000E+00,NaN) Ex( 4 ) = (0.0000000E+00,NaN) info = 0 memory used for d1isoGrid relased ... ericmacbook:d1iso eric$

maybe you know any smart debugging flag I can use to individuate/solve the problem?

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-27-2014
09:05 AM

148 Views

I think that you have errors related to a misunderstanding of the nature of the intrinsic function CMPLX. Because kind(cmplx(1.0d0,1.0d0)) evaluates to 4, whereas kind(cmplx(1.0d0,1.0d0,kind=kind(1d0))) evaluates to 8, the latter of which you probably want, many errors arise. Note, as well, that Pardiso will not work correctly with *default precision* real and complex variables unless you set iparm(28) = 1. You need REAL(8) and COMPLEX(8) variables if you set the default value of zero in iparm(28).

Try the following program to see the implications:

[fortran]

program chkcmplx

implicit none

complex(8) :: i1

i1=cmplx(0d0,1d-1)

write(*,*)i1

i1=cmplx(0d0,1d-1,kind(0d0))

write(*,*)i1

write(*,*)kind(cmplx(1d0,1d0)),kind(cmplx(1.0d0,1.0d0,kind=kind(1d0)))

end [/fortran]

I have attached a zip file with corrected sources. The program built with these sources worked correctly with IFort 11.1.070 and 14.0.2, on Windows

ericcp_dias_ie

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-27-2014
09:34 AM

147 Views

oh, that was enlightening!

Thanks a million for your kind help.

As per the real vs complex you are right again, but both the matrix and the rhs doom is to become complex, so I will not be able to separate them again. If i'll be possible to solve independently for real and imaginary part, I'll keep your tip in mind.

Thanks again.

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

For more complete information about compiler optimizations, see our Optimization Notice.