Hi all,
I suspect this is a problem with my code, but I can't figure out what the issue is...
I've been using similar code for a while now, and it worked fine until I installed 16.0; now it hangs in the call to Pardiso with phase 33. Basically I import a matrix, factorize it, then solve for many right-hand sides in an OMP parallel loop. Because of the layout of my main program (I provide a much stripped-down version here, as well as a data file), the same temporary array is used by each thread, and the solution vector overwrites the right-hand side. I'm unsure whether Pardiso actually uses this temporary array, but in case it does I place the call to Pardiso within an OMP critical section to prevent multiple threads from accessing it at the same time.
I compile with parallel OMP (/Qopenmp) and parallel MKL (/Qmkl:parallel) flags, otherwise everything is as Visual Studio sets it by default.
If OMP is set to one thread, it runs fine. If I comment out the OMP parallel directive on the loop, it also seems to run fine. However, on my 8-core machine (4 physical cores with hyper-threading), running in parallel: no joy.
If anyone has any suggestions/thoughts/solutions it would be appreciated,
Thanks,
Michael
include 'mkl_pardiso.f90'

program TestPardiso2
    use OMP_LIB
    use MKL_PARDISO
    implicit none

    double precision, allocatable :: A(:), toSolve(:,:)
    integer, allocatable :: ia(:), ja(:)
    integer :: N
    integer, parameter :: numberToSolve = 20

    ! Read matrix from file
    call GetMatrixFromFile("Output.txt", N, ia, ja, A)

    ! Invent some data to solve
    allocate(toSolve(N, numberToSolve))
    call RANDOM_NUMBER(toSolve)

    ! Solve using Pardiso
    call SolveWithPardiso(N, ia, ja, A, numberToSolve, toSolve)

contains

    subroutine GetMatrixFromFile(name, N, ia, ja, A)
        character(len=*), intent(in) :: name
        double precision, intent(out), allocatable :: A(:)
        integer, intent(out), allocatable :: ia(:), ja(:)
        integer, intent(out) :: N

        character(len=255) :: buffer
        integer :: nnz, status, iVal, prevIval, cnt

        ! Open file
        open(UNIT=21, FILE=name, STATUS="OLD", IOSTAT=status)

        ! Get matrix size from first lines of file
        read(UNIT=21, FMT='(A)') buffer        ! Ignore title
        read(UNIT=21, FMT='(A5, I)') buffer, N
        read(UNIT=21, FMT='(A7, I)') buffer, nnz
        read(UNIT=21, FMT='(A)') buffer        ! Ignore column heading
        read(UNIT=21, FMT='(A)') buffer        ! Ignore space

        ! Allocate matrix - assume no errors!
        allocate(ia(N + 1), ja(nnz), A(nnz))

        ! Loop thru file. Assume only end-of-file error will occur
        status = 0
        cnt = 0
        prevIval = 0
        do while (status == 0)
            cnt = cnt + 1
            read(UNIT=21, FMT='(X, I, X, I, X, E)', IOSTAT=status) iVal, ja(cnt), A(cnt)
            if (iVal /= prevIval) then
                ia(iVal) = cnt
                prevIval = iVal
            end if
        end do
        ia(N + 1) = nnz + 1
    end subroutine GetMatrixFromFile

    subroutine SolveWithPardiso(N, ia, ja, A, numberToSolve, toSolve)
        integer, intent(in) :: N, numberToSolve
        integer, intent(in) :: ia(:), ja(:)
        double precision, intent(in) :: A(:)
        double precision, intent(inout) :: toSolve(:,:)

        type(MKL_PARDISO_HANDLE) :: pt(64)
        integer :: param(64), perm(N), error, i
        double precision :: tmpArray(N)

        ! Initialize Pardiso options
        call pardisoinit(pt, 2, param)
        param( 6) = 1   ! Solver stores the solution in the right-hand side b
        param(27) = 1   ! Check input data

        ! call omp_set_num_threads(1)   ! Uncommenting this line works

        ! Reorder and factorize (phase 12)
        call pardiso(pt, 1, 1, 2, 12, N, A, ia, ja, perm, 1, param, 1, &
                     toSolve(:, 1), tmpArray, error)

        !$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(i)
        DO i = 1, numberToSolve
            WRITE(*,*) (omp_get_thread_num() + 1), " : at critical"
            !$OMP CRITICAL (criticalPardisoSection2109)
            WRITE(*,*) (omp_get_thread_num() + 1), " : solving"
            ! Solve (phase 33)
            call pardiso(pt, 1, 1, 2, 33, N, A, ia, ja, perm, 1, param, 1, &
                         toSolve(:, i), tmpArray, error)
            WRITE(*,*) (omp_get_thread_num() + 1), " : complete"
            !$OMP END CRITICAL (criticalPardisoSection2109)
        END DO
        !$OMP END PARALLEL DO
    end subroutine SolveWithPardiso

end program TestPardiso2
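For reference, one variant I can think of would give each thread its own private copy of the temporary array, so the critical section only serializes access to the shared Pardiso handle rather than also protecting tmpArray. This is only a sketch, not something I have verified; whether Pardiso needs the same scratch array across calls, and whether concurrent use of the handle is safe at all, is really the open question:

```fortran
! Sketch only: same solve loop, but each thread gets its own copy of the
! temporary array via PRIVATE. The critical section is kept, and now
! guards only the shared Pardiso handle pt.
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(i, tmpArray, error)
DO i = 1, numberToSolve
    !$OMP CRITICAL (criticalPardisoSection2109)
    call pardiso(pt, 1, 1, 2, 33, N, A, ia, ja, perm, 1, param, 1, &
                 toSolve(:, i), tmpArray, error)
    !$OMP END CRITICAL (criticalPardisoSection2109)
END DO
!$OMP END PARALLEL DO
```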
Michael, I couldn't reproduce the issue. My environment: Win8, 64-bit, threaded linking, 4 threads, MKL version 11.3.0. The log file is attached.
Michael, sorry, something went wrong with the attachment. Please see part of the log below.
all 20 iterations were completed successfully:
Intel(R) Math Kernel Library Version 11.3.0 Product Build 20150730 for Intel(R) 64 architecture applications
1 : at critical
1 : solving, num of iteration == 1
1 : complete
1 : at critical
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
..................................
.................................
solving, num of iteration == 20
1 : complete
==== PASSED ====
number of non-zeros in U: 1
number of non-zeros in L+U: 1078366
gflop for the numerical factorization: 0.189263
gflop/s for the numerical factorization: 9.107600
=== PARDISO: solving a symmetric positive definite system ===
Summary: ( solution phase )
================
Times:
======
Time spent in direct solver at solve step (solve) : 0.003051 s
Time spent in additional calculations : 0.000647 s
Total time spent : 0.003698 s
Statistics:
===========
Parallel Direct Factorization is running on 2 OpenMP
< Linear system Ax = b >
number of equations: 14190
number of non-zeros in A: 167808
number of non-zeros in A (%): 0.083339
number of right-hand sides: 1
< Factors L and U >
number of columns for each panel: 80
number of independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
number of supernodes: 3594
size of largest supernode: 370
number of non-zeros in L: 1078365
number of non-zeros in U: 1
number of non-zeros in L+U: 1078366
gflop for the numerical factorization: 0.189263
gflop/s for the numerical factorization: 9.107600
Hi Gennady,
Thanks for taking the time to look into this. Can you confirm the OMP directives are being honoured (compiling with /Qmkl:parallel)? I would expect to see output from the other threads as they block at the critical section, something like:
1 : at critical
2 : at critical
3 : at critical
In your output I only see thread 1.
Thanks again,
Michael