<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Ying, in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086398#M22997</link>
    <description>&lt;P&gt;Ying,&lt;/P&gt;

&lt;P&gt;Yesterday we found another case that also fails when using&amp;nbsp;multiple MPI ranks and&amp;nbsp;a single&amp;nbsp;OpenMP thread per rank. It's a larger case than the one provided, but most certainly related to the same bug.&lt;/P&gt;

&lt;P&gt;Initially we thought we could rely on this "multiple MPI ranks, each with only one OpenMP thread" as a workaround. Since this is not the case anymore, we can now only use the solver on a single node, with one MPI rank only (and multiple OpenMP threads). This is obviously&amp;nbsp;a much, much&amp;nbsp;bigger problem for us&amp;nbsp;now since we can't scale anymore beyond one node.&lt;/P&gt;

&lt;P&gt;At this point I would say that for us having this bug fixed ASAP&amp;nbsp;is the highest priority. As mentioned in a private communication with Intel support, we caught this elusive bug just before a product release which has been stopped since then.&lt;/P&gt;</description>
    <pubDate>Thu, 09 Feb 2017 16:53:00 GMT</pubDate>
    <dc:creator>OP1</dc:creator>
    <dc:date>2017-02-09T16:53:00Z</dc:date>
    <item>
      <title>Cluster sparse solver error when using more than 1 MPI process</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086390#M22989</link>
      <description>&lt;P&gt;[Edited title to more accurately reflect the issue at hand]&lt;/P&gt;

&lt;P&gt;Is 20161005 the build number of the latest Intel MKL version? We are still struggling with a bug that was reported last year and that affects the sparse cluster solver when more than one MPI rank is used.&lt;/P&gt;

&lt;P&gt;The release notes for Intel MKL 2017 Update 1 indicate that this bug was fixed, but before we re-open it I'd like to make sure our test (done with 20161005) used the latest and greatest.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Feb 2017 15:46:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086390#M22989</guid>
      <dc:creator>OP1</dc:creator>
      <dc:date>2017-02-02T15:46:35Z</dc:date>
    </item>
    <item>
      <title>On Linux it is 2017.1.132</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086391#M22990</link>
      <description>&lt;P&gt;On Linux it is 2017.1.132&lt;/P&gt;

&lt;P&gt;On Windows it is 2017.1.143&lt;/P&gt;</description>
      <pubDate>Fri, 03 Feb 2017 00:50:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086391#M22990</guid>
      <dc:creator>Jing_Xu</dc:creator>
      <dc:date>2017-02-03T00:50:29Z</dc:date>
    </item>
    <item>
      <title>Hi OP,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086392#M22991</link>
      <description>&lt;P&gt;Hi OP,&lt;/P&gt;

&lt;P&gt;Could you let us know the bug number or a link to the report, so that we can check the fix information?&lt;/P&gt;

&lt;P&gt;Yes, the latest official release is MKL 2017 Update 1. (Update 2 will be released within the month; you should receive a notification automatically.)&lt;/P&gt;

&lt;P&gt;You can check the build date in the mkl_version.h file.&lt;/P&gt;

&lt;P&gt;For example, under Windows:&lt;/P&gt;

&lt;P&gt;#if 0&lt;BR /&gt;
	/*&lt;BR /&gt;
	!&amp;nbsp; Content:&lt;BR /&gt;
	!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Intel(R) Math Kernel Library (MKL) interface&lt;BR /&gt;
	!******************************************************************************/&lt;BR /&gt;
	#endif&lt;/P&gt;

&lt;P&gt;#ifndef _MKL_VERSION_H_&lt;BR /&gt;
	#define _MKL_VERSION_H_&lt;/P&gt;

&lt;P&gt;#define __INTEL_MKL_BUILD_DATE &lt;STRONG&gt;20161006&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;#define __INTEL_MKL__ 2017&lt;BR /&gt;
	#define __INTEL_MKL_MINOR__ 0&lt;BR /&gt;
	#define __INTEL_MKL_UPDATE__ 1&lt;/P&gt;

&lt;P&gt;#define INTEL_MKL_VERSION 20170001&lt;/P&gt;

&lt;P&gt;#endif&lt;/P&gt;

&lt;P&gt;And under Linux:&lt;/P&gt;

&lt;P&gt;#if 0&lt;BR /&gt;
	/*&lt;BR /&gt;
	!&amp;nbsp; Content:&lt;BR /&gt;
	!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Intel(R) Math Kernel Library (MKL) interface&lt;BR /&gt;
	!******************************************************************************/&lt;BR /&gt;
	#endif&lt;/P&gt;

&lt;P&gt;#ifndef _MKL_VERSION_H_&lt;BR /&gt;
	#define _MKL_VERSION_H_&lt;/P&gt;

&lt;P&gt;#define __INTEL_MKL_BUILD_DATE 20161006&lt;/P&gt;

&lt;P&gt;#define __INTEL_MKL__ 2017&lt;BR /&gt;
	#define __INTEL_MKL_MINOR__ 0&lt;BR /&gt;
	#define __INTEL_MKL_UPDATE__ 1&lt;/P&gt;

&lt;P&gt;#define INTEL_MKL_VERSION 20170001&lt;/P&gt;

&lt;P&gt;#endif&lt;/P&gt;

&lt;P&gt;You got the build date from the sparse cluster solver output, right? It differs slightly from the one in the header (20161005 vs. 20161006), but the version should still be MKL 2017 Update 1.&lt;/P&gt;
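
&lt;P&gt;As a rough illustration of the header check described above, the following Python sketch pulls the version macros out of the header text. The function name read_mkl_version and the inline sample are hypothetical; only the macro names are taken from the mkl_version.h listing above.&lt;/P&gt;

```python
import re

# Macro names as they appear in the mkl_version.h listing above.
MACROS = (
    "__INTEL_MKL_BUILD_DATE",
    "__INTEL_MKL__",
    "__INTEL_MKL_MINOR__",
    "__INTEL_MKL_UPDATE__",
)


def read_mkl_version(header_text):
    """Return a dict mapping each known macro found in header_text to its integer value."""
    found = {}
    for name in MACROS:
        # Match lines of the form "#define NAME 1234".
        pattern = r"^\s*#define\s+" + re.escape(name) + r"\s+(\d+)\s*$"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match:
            found[name] = int(match.group(1))
    return found


if __name__ == "__main__":
    # Hypothetical sample; in practice, read the installed mkl_version.h instead.
    sample = """
#define __INTEL_MKL_BUILD_DATE 20161006
#define __INTEL_MKL__ 2017
#define __INTEL_MKL_MINOR__ 0
#define __INTEL_MKL_UPDATE__ 1
"""
    for key, value in read_mkl_version(sample).items():
        print(key, value)
```

&lt;P&gt;Comparing the reported build date with the one printed by the solver is then a plain integer comparison.&lt;/P&gt;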

&lt;P&gt;Best Regards&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Fri, 03 Feb 2017 04:12:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086392#M22991</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2017-02-03T04:12:07Z</dc:date>
    </item>
    <item>
      <title>Here is a reproducer of the</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086393#M22992</link>
      <description>&lt;P&gt;Here is a reproducer of the bug. The sparse cluster solver is used to solve a small Ax=b system. It seems there is an issue with OpenMP threading/MPI interaction.&lt;/P&gt;

&lt;P&gt;The bug only appears once in a while, so it is critical to build a test matrix covering each combination of MPI processes and OpenMP threads, and to repeat the test multiple times for each combination. This bug may be related to DPD200588182, which we reported previously and which was marked as 'fixed' in the release notes here: &lt;A href="https://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-2017-bug-fixes-list"&gt;https://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-2017-bug-fixes-list&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Using MPI_INIT_THREAD() instead of MPI_INIT() helped 'a little' in the sense that the occurrence of the bug is 'reduced'.&lt;/P&gt;

&lt;P&gt;The latest version of MKL is used - see the comment section in the source file below for a description of the commands setting up the environment, as well as the&amp;nbsp;compile and build commands. The two required input files are attached.&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;! The matrix A is real, symmetric positive definite - and only its lower triangle is provided. A right-hand side vector B is
! provided so that the linear system A.X = B can be solved with CLUSTER_SPARSE_SOLVER.
!
! File inputs:
!
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; - The file "file_1.txt" contains the size of the matrix A, its number of nonzero elements, the number of right-hand
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; sides to be analyzed, and the index arrays IA and JA.
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; - The file "file_2.txt" contains the matrix values and the right hand sides.
!
! Bug description:
!
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; - The output "MAXVAL(ABS(X))" should be equal to 1.
!
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; - For some combinations of (1) number of MPI processes (ranks) and (2) number of OpenMP threads per rank, the output
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; "MAXVAL(ABS(X))" becomes incorrect.
!
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; - The incorrect results are not always obtained; in fact, the problem shows up randomly.
!
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; - In order to reproduce the bug, create a test matrix similar to the one below
!
!
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; | Number of
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; | OpenMP threads&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; A "PASS" indicates that the correct result was obtained in 15
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Number of MPI&amp;nbsp;&amp;nbsp; | 1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 9&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; consecutive runs of the code. A "FAIL" indicates that at least
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; processes&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; one run was faulty.
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; |-------------------------------
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; | pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; | pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;gt; FAIL &amp;lt;&amp;nbsp;&amp;nbsp; Note that many of these MPI ranks/OpenMP threads combinations were
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; | pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; oversubscribing the computational resources (cores) at hand.
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 7&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; | pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 16&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; | pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pass
!
!
!
! Compilation/build/run commands (Linux):
!
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
! &amp;gt; module load icc/17.0
! &amp;gt; . /opt/intel/compiler17/compilers_and_libraries_2017.1.132/linux/bin/compilervars.sh intel64
! &amp;gt; . /opt/intel/compiler17/compilers_and_libraries_2017.1.132/linux/mkl/bin/mklvars.sh intel64 mod
! &amp;gt; . /opt/intel/compiler17/compilers_and_libraries_2017.1.132/linux/mpi/bin64/mpivars.sh
! &amp;gt; mpiifort -c $MKLROOT/include/mkl_spblas.f90
! &amp;gt; mpiifort -c $MKLROOT/include/mkl_cluster_sparse_solver.f90
! &amp;gt; mpiifort -c MAIN.f90
! &amp;gt; mpiifort -o TEST *.o -L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_lp64
!&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -liomp5 -lpthread -lm -ldl
! &amp;gt; export OMP_NUM_THREADS=9
! &amp;gt; mpirun -n 2 TEST


PROGRAM MAIN
USE MPI
USE MKL_CLUSTER_SPARSE_SOLVER
USE MKL_SPBLAS

IMPLICIT NONE

INTEGER(KIND=4),PARAMETER&amp;nbsp;&amp;nbsp; :: FILE_UNIT_1 = 1000
INTEGER(KIND=4),PARAMETER&amp;nbsp;&amp;nbsp; :: FILE_UNIT_2 = 2000
INTEGER(KIND=4)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; :: NUM_PROCS,RANK,IERR,MAXFCT,MNUM,MTYPE,PHASE,MSGLVL,IPARM(64),PERM(1),N,NRHS,NNZ,II
INTEGER(KIND=4)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; :: MPI_REQUIRED,MPI_PROVIDED
INTEGER(KIND=4),ALLOCATABLE :: IA(:), JA(:)
REAL(KIND=8),ALLOCATABLE&amp;nbsp;&amp;nbsp;&amp;nbsp; :: A(:),B(:,:),X(:,:)
LOGICAL&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; :: MPI_IS_INITIALIZED
CHARACTER(LEN=128)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; :: MKL_STRING
TYPE(MKL_CLUSTER_SPARSE_SOLVER_HANDLE) :: PT(64)

MPI_REQUIRED = MPI_THREAD_MULTIPLE
CALL MPI_INIT_THREAD(MPI_REQUIRED,MPI_PROVIDED,IERR)&amp;nbsp; 
CALL MPI_COMM_RANK(MPI_COMM_WORLD,RANK,IERR)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NUM_PROCS,IERR)
CALL MPI_BARRIER(MPI_COMM_WORLD, IERR)
CALL MKL_GET_VERSION_STRING(MKL_STRING)
IF (RANK==0) THEN
&amp;nbsp;&amp;nbsp;&amp;nbsp; WRITE(*,*) 'MKL VERSION: ',TRIM(ADJUSTL(MKL_STRING))
&amp;nbsp;&amp;nbsp;&amp;nbsp; WRITE(*,*) 'MPI_REQUIRED = ',MPI_REQUIRED,'; MPI_PROVIDED = ',MPI_PROVIDED
&amp;nbsp;&amp;nbsp;&amp;nbsp; OPEN(FILE_UNIT_1,FILE='file_1.txt',FORM='FORMATTED',STATUS='OLD')
&amp;nbsp;&amp;nbsp;&amp;nbsp; READ(FILE_UNIT_1,*) N
&amp;nbsp;&amp;nbsp;&amp;nbsp; READ(FILE_UNIT_1,*) NNZ
&amp;nbsp;&amp;nbsp;&amp;nbsp; READ(FILE_UNIT_1,*) NRHS
END IF

CALL MPI_BCAST(NRHS,1,MPI_INTEGER,0,MPI_COMM_WORLD,IERR)
CALL MPI_BCAST(NNZ,1,MPI_INTEGER,0,MPI_COMM_WORLD,IERR)
CALL MPI_BCAST(N,1,MPI_INTEGER,0,MPI_COMM_WORLD,IERR)
&amp;nbsp;&amp;nbsp;&amp;nbsp; 
ALLOCATE(IA(N+1),JA(NNZ),A(NNZ),B(N,NRHS),X(N,NRHS))
&amp;nbsp;&amp;nbsp;&amp;nbsp; 
IF (RANK==0) THEN
&amp;nbsp;&amp;nbsp;&amp;nbsp; READ(FILE_UNIT_1,*) IA
&amp;nbsp;&amp;nbsp;&amp;nbsp; READ(FILE_UNIT_1,*) JA
&amp;nbsp;&amp;nbsp;&amp;nbsp; CLOSE(FILE_UNIT_1)
&amp;nbsp;&amp;nbsp;&amp;nbsp; OPEN(UNIT=FILE_UNIT_2,FILE='file_2.txt',FORM='FORMATTED',STATUS='OLD')
&amp;nbsp;&amp;nbsp;&amp;nbsp; READ(FILE_UNIT_2,*) A
&amp;nbsp;&amp;nbsp;&amp;nbsp; READ(FILE_UNIT_2,*) B(:,1)
&amp;nbsp;&amp;nbsp;&amp;nbsp; CLOSE(FILE_UNIT_2)
ENDIF

MAXFCT = 1
MNUM&amp;nbsp;&amp;nbsp; = 1
MTYPE&amp;nbsp; = 2
PERM&amp;nbsp;&amp;nbsp; = 0
MSGLVL = 1
IPARM&amp;nbsp; = 0 ! IPARM(1) = 0 selects the default solver parameters

DO II=1,64
&amp;nbsp;&amp;nbsp;&amp;nbsp; PT(II)%DUMMY = 0
ENDDO

PHASE = 13
CALL CLUSTER_SPARSE_SOLVER(PT,MAXFCT,MNUM,MTYPE,PHASE,N,A,IA,JA,PERM,NRHS,IPARM,MSGLVL,B,X,MPI_COMM_WORLD,IERR)

IF (RANK==0) WRITE(*,*) 'MAXVAL(ABS(X)) = ',MAXVAL(ABS(X))
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
CALL MPI_FINALIZE(IERR)
&amp;nbsp;&amp;nbsp;&amp;nbsp; 
END PROGRAM MAIN
&lt;/PRE&gt;

&lt;P&gt;When execution is successful, the output of MAXVAL(ABS(X)) should be 1.0 as shown below:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;&amp;nbsp;MKL VERSION: 
&amp;nbsp;Intel(R) Math Kernel Library Version 2017.0.1 Product Build 20161005 for Intel(
&amp;nbsp;R) 64 architecture applications
&amp;nbsp;MPI_REQUIRED =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3 ; MPI_PROVIDED =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===

Percentage of computed non-zeros for LL^T factorization
&amp;nbsp;14 %&amp;nbsp; 15 %&amp;nbsp; 55 %&amp;nbsp; 93 %&amp;nbsp; 100 % 

=== PARDISO: solving a symmetric positive definite system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Single-level factorization algorithm is turned ON


Summary: ( starting phase is reordering, ending phase is solution )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.008025 s
Time spent in reordering of the initial matrix (reorder)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.002664 s
Time spent in symbolic factorization (symbfct)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.002441 s
Time spent in data preparations for factorization (parlist)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.000041 s
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.153495 s
Time spent in direct solver at solve step (solve)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.000241 s
Time spent in allocation of internal data structures (malloc)&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.008641 s
Time spent in additional calculations&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.002850 s
Total time spent&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.178398 s

Statistics:
===========
Parallel Direct Factorization is running on 9 OpenMP

&amp;lt; Linear system Ax = b &amp;gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of equations:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1992
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in A:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 58290
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in A (%): 1.468978

&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of right-hand sides:&amp;nbsp;&amp;nbsp;&amp;nbsp; 1

&amp;lt; Factors L and U &amp;gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of columns for each panel: 64
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of independent subgraphs:&amp;nbsp; 0
&amp;lt; Preprocessing with state of the art partitioning metis&amp;gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of supernodes:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 385
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; size of largest supernode:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 231
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in L:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 281152
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in U:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in L+U:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 281153
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; gflop&amp;nbsp;&amp;nbsp; for the numerical factorization: 0.058924

&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; gflop/s for the numerical factorization: 0.383882

&amp;nbsp;MAXVAL(ABS(X)) =&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.00000000000000&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
&lt;/PRE&gt;

&lt;P&gt;When the output is wrong, MAXVAL(ABS(X)) has some random value, as shown here:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;&amp;nbsp;MKL VERSION: 
&amp;nbsp;Intel(R) Math Kernel Library Version 2017.0.1 Product Build 20161005 for Intel(
&amp;nbsp;R) 64 architecture applications
&amp;nbsp;MPI_REQUIRED =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3 ; MPI_PROVIDED =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3

Percentage of computed non-zeros for LL^T factorization
&amp;nbsp;8 %&amp;nbsp; 20 %&amp;nbsp; 21 %&amp;nbsp; 22 %&amp;nbsp; 23 %&amp;nbsp; 24 %&amp;nbsp; 25 %&amp;nbsp; 26 %&amp;nbsp; 27 %&amp;nbsp; 30 %&amp;nbsp; 34 %&amp;nbsp; 39 %&amp;nbsp; 41 %&amp;nbsp; 57 %&amp;nbsp; 58 %&amp;nbsp; 59 %&amp;nbsp; 64 %&amp;nbsp; 66 %&amp;nbsp; 67 %&amp;nbsp; 68 %&amp;nbsp; 69 %&amp;nbsp; 70 %&amp;nbsp; 71 %&amp;nbsp; 72 %&amp;nbsp; 77 %&amp;nbsp; 80 %&amp;nbsp; 88 %&amp;nbsp; 95 %&amp;nbsp; 99 %&amp;nbsp; 100 % 

=== CPARDISO: solving a symmetric positive definite system ===
1-based array indexing is turned ON
CPARDISO double precision computation is turned ON
Minimum degree algorithm at reorder step is turned ON
Single-level factorization algorithm is turned ON


Summary: ( starting phase is reordering, ending phase is solution )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.053535 s
Time spent in reordering of the initial matrix (reorder)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.010733 s
Time spent in symbolic factorization (symbfct)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.005278 s
Time spent in data preparations for factorization (parlist)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.000036 s
Time spent in copying matrix to internal data structure (A to LU): 0.000005 s
Time spent in factorization step (numfct)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.117436 s
Time spent in direct solver at solve step (solve)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.125628 s
Time spent in allocation of internal data structures (malloc)&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.001863 s
Time spent in additional calculations&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.002628 s
Total time spent&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; : 0.317142 s

Statistics:
===========
Parallel Direct Factorization is running on 2 MPI and 9 OpenMP per MPI process

&amp;lt; Linear system Ax = b &amp;gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of equations:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1992
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in A:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 58290
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in A (%): 1.468978

&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of right-hand sides:&amp;nbsp;&amp;nbsp;&amp;nbsp; 1

&amp;lt; Factors L and U &amp;gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of columns for each panel: 64
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of independent subgraphs:&amp;nbsp; 0
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of supernodes:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 385
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; size of largest supernode:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 231
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in L:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 281152
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in U:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; number of non-zeros in L+U:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 281153
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; gflop&amp;nbsp;&amp;nbsp; for the numerical factorization: 0.058924

&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; gflop/s for the numerical factorization: 0.501754

&amp;nbsp;MAXVAL(ABS(X)) =&amp;nbsp;&amp;nbsp;&amp;nbsp; 128.577018444593&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
&lt;/PRE&gt;
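
&lt;P&gt;The pass/fail table in the source comments could be produced with a small driver along the following lines. This is only a sketch: the function names are hypothetical, the executable name ./TEST and the mpirun invocation are taken from the build commands above, and the tolerance used to decide that MAXVAL(ABS(X)) equals 1 is an assumption.&lt;/P&gt;

```python
import math
import os
import re
import subprocess

# Matches the line printed by the reproducer, e.g. " MAXVAL(ABS(X)) =    1.00000000000000".
RESULT_RE = re.compile(r"MAXVAL\(ABS\(X\)\)\s*=\s*(\S+)")


def parse_maxval(output):
    """Extract the MAXVAL(ABS(X)) value from the program output, or None if absent."""
    match = RESULT_RE.search(output)
    if match is None:
        return None
    try:
        return float(match.group(1))
    except ValueError:
        return None


def is_pass(value, tol=1.0e-8):
    """A run passes when the value is present and equal to 1 within tol (assumed tolerance)."""
    return value is not None and math.isclose(value, 1.0, abs_tol=tol)


def run_once(ranks, threads, exe="./TEST"):
    """Run the reproducer once with the given rank and thread counts; return its stdout."""
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    done = subprocess.run(["mpirun", "-n", str(ranks), exe],
                          env=env, capture_output=True, text=True)
    return done.stdout


def run_matrix(rank_counts, thread_counts, repeats=15):
    """Return a dict keyed by (ranks, threads); False means at least one faulty run."""
    results = {}
    for ranks in rank_counts:
        for threads in thread_counts:
            # all() stops at the first faulty run for this combination.
            results[(ranks, threads)] = all(
                is_pass(parse_maxval(run_once(ranks, threads)))
                for _ in range(repeats))
    return results


if __name__ == "__main__":
    for combo, ok in run_matrix([1, 2, 3, 7, 16], [1, 2, 3, 9]).items():
        print(combo, "pass" if ok else "FAIL")
```

&lt;P&gt;Because the failure is intermittent, raising the repeats argument makes it more likely that a faulty combination is caught.&lt;/P&gt;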

</description>
      <pubDate>Fri, 03 Feb 2017 22:44:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086393#M22992</guid>
      <dc:creator>OP1</dc:creator>
      <dc:date>2017-02-03T22:44:16Z</dc:date>
    </item>
    <item>
      <title>In the two outputs posted</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086394#M22993</link>
      <description>&lt;P&gt;In the two outputs posted above, one difference is that the text "Preprocessing with state of the art partitioning metis" does not appear in the 'buggy' run. Not sure if this helps.&lt;/P&gt;</description>
      <pubDate>Fri, 03 Feb 2017 22:49:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086394#M22993</guid>
      <dc:creator>OP1</dc:creator>
      <dc:date>2017-02-03T22:49:17Z</dc:date>
    </item>
    <item>
      <title>Hi OP,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086395#M22994</link>
      <description>&lt;P&gt;Hi OP,&lt;/P&gt;

&lt;P&gt;Thanks for the information.&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I checked CQ DPD200588182; some related forum discussion is at &lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/675380"&gt;https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/675380&lt;/A&gt;.&lt;/P&gt;

&lt;P&gt;Something still seems to be wrong: 2017 Update 1 has a problem on the asd machine, but we can't reproduce it internally.&lt;BR /&gt;
	As you mention here, the incorrect results are not always obtained; the problem shows up randomly.&lt;BR /&gt;
	We will investigate and get back to you with any findings.&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;BR /&gt;
	Ying&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 04 Feb 2017 02:32:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086395#M22994</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2017-02-04T02:32:48Z</dc:date>
    </item>
    <item>
      <title>Hi Ying,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086396#M22995</link>
      <description>&lt;P&gt;Hi Ying,&lt;/P&gt;

&lt;P&gt;The reproducer above may or may not have the same root cause as was identified previously in those other reports. Can you replicate the results that I obtained? It looks like the issue occurs when more than one OpenMP thread is used.&lt;/P&gt;

&lt;P&gt;If you can't manage to reproduce it (as I said, you will need to run the test case multiple times to see it appear), could you indicate what your OMP_xxx, MKL_xxx, KMP_xxx etc. environment variables are?&lt;/P&gt;
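
&lt;P&gt;In case it helps with collecting those settings, here is a minimal sketch that prints every environment variable whose name starts with OMP_, MKL_ or KMP_. The helper name threading_env is hypothetical, and the script only sees the environment of the process it runs in, so under MPI it would presumably have to be run on each rank to be meaningful.&lt;/P&gt;

&lt;PRE&gt;
```python
import os

# Prefixes named in the post; matching is case-sensitive.
PREFIXES = ("OMP_", "MKL_", "KMP_")


def threading_env(environ):
    """Return the variables whose names start with one of PREFIXES, sorted by name."""
    return {name: environ[name] for name in sorted(environ) if name.startswith(PREFIXES)}


if __name__ == "__main__":
    # Print one NAME=value pair per line for easy pasting into a reply.
    for name, value in threading_env(os.environ).items():
        print(f"{name}={value}")
```
&lt;/PRE&gt;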

&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Mon, 06 Feb 2017 17:24:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086396#M22995</guid>
      <dc:creator>OP1</dc:creator>
      <dc:date>2017-02-06T17:24:00Z</dc:date>
    </item>
    <item>
      <title>Hi OP</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086397#M22996</link>
      <description>&lt;P&gt;Hi OP,&lt;BR /&gt;
	&lt;BR /&gt;
	Thank you very much for the test case. We can reproduce the problem. What is your expected time frame for the fix?&lt;BR /&gt;
	&lt;BR /&gt;
	Best Regards,&lt;BR /&gt;
	&lt;BR /&gt;
	Ying&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2017 08:41:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086397#M22996</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2017-02-08T08:41:03Z</dc:date>
    </item>
    <item>
      <title>Ying,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086398#M22997</link>
      <description>&lt;P&gt;Ying,&lt;/P&gt;

&lt;P&gt;Yesterday we found another case that also fails when using&amp;nbsp;multiple MPI ranks and&amp;nbsp;a single&amp;nbsp;OpenMP thread per rank. It's a larger case than the one provided, but most certainly related to the same bug.&lt;/P&gt;

&lt;P&gt;Initially we thought we could rely on this "multiple MPI ranks, each with only one OpenMP thread" configuration as a workaround. Since that no longer holds, we can now only use the solver on a single node, with one MPI rank (and multiple OpenMP threads). This is obviously a much, much bigger problem for us now, since we can no longer scale beyond one node.&lt;/P&gt;

&lt;P&gt;At this point I would say that having this bug fixed ASAP is our highest priority. As mentioned in a private communication with Intel support, we caught this elusive bug just before a product release, which has been on hold since then.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2017 16:53:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086398#M22997</guid>
      <dc:creator>OP1</dc:creator>
      <dc:date>2017-02-09T16:53:00Z</dc:date>
    </item>
    <item>
      <title>Hi OP,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086399#M22998</link>
      <description>&lt;P&gt;Hi OP,&lt;/P&gt;

&lt;P&gt;Thank you for your reply. Understood. We will prepare the fix ASAP and keep you informed of any updates.&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Fri, 10 Feb 2017 08:23:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086399#M22998</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2017-02-10T08:23:13Z</dc:date>
    </item>
    <item>
      <title>Ying - any update regarding</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086400#M22999</link>
      <description>&lt;P&gt;Ying - any update regarding the status of the bug fix? I just tested with 2017 Update 2 and the bug is still there.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Feb 2017 22:12:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086400#M22999</guid>
      <dc:creator>OP1</dc:creator>
      <dc:date>2017-02-23T22:12:10Z</dc:date>
    </item>
    <item>
      <title>I seem to be having a similar</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086401#M23000</link>
      <description>&lt;P&gt;I seem to be having a similar issue with Cluster Sparse Solver. This is for a much larger matrix (dimension of ~1 million). I have a test where three identical linear systems are solved in a row. It always works with 1 MPI rank, but fails using 4 MPI ranks (mvapich2_ib/2.1).&lt;/P&gt;

&lt;P&gt;With MKL 11.2.2.164 and 4 MPI ranks:&lt;/P&gt;

&lt;P&gt;It consistently works (1d-16 residual for ||Ax-b||) for the first two calls to the solver, but fails on the third call (no error is reported, but the residual is bad).&lt;/P&gt;

&lt;P&gt;With MKL 2018 u1 and 4 MPI ranks:&lt;/P&gt;

&lt;P&gt;It fails on the first call to the linear solver with the following error.&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;Fatal error in PMPI_Bcast:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;Invalid count, error stack:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;PMPI_Bcast(1635)..........................: MPI_Bcast(buf=0x2aadbe1c9080, count=224415584, MPI_DOUBLE_COMPLEX, root=0, MPI_COMM_WORLD) failed&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;MPIR_Bcast_impl(1471).....................:&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;MPIR_Bcast_MV2(3041)......................:&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;MPIR_Bcast_intra_MV2(2338)................:&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;MPIR_Bcast_scatter_ring_allgather_MV2(733):&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;scatter_for_bcast_MV2(299)................:&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;MPIC_Recv(412)............................: Negative count, value is -1601980288&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;Thanks.&lt;/P&gt;

&lt;P style="margin-bottom: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; color: rgb(0, 0, 0);"&gt;&lt;SPAN style="font-variant-ligatures: no-common-ligatures"&gt;--James&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 05 Mar 2018 21:17:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-sparse-solver-error-when-using-more-than-1-MPI-process/m-p/1086401#M23000</guid>
      <dc:creator>kestyn__james</dc:creator>
      <dc:date>2018-03-05T21:17:12Z</dc:date>
    </item>
  </channel>
</rss>

