Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

pardiso_setenv crashes pardiso

xian-zhong_guous_cd-
632 Views

I have a simple example as follows:

 int n = 5;
 std::vector<int> ia(6);
 for (unsigned i = 0; i < ia.size(); i++) ia[i] = i;
 std::vector<int> ja(5);
 for (unsigned i = 0; i < ja.size(); i++) ja[i] = i;

 std::vector<double> a(5, 1.);

If I call pardiso_setenv to set PARDISO_OOC_FILE_NAME, it crashes during the numerical factorization. However, if I use a different example,

int n = 8;
  int nrhs = 2;
  int _ia[9] = { 1, 5, 8, 10, 12, 15, 17, 18, 19};
  int _ja[18] =
  { 1,    3,       6, 7,
      2, 3,    5,
      3,             8,
      4,       7,
      5, 6, 7,
      6,    8,
      7,
      8
  };
  for(int i = 0; i < 9; i++) _ia[i]--;
  for(int i = 0; i < 18; i++) _ja[i]--;
  double _a[18] =
  { 7.0,      1.0,           2.0, 7.0,
      4.0, 8.0,      2.0,
      1.0,                     5.0,
      7.0,           9.0,
      5.0, 1.0, 5.0,
      1.0,      5.0,
      11.0,
      5.0
  };

std::vector<int> ia (_ia, _ia + sizeof(_ia) / sizeof(int) );
  std::vector<int> ja (_ja, _ja + sizeof(_ja) / sizeof(int) );
  std::vector<double> a (_a, _a + sizeof(_a) / sizeof(double) );

It works fine. Here is how I call pardiso:

for (int i = 0; i < 64; i++) {
    iparm[i] = 0;
  }
  iparm[0] = 1; /* No solver default */
  iparm[1] = 2; /* Fill-in reordering from METIS */
  /* Numbers of processors, value of OMP_NUM_THREADS */
  iparm[2] = 1;
  iparm[3] = 0; /* No iterative-direct algorithm */
  iparm[4] = 0; /* No user fill-in reducing permutation */
  iparm[5] = 0; /* Write solution into x */
  iparm[6] = 0; /* Not in use */
  //iparm[7] = 0; /* Max numbers of iterative refinement steps */
  iparm[8] = 0; /* Not in use */
  iparm[9] = 13; /* Perturb the pivot elements with 1E-13 */
  iparm[10] = 1; /* Use nonsymmetric permutation and scaling MPS */
  iparm[11] = 0; /* Not in use */
  iparm[12] = 1; /* Maximum weighted matching algorithm is switched-on (default for non-symmetric) */
  iparm[13] = 0; /* Output: Number of perturbed pivots */
  iparm[14] = 0; /* Not in use */
  iparm[15] = 0; /* Not in use */
  iparm[16] = 0; /* Not in use */
  iparm[17] = -1; /* Output: Number of nonzeros in the factor LU */
  iparm[18] = -1; /* Output: Mflops for LU factorization */
  iparm[19] = 0; /* Output: Numbers of CG Iterations */
  iparm[27] = 0;
  iparm[34] = 1;
  iparm[59] = 1; // 0: in-core; 1: in-core first; then OOC if not enough memory; 2: ooc

  maxfct = 1; /* Maximum number of numerical factorizations. */
  mnum = 1; /* Which factorization to use. */
  error = 0; /* Initialize error flag */

for (int i = 0; i < 64; i++) {
    pt[i] = 0;
  }

std::string fname = this->tmpDir + "/" + ooc_prefix;
  PARDISO_ENV_PARAM param = PARDISO_OOC_FILE_NAME;
  pardiso_setenv(pt, &param, fname.c_str());

phase = 11;

callPARDISO(pt, &maxfct, &mnum, &mtype, &phase,
      &n, (AT*)&a[0], (int*)&ia[0], (int*)&ja[0], &idum, &idum,
      iparm, &mkl_msglvl, &ddum, &ddum, &error);

phase = 22;
  callPARDISO (pt, &maxfct, &mnum, &mtype, &phase,
      &n, (AT*)&a[0], (int*)&ia[0], (int*)&ja[0], &idum, &idum,
      iparm, &mkl_msglvl, &ddum, &ddum, &error);
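
The decrement loops in the second example turn Fortran-style (one-based) index arrays into the C-style (zero-based) form that iparm[34] = 1 asks for. A minimal sketch of that conversion as a reusable function follows; csr_to_zero_based is a hypothetical helper name, not part of MKL, and it assumes plain int indices as in the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an MKL function): convert one-based CSR index
 * arrays to zero-based, in place.
 *   n  : number of rows; ia holds n + 1 row pointers
 *   ia : row pointers, expected to start at 1 on entry
 *   ja : column indices, ia[n] - 1 entries */
void csr_to_zero_based(int n, int *ia, int *ja)
{
    assert(n >= 0 && ia != NULL && ia[0] == 1);

    /* Read the number of stored entries before ia is modified. */
    int nnz = ia[n] - 1;
    assert(nnz == 0 || ja != NULL);

    for (int i = 0; i <= n; i++) ia[i]--;   /* n + 1 row pointers */
    for (int k = 0; k < nnz; k++) ja[k]--;  /* nnz column indices */
}
```

Taking the entry count from ia[n] rather than hard-coding 9 and 18 keeps the loops in step with the arrays if the matrix changes.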

By the way, I tried to modify the MKL example pardiso_sym_0_based.c to reproduce this problem, but got compile errors:

./source/pardiso_sym_0_based.c: In function ‘main’:                                                                              
./source/pardiso_sym_0_based.c:120: error: ‘PARDISO_ENV_PARAM’ undeclared (first use in this function)                           
./source/pardiso_sym_0_based.c:120: error: (Each undeclared identifier is reported only once                                     
./source/pardiso_sym_0_based.c:120: error: for each function it appears in.)                                                     
./source/pardiso_sym_0_based.c:120: error: expected ‘;’ before ‘param’                                                           
./source/pardiso_sym_0_based.c:121: error: stray ‘@’ in program                                                                  
./source/pardiso_sym_0_based.c:121: error: ‘param’ undeclared (first use in this function)

As a workaround, I set the OOC file name using the configuration file.

I am using MKL 11.1.0

Major version: 11
Minor version: 1
Update version: 0
Product status:  Product
Build: n20130711

12 Replies
xian-zhong_guous_cd-

I am using gnu compiler:

make sointel64 function=pardiso_sym_0_based compiler=gnu

----- Compiling gnu_lp64_parallel_intel64_so ----- pardiso_sym_0_based
gcc -m64  -w -I"../../include" \
        ./source/pardiso_sym_0_based.c  \
        -L"../../lib/intel64" -lmkl_intel_lp64 \
        -lmkl_gnu_thread \
        -lmkl_core \
         -L"../../../compiler/lib/intel64" -liomp5 -lpthread -lm -ldl -o _results/gnu_lp64_parallel_intel64_so/pardiso_sym_0_based.out
./source/pardiso_sym_0_based.c: In function ‘main’:
./source/pardiso_sym_0_based.c:110: error: ‘PARDISO_ENV_PARAM’ undeclared (first use in this function)
./source/pardiso_sym_0_based.c:110: error: (Each undeclared identifier is reported only once
./source/pardiso_sym_0_based.c:110: error: for each function it appears in.)
./source/pardiso_sym_0_based.c:110: error: expected ‘;’ before ‘param’
./source/pardiso_sym_0_based.c:111: error: ‘param’ undeclared (first use in this function)
./source/pardiso_sym_0_based.c:111: error: ‘fname’ undeclared (first use in this function)
make[1]: *** [pardiso_sym_0_based] Error 1
make: *** [sointel64] Error 2

 

xian-zhong_guous_cd-

here is the trace back:

SIGSEGV: memory access exception
Command: StepSimulation
   Recoverability: Non-recoverable
   ServerStack: [
libStarNeo.so: SignalHandler::signalHandlerFunction(int, siginfo*, void*),
libpthread.so.0(),
libmkl_core.so(mkl_pds_lp64_check_precision_c+0x46),
libmkl_core.so(mkl_pds_lp64_pardiso+0x83),
libDirectSolver.so: MKLPARDISO<double>::numericalFact(),

mecej4
Honored Contributor III

Let's establish a basic fact. You took a standard MKL example source code that works correctly, and modified it. The modified source code caused the compiler to emit error messages. You have shown many disjointed pieces of code, but did not state exactly what the modifications to the file pardiso_sym_0_based.c were. Nevertheless, you report error messages and ask for a fix. This is not a reasonable thing to do without spending considerable effort, to say the least, and, quite possibly, impossible.

This is what I suggest: describe the modifications precisely and completely, or attach the modified source file.

 

xian-zhong_guous_cd-

I did want to reproduce the bug using your example, but it gives me compile errors (see above and below). The only way I can reproduce the bug is with my own integration, given above.

make sointel64 function=pardiso_sym_0_based compiler=gnu

----- Compiling gnu_lp64_parallel_intel64_so ----- pardiso_sym_0_based
gcc -m64  -w -I"../../include" \
        ./source/pardiso_sym_0_based.c  \
        -L"../../lib/intel64" -lmkl_intel_lp64 \
        -lmkl_gnu_thread \
        -lmkl_core \
         -L"../../../compiler/lib/intel64" -liomp5 -lpthread -lm -ldl -o _results/gnu_lp64_parallel_intel64_so/pardiso_sym_0_based.out
./source/pardiso_sym_0_based.c: In function ‘main’:
./source/pardiso_sym_0_based.c:110: error: ‘PARDISO_ENV_PARAM’ undeclared (first use in this function)
./source/pardiso_sym_0_based.c:110: error: (Each undeclared identifier is reported only once
./source/pardiso_sym_0_based.c:110: error: for each function it appears in.)
./source/pardiso_sym_0_based.c:110: error: expected ‘;’ before ‘param’
./source/pardiso_sym_0_based.c:111: error: ‘param’ undeclared (first use in this function)
./source/pardiso_sym_0_based.c:111: error: ‘fname’ undeclared (first use in this function)
make[1]: *** [pardiso_sym_0_based] Error 1
make: *** [sointel64] Error 2

xian-zhong_guous_cd-

If you tell me how to fix the compile errors in your example, I will try to reproduce the bug with it.

mecej4
Honored Contributor III

Inside a C source file, instead of

PARDISO_ENV_PARAM param = PARDISO_OOC_FILE_NAME;

you should write

enum PARDISO_ENV_PARAM penv = PARDISO_OOC_FILE_NAME;

However, since the example pardiso_sym_0_based.c does not do any out of core solution, calling pardiso_setenv() does not do anything significant.
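
The reason the extra keyword is needed can be shown without MKL. In C, an enum tag is not a type name on its own unless the header also provides a typedef, so the declaration has to spell out "enum"; C++ accepts the bare tag, which is presumably why the original C++ integration compiled. The sketch below assumes mkl_pardiso.h declares PARDISO_ENV_PARAM as a bare enum tag (the fix above implies this). DEMO_ENV_PARAM, demo_setenv and demo_configure are hypothetical stand-ins, not MKL names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a header that declares only an enum tag,
 * with no accompanying typedef. */
enum DEMO_ENV_PARAM { DEMO_OOC_FILE_NAME = 1 };

/* Hypothetical setter that, like the call in the thread, takes the
 * parameter by pointer. Returns 0 for a known parameter with a
 * non-empty value, -1 otherwise. */
int demo_setenv(const enum DEMO_ENV_PARAM *param, const char *value)
{
    assert(param != NULL && value != NULL);
    return (*param == DEMO_OOC_FILE_NAME && strlen(value) > 0) ? 0 : -1;
}

int demo_configure(const char *path)
{
    /* "enum" is required here in C. Writing
     *     DEMO_ENV_PARAM param = DEMO_OOC_FILE_NAME;
     * is accepted by a C++ compiler but rejected by a C compiler with
     * "'DEMO_ENV_PARAM' undeclared". */
    enum DEMO_ENV_PARAM param = DEMO_OOC_FILE_NAME;
    return demo_setenv(&param, path);
}
```

Because the "enum TAG" spelling is valid in both languages, it is the safer form for code that may be built as either C or C++.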

xian-zhong_guous_cd-

Attached please find the modified pardiso_sym_0_based.c, which reproduces the crash.

mecej4
Honored Contributor III

Did you forget to attach the file? If you have problems doing so, you can also paste short source code inline, using the "{....}/code" button in the toolbar and selecting "C++".

xian-zhong_guous_cd-
/*
********************************************************************************
*   Copyright(C) 2004-2014 Intel Corporation. All Rights Reserved.
*   
*   The source code, information  and  material ("Material") contained herein is
*   owned  by Intel Corporation or its suppliers or licensors, and title to such
*   Material remains  with Intel Corporation  or its suppliers or licensors. The
*   Material  contains proprietary information  of  Intel or  its  suppliers and
*   licensors. The  Material is protected by worldwide copyright laws and treaty
*   provisions. No  part  of  the  Material  may  be  used,  copied, reproduced,
*   modified, published, uploaded, posted, transmitted, distributed or disclosed
*   in any way  without Intel's  prior  express written  permission. No  license
*   under  any patent, copyright  or  other intellectual property rights  in the
*   Material  is  granted  to  or  conferred  upon  you,  either  expressly,  by
*   implication, inducement,  estoppel or  otherwise.  Any  license  under  such
*   intellectual  property  rights must  be express  and  approved  by  Intel in
*   writing.
*   
*   *Third Party trademarks are the property of their respective owners.
*   
*   Unless otherwise  agreed  by Intel  in writing, you may not remove  or alter
*   this  notice or  any other notice embedded  in Materials by Intel or Intel's
*   suppliers or licensors in any way.
*
********************************************************************************
*   Content : MKL PARDISO C example
*
********************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#include "/u/xeons46/people/xian/intel/composer_xe_2013_sp1.3.174/mkl/include/mkl_pardiso.h"
#include "mkl_types.h"
#include "mkl.h"

MKL_INT main (void)
{
    /* Matrix data. */
/* my example */
    MKL_INT n = 5;
    MKL_INT ia[6] = { 0, 1, 2, 3, 4, 5};
    MKL_INT ja[5] = { 0, 1, 2, 3, 4};
    double a[5] = {1.0, 1.0, 1.0, 1.0, 1.0};
    double b[5], x[5];
/* end of my example */
/* original example */
/*
    MKL_INT n = 8;
    MKL_INT ia[9] = { 0, 4, 7, 9, 11, 14, 16, 17, 18};
    MKL_INT ja[18] =
    { 0,   2,       5, 6,
        1, 2,    4,
           2,             7,
              3,       6,
                 4, 5, 6,
                    5,    7,
                       6,
                          7
    };
    double a[18] =
    { 7.0,      1.0,           2.0, 7.0,
          -4.0, 8.0,      2.0,
                1.0,                     5.0,
                     7.0,           9.0,
                          5.0, 1.0, 5.0,
                              -1.0,      5.0,
                                   11.0,
                                         5.0
    };
    double b[8], x[8];
*/
/* end of original example */

    MKL_INT mtype = -2;       /* Real symmetric matrix */
    /* RHS and solution vectors. */
    MKL_INT nrhs = 1;     /* Number of right hand sides. */
    /* Internal solver memory pointer pt, */
    /* 32-bit: int pt[64]; 64-bit: long int pt[64] */
    /* or void *pt[64] should be OK on both architectures */
    void *pt[64];
    /* Pardiso control parameters. */
    MKL_INT iparm[64];
    MKL_INT maxfct, mnum, phase, error, msglvl;
    /* Auxiliary variables. */
    MKL_INT i;
    double ddum;          /* Double dummy */
    MKL_INT idum;         /* Integer dummy. */
/* -------------------------------------*/
/* .. Setup Pardiso control parameters. */
/* -------------------------------------*/
    for ( i = 0; i < 64; i++ )
    {
        iparm[i] = 0;
    }
    iparm[0] = 1;         /* No solver default */
    iparm[1] = 2;         /* Fill-in reordering from METIS */
    iparm[3] = 0;         /* No iterative-direct algorithm */
    iparm[4] = 0;         /* No user fill-in reducing permutation */
    iparm[5] = 0;         /* Write solution into x */
    iparm[7] = 2;         /* Max numbers of iterative refinement steps */
    iparm[9] = 13;        /* Perturb the pivot elements with 1E-13 */
    iparm[10] = 1;        /* Use nonsymmetric permutation and scaling MPS */
    iparm[12] = 0;        /* Maximum weighted matching algorithm is switched-off (default for symmetric). Try iparm[12] = 1 in case of inappropriate accuracy */
    iparm[13] = 0;        /* Output: Number of perturbed pivots */
    iparm[17] = -1;       /* Output: Number of nonzeros in the factor LU */
    iparm[18] = -1;       /* Output: Mflops for LU factorization */
    iparm[19] = 0;        /* Output: Numbers of CG Iterations */
    iparm[34] = 1;        /* PARDISO use C-style indexing for ia and ja arrays */
    maxfct = 1;           /* Maximum number of numerical factorizations. */
    mnum = 1;         /* Which factorization to use. */
    msglvl = 1;           /* Print statistical information in file */
    error = 0;            /* Initialize error flag */
/* ----------------------------------------------------------------*/
/* .. Initialize the internal solver memory pointer. This is only  */
/*   necessary for the FIRST call of the PARDISO solver.           */
/* ----------------------------------------------------------------*/
    for ( i = 0; i < 64; i++ )
    {
        pt[i] = 0;
    }
    enum PARDISO_ENV_PARAM penv = PARDISO_OOC_FILE_NAME;
    PARDISO_SETENV(pt, &penv, "/OOC");
/* --------------------------------------------------------------------*/
/* .. Reordering and Symbolic Factorization. This step also allocates  */
/*    all memory that is necessary for the factorization.              */
/* --------------------------------------------------------------------*/
    phase = 11;
    PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, &ddum, &ddum, &error);
    if ( error != 0 )
    {
        printf ("\nERROR during symbolic factorization: %d", error);
        exit (1);
    }
    printf ("\nReordering completed ... ");
    printf ("\nNumber of nonzeros in factors = %d", iparm[17]);
    printf ("\nNumber of factorization MFLOPS = %d", iparm[18]);
/* ----------------------------*/
/* .. Numerical factorization. */
/* ----------------------------*/
    phase = 22;
    PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, &ddum, &ddum, &error);
    if ( error != 0 )
    {
        printf ("\nERROR during numerical factorization: %d", error);
        exit (2);
    }
    printf ("\nFactorization completed ... ");
/* -----------------------------------------------*/
/* .. Back substitution and iterative refinement. */
/* -----------------------------------------------*/
    phase = 33;
    iparm[7] = 2;         /* Max numbers of iterative refinement steps. */
    /* Set right hand side to one. */
    for ( i = 0; i < n; i++ )
    {
        b[i] = 1;
    }
    PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, a, ia, ja, &idum, &nrhs, iparm, &msglvl, b, x, &error);
    if ( error != 0 )
    {
        printf ("\nERROR during solution: %d", error);
        exit (3);
    }
    printf ("\nSolve completed ... ");
    printf ("\nThe solution of the system is: ");
    for ( i = 0; i < n; i++ )
    {
        printf ("\n x [%d] = % f", i, x[i]);
    }
    printf ("\n");
/* --------------------------------------*/
/* .. Termination and release of memory. */
/* --------------------------------------*/
    phase = -1;           /* Release internal memory. */
    PARDISO (pt, &maxfct, &mnum, &mtype, &phase,
             &n, &ddum, ia, ja, &idum, &nrhs,
             iparm, &msglvl, &ddum, &ddum, &error);
    return 0;
}

 

mecej4
Honored Contributor III

OK, now I can see the error appearing even on Windows with the 14.0.2.176 Icl and the associated MKL 11.1.3 (IA32 and X64). I think that the combination that causes the access violation to occur is (i) a diagonal matrix, and (ii) a call to pardiso_setenv before the first call to pardiso().

The program in #10 uses a OOC file path of /OOC, but changing this to a file name in one of the user's directories where there is no access problem still causes the access violation to occur.
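
Condition (i) can be tested mechanically on the CSR arrays before the solver is called, which may help when narrowing down whether a given input falls into the failing case. The following is only a sketch: csr_is_diagonal is a hypothetical helper, not an MKL routine, and it checks the stored structure only, so a matrix with explicitly stored off-diagonal zeros is reported as non-diagonal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an MKL function): return 1 if the n-by-n CSR
 * matrix stores exactly one entry per row and that entry lies on the
 * diagonal; return 0 otherwise.
 *   base : 0 for C-style indices, 1 for Fortran-style indices */
int csr_is_diagonal(int n, const int *ia, const int *ja, int base)
{
    assert(n >= 0 && ia != NULL && ja != NULL);
    assert(base == 0 || base == 1);

    for (int i = 0; i < n; i++) {
        int start = ia[i] - base;      /* first stored entry of row i */
        int end = ia[i + 1] - base;    /* one past the last entry */
        if (end - start != 1) return 0;       /* not exactly one entry */
        if (ja[start] - base != i) return 0;  /* entry is off-diagonal */
    }
    return 1;
}
```

With the arrays from the program in #10, the 5 x 5 example passes this test and the original 8 x 8 example does not, which matches the two behaviours reported in this thread.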

Ying_H_Intel
Employee

Hi xian-zhong.guous.cd-adapco.com, mecej4,

Thanks very much for the discussion. I can reproduce the problem. It looks like we have a memory corruption in the diagonal-matrix computation. The issue has been escalated to our developers; I will keep you updated if there is any news.

Thanks

Ying

Ying_H_Intel
Employee

Dear all, 

I heard from our developer that the issue has been fixed. The fix will be in MKL 11.2.1, which is targeted for release around November or December. You are welcome to try it then and let us know if you run into any issues.

Thanks

Ying 

 
