Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Problem about p?geevx

StevenZhang233
Beginner

Hi,

I am using the routine pdgeevx_ to compute the eigenpairs of a non-symmetric real matrix. It returns the eigenvalues, but some of the eigenvector entries are "-nan". I am not sure whether the NaNs come from my inputs or from somewhere else.

Here are my test files (blacs.h, main.c) and their output files (eigGoodM.txt, eigVec_ZBCYCLIC_Q_GoodM.txt). The test needs 4 MPI processes to run. The makefile I used to compile it is also attached (makefile.txt).

I believe I am using version 2024.0 of MKL (/opt/intel/oneapi/mkl/2024.0/bin/mkl_link_tool). The compiler is gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44).

Could you please have a look and give me some suggestions?

Thank you,

Steven

6 Replies
Mark_L_Intel
Employee

Hello Steven,

The following debug code, placed after the call to pdgeevx_:

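/* Needs <math.h> for isnan() and <stdio.h> for printf(). */
/* Only the first 4 local entries are checked here; use
   i < m_loc*n_loc instead to scan the whole local block. */
for (int i = 0; i < 4; i++) {
  if (isnan(Z_BLCYC[i])) {
     printf("rank %d, Error: nan value found in Z_BLCYC at index %d\n", rank, i);
  }
}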

reports NaN values only from ranks 2 and 3 (not from ranks 0 and 1) when run with "mpirun -np 4 ./main":

rank 2, Error: nan value found in Z_BLCYC at index 0
rank 2, Error: nan value found in Z_BLCYC at index 1
...
rank 3, Error: nan value found in Z_BLCYC at index 0
rank 3, Error: nan value found in Z_BLCYC at index 1
...

Similar debug code placed after the workspace query, before the second call to pdgeevx_, does not report any NaNs. In other words, the second call, which does the real work, produces NaNs, but only on 2 ranks out of 4. Could you confirm that you see the same results?
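
For anyone repeating this check, the loop above can be wrapped in a small helper that scans a whole local buffer and reports how many NaNs it holds and where the first one is. This is only a sketch: count_nans is a hypothetical name, not part of MKL or ScaLAPACK, and it assumes the local eigenvector block is stored as a contiguous array of doubles.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not an MKL routine): count the NaN entries in a
 * local buffer of n doubles. If first_idx is non-NULL, it receives the
 * index of the first NaN, or n when the buffer contains no NaN. */
size_t count_nans(const double *data, size_t n, size_t *first_idx)
{
    assert(data != NULL || n == 0);

    size_t count = 0;
    size_t first = n;

    for (size_t i = 0; i < n; i++) {
        if (isnan(data[i])) {
            if (count == 0) {
                first = i;   /* remember where the first NaN sits */
            }
            count++;
        }
    }

    if (first_idx != NULL) {
        *first_idx = first;
    }
    return count;
}
```

In the test program from this thread it could be called on each rank as count_nans(Z_BLCYC, (size_t)m_loc * n_loc, &first), once after the workspace query and once after the real call, with each rank printing its own count. Z_BLCYC, m_loc, n_loc and rank are the names used in the debug code above.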

  

StevenZhang233
Beginner

Hi Mark,

Thank you for your reply. I added the debug code after the call to pdgeevx and found that all 4 processes printed the error message, so NaNs are present on all 4 processes. Here is the output.

Steven

Mark_L_Intel
Employee

Hello Steven,

 

Our developers are looking into your report. Thank you for posting on the oneMKL forum.

 

Mark 

StevenZhang233
Beginner

Hi Mark,

 

One more thing I would like to point out: I tried solving the same problem with pzgeevx by changing the data type from double to double _Complex. The routine pzgeevx solves the problem without producing any NaNs.
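
On the data side, the type change described above amounts to copying the real matrix entries into a complex buffer with zero imaginary parts. A minimal sketch in C99, assuming the local block is a contiguous array; promote_to_complex is a hypothetical helper name, and the call to pzgeevx itself is not shown:

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper: copy n real entries into a complex buffer.
 * Each output entry gets the real value and a zero imaginary part. */
void promote_to_complex(const double *re, double _Complex *out, size_t n)
{
    assert((re != NULL && out != NULL) || n == 0);

    for (size_t i = 0; i < n; i++) {
        out[i] = re[i];   /* implicit real-to-complex conversion */
    }
}
```

The complex buffer needs twice the memory of the real one, so this is only a temporary workaround.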

 

Maybe this information will be helpful.

 

Steven

Mark_L_Intel
Employee

Hi Steven,

 

Thanks, this information does help.

 

Mark.

Mark_L_Intel
Employee

Hi Steven,

 

We reproduced the issue. Thanks a lot for catching this problem. The NaNs appear in the output because of a compiler bug. The fix will be part of the oneMKL 2024.2 release.

 

Mark.
