Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Problem about p?geevx

StevenZhang233
Beginner

Hi,

I am using the routine pdgeevx_ to compute the eigenpairs of a non-symmetric real matrix. It computes the eigenvalues, but some entries of the eigenvectors come back as "-nan". I am not sure whether the cause is my inputs or something else.

Here are my test files (blacs.h, main.c) and their output files (eigGoodM.txt, eigVec_ZBCYCLIC_Q_GoodM.txt). 4 processors are needed to run the test. The makefile I used to compile is also attached (makefile.txt).

I believe I am using version 2024.0 of MKL (/opt/intel/oneapi/mkl/2024.0/bin/mkl_link_tool). The compiler I am using is gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44).

Could you please have a look and give me some suggestions on it?

Thank you,

Steven

Mark_L_Intel
Moderator

Hello Steven,

   The following debug code, placed after the call to pdgeevx_:

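// Only the first 4 local entries are checked here to keep the output short;
// use i < m_loc*n_loc as the loop bound to scan the whole local block.
for (int i = 0; i < 4; i++) {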
  if (isnan(Z_BLCYC[i])) {
     printf("rank %d, Error: nan value found in Z_BLCYC at index %d\n", rank, i);
  }
}

 returns NaN values only from ranks 2 and 3 (but not from ranks 0 and 1) with "mpirun -np 4 ./main":

rank 2, Error: nan value found in Z_BLCYC at index 0
rank 2, Error: nan value found in Z_BLCYC at index 1
...
rank 3, Error: nan value found in Z_BLCYC at index 0
rank 3, Error: nan value found in Z_BLCYC at index 1
...

  Similar debug code placed after the workspace query, before the second call to pdgeevx_, does not report any NaNs. In other words, the second call, which does the real work, produces NaNs, but only on 2 of the 4 ranks. Could you confirm that you see the same results?
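
  To make the check easier to reuse on both the input matrix and the output eigenvectors, it could be wrapped in a small helper. This is only a minimal sketch: the name count_nans is hypothetical and is not part of the attached main.c, and it assumes the local block is stored as a contiguous array of doubles.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not from the attached main.c).
 * Counts the NaN entries in a contiguous local array of `len` doubles.
 * If first_idx is non-NULL, it receives the index of the first NaN,
 * or `len` when no NaN is found. */
static size_t count_nans(const double *a, size_t len, size_t *first_idx)
{
    size_t count = 0;

    if (first_idx) {
        *first_idx = len;
    }
    for (size_t i = 0; i < len; i++) {
        if (isnan(a[i])) {
            if (count == 0 && first_idx) {
                *first_idx = i;
            }
            count++;
        }
    }
    return count;
}
```

  On each rank it could be called as count_nans(Z_BLCYC, (size_t)m_loc * n_loc, &first) after pdgeevx_, and in the same way on the local input block before the call, so that the output shows whether the NaNs are already present in the inputs.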

  

StevenZhang233
Beginner

Hi Mark,

Thank you for your reply. I added the debug code after the call to pdgeevx_ and found that all 4 processes printed the error message, so NaNs are present on all 4 processes. Here is the output.

Steven

Mark_L_Intel
Moderator

Hello Steven,

 

   Our developers are looking into your report. Thank you for posting on the oneMKL Forum.

 

Mark 

StevenZhang233
Beginner

Hi Mark,

 

One more thing worth mentioning: I tried using pzgeevx on the same problem by changing the data type from double to double _Complex. The routine pzgeevx solves the problem without producing any NaNs.

 

Maybe this information is helpful.
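
For anyone who wants to try the same workaround, the type change amounts to copying the real local block into a complex buffer with zero imaginary parts. The following is only a sketch of that copy, not the code from my test files: the name real_to_complex is hypothetical, and it assumes the complex buffer has the same local size and layout as the real one.

```c
#include <stddef.h>

/* Hypothetical helper (not from the attached test files).
 * Copies `len` real entries into a complex buffer. The implicit
 * conversion from double to double _Complex sets each imaginary
 * part to zero. */
static void real_to_complex(const double *src, double _Complex *dst, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];
    }
}
```

The descriptor and process grid can stay as they are under that assumption; only the buffers passed to the routine, and the output arrays that receive the eigenvalues and eigenvectors, become complex.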

 

Steven

Mark_L_Intel
Moderator

Hi Steven,

 

   Thanks, this information does help.

 

Mark.

Mark_L_Intel
Moderator

Hi Steven,

 

  The issue has been reproduced; thanks a lot for catching this problem. The NaNs appear in the output because of a compiler bug. The fix will be part of the oneMKL 2024.2 release.

 

Mark.
