Hi,
I am using the routine pdgeevx_ to compute the eigenpairs of a non-symmetric real matrix. Although it returns the eigenvalues, some of the eigenvectors contain "-nan" entries. I am not sure whether they come from my inputs or from somewhere else.
Here are my test files (blacs.h, main.c) and their output files (eigGoodM.txt, eigVec_ZBCYCLIC_Q_GoodM.txt). 4 processors are needed to run the test. The makefile I used to compile is also attached (makefile.txt).
I believe I am using the 2024.0 version of MKL (/opt/intel/oneapi/mkl/2024.0/bin/mkl_link_tool). The compiler I am using is gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44).
Could you please have a look and give me some suggestions on it?
Thank you,
Steven
Hello Steven,
The following debug code placed after a call to pdgeevx_:
//for (int i = 0; i < m_loc*n_loc; i++) {
for (int i = 0; i < 4; i++) {
    if (isnan(Z_BLCYC[i])) {
        printf("rank %d, Error: nan value found in Z_BLCYC at index %d\n", rank, i);
    }
}
returns NaN values only from ranks 2 and 3 (but not from ranks 0 and 1) with "mpirun -np 4 ./main":
rank 2, Error: nan value found in Z_BLCYC at index 0
rank 2, Error: nan value found in Z_BLCYC at index 1
...
rank 3, Error: nan value found in Z_BLCYC at index 0
rank 3, Error: nan value found in Z_BLCYC at index 1
...
Similar debug code placed at the workspace query, before the call to pdgeevx_ that does the real work, does not report any NaNs. In other words, only the second call, which performs the actual computation, produces NaNs, and only on 2 of the 4 ranks. Could you confirm that you see the same results?
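The check above looks only at the first four entries of Z_BLCYC. To see how many entries are affected on each rank, the whole local block could be scanned instead. Below is a minimal sketch of such a helper; the name count_nans is hypothetical (it is not part of MKL or of the attached main.c), and it assumes the local eigenvector block is a contiguous array of doubles.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper, not part of MKL or the attached test:
 * counts the NaN entries in a contiguous buffer of `count` doubles.
 * If first_idx is non-NULL, it receives the index of the first NaN,
 * or -1 when the buffer contains none. */
size_t count_nans(const double *buf, size_t count, long *first_idx)
{
    assert(buf != NULL || count == 0);

    size_t n_nan = 0;
    if (first_idx != NULL) {
        *first_idx = -1;
    }
    for (size_t i = 0; i < count; i++) {
        if (isnan(buf[i])) {
            /* Remember only the first offending index. */
            if (first_idx != NULL && *first_idx < 0) {
                *first_idx = (long)i;
            }
            n_nan++;
        }
    }
    return n_nan;
}
```

Assuming m_loc and n_loc are the local dimensions (as in the commented-out loop above), it could be called as count_nans(Z_BLCYC, (size_t)m_loc * n_loc, &first) right after pdgeevx_ returns, printing the rank, the count, and the first index on each process.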
Hi Mark,
Thank you for your reply. I added the debug code after the call to pdgeevx_ and found that all 4 processes printed the error message, so NaNs are present on all 4 ranks. Here is the output.
Steven
Hello Steven,
Our developers are looking into your report. Thank you for posting on the oneMKL Forum.
Mark
Hi Mark,
There is one more thing I would like to mention: I tried using pzgeevx to solve the same problem by changing the data type from double to double _Complex. The routine pzgeevx solves the problem without producing NaNs.
Maybe this information is helpful.
Steven
Hi Steven,
Thanks, this information does help.
Mark.
Hi Steven,
We reproduced the issue; thanks a lot for catching this problem. The NaNs appear in the output due to a compiler bug. The fix will be part of the oneMKL 2024.2 release.
Mark.