I'm using Intel Parallel Studio XE 2018 Update 1 Cluster Edition and the FEAST eigenvalue solver from the MKL library.
While calling the subroutine zfeast_hcsrgv in debug mode, I got the following runtime error, which is related to an access violation.
The problem is that this error pops up irregularly; that is, rerunning the code with the same input sometimes does not trigger the runtime error.
I attached an example code, but as I said, there is no guarantee that it will reproduce the error when you run it.
In my research code, which calls the subroutine zfeast_hcsrgv iteratively, this runtime error always pops up, but the calculation step at which it is triggered usually differs from run to run.
I have tried to check the memory allocations, but I couldn't find any mistakes.
I hope the attached code helps in understanding the issue.
I have reattached the sample code, which now includes consecutive calls to the subroutine zfeast_hcsrgv. I have checked that this code always triggers the runtime error in the second call to the subroutine.
The error is indicative of indexing an array out of bounds, writing to it, and in the process overwriting the array descriptor of something else. A subsequent call into MKL then uses the array with the corrupted descriptor.
Have you run a Debug build with run time checks for array index out of bounds?
Additional potential cause: if your program uses OpenMP and you use the private clause with an array, change private to firstprivate for the array(s). This will copy in the empty array descriptor (V18u1 should have fixed this, but it won't hurt to use firstprivate).
Dear Jim Dempsey,
Thanks for the reply. However, I'm still struggling with the same problem.
(1) Running the code with runtime checks (runtime error checking set to "All") still gives me the same result.
Actually, I'm not sure whether this is what you suggested, but simply running the program with runtime error checking set to "All" yields the same runtime error. I couldn't find any additional error messages.
(2) I don't think OpenMP is used in my program, as I haven't explicitly written anything related to it.
I only set the property option "Use Intel Math Kernel Library" to "Parallel (/Qmkl:parallel)".
Selecting "Sequential" or "Cluster" for this option instead also triggered the same runtime error.
(3) The strangest thing, which puzzles me, is that this error is triggered quite randomly. Rerunning the program sometimes does not trigger the error.
I found an error in my input: the matrix turned out to be not exactly Hermitian because of some numerical noise.
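For anyone hitting the same symptom, one way to catch this kind of input problem before calling the solver is to measure how far the matrix is from being Hermitian. The following is only a minimal sketch in Python (not the Fortran of the original program); the function name hermitian_deviation and the dense list-of-lists layout are assumptions made for illustration.

```python
def hermitian_deviation(a):
    """Return max |a[i][j] - conj(a[j][i])| over all entries of a square matrix."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            # For an exactly Hermitian matrix this difference is zero everywhere.
            diff = abs(a[i][j] - a[j][i].conjugate())
            worst = max(worst, diff)
    return worst


# A 2x2 matrix that is Hermitian except for small noise in one entry.
noisy = [
    [2.0 + 0.0j, 1.0 + 1.0j],
    [1.0 - 1.0j + 1e-9, 3.0 + 0.0j],
]
print(hermitian_deviation(noisy))
```

A nonzero result, even a tiny one, means the full matrix does not exactly satisfy the Hermitian property that the solver expects.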
I recently posted another question about this, below. The key is to pass only the upper or lower triangular part of the matrix instead of the full matrix, which avoids trouble caused by slight deviations from the Hermitian property.
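As a rough illustration of keeping only one triangle, the sketch below filters a CSR matrix down to its upper triangular part. It is written in Python with 0-based indices, whereas the Fortran MKL interface normally uses 1-based ia/ja arrays, so it shows the idea rather than a drop-in replacement. The function name upper_triangle_csr and the argument names are hypothetical. In the Fortran call, the triangular storage would presumably be selected through the uplo argument of zfeast_hcsrgv ('U' rather than 'F'); please check the MKL reference for the exact convention.

```python
def upper_triangle_csr(values, col_idx, row_ptr):
    """Keep only entries with column >= row from a 0-based CSR matrix."""
    new_values, new_cols, new_ptr = [], [], [0]
    n_rows = len(row_ptr) - 1
    for row in range(n_rows):
        # Walk the stored entries of this row and drop those below the diagonal.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            if col_idx[k] >= row:
                new_values.append(values[k])
                new_cols.append(col_idx[k])
        new_ptr.append(len(new_values))
    return new_values, new_cols, new_ptr


# Full storage of the 2x2 matrix [[2, 1+1j], [1-1j, 3]].
full_values = [2.0 + 0.0j, 1.0 + 1.0j, 1.0 - 1.0j, 3.0 + 0.0j]
full_cols = [0, 1, 0, 1]
full_ptr = [0, 2, 4]

upper_values, upper_cols, upper_ptr = upper_triangle_csr(full_values, full_cols, full_ptr)
print(upper_values, upper_cols, upper_ptr)
```

Because the lower triangle is never passed in, any noise that made it disagree with the upper triangle can no longer reach the solver.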