Intel® oneAPI Math Kernel Library

Error in Fortran Linux build when calling pardiso with phase = -1

Ioannis_K_
New Contributor I

Hello everybody,

 

I am trying to run a large Fortran code that I have built in both Windows and Linux environments. The code is very large, so there is no reason to share it here.

At some stage of the run, the code must call the pardiso routine to completely clear the memory associated with pardiso. In other words, I am invoking pardiso with phase = -1.

 

The Windows build of my program (x64 release configuration) runs without any issue, but the Linux build encounters a very strange error (in the Linux build, I use the traceback option so that I know where the error has occurred). Both builds use a 2022 version of the oneAPI ifort compiler, together with the corresponding MKL library. Both builds also use O2 optimization and OpenMP.

 

The call to pardiso with phase = -1 is in two regions of a particular subroutine. When it is called from the first region, there is no problem. When it is called from the second region, my program always erroneously aborts with the following message:

 

forrtl: severe (174): SIGSEGV, segmentation fault occurred

Image              PC                Routine            Line     Source
libpthread-2.17.s  00002AAAB419E630  Unknown            Unknown  Unknown
libmkl_core.so.2   00002AAAAF89AB6B  mkl_serv_free      Unknown  Unknown
libmkl_core.so.2   00002AAAB10DDB84  mkl_pds_lp64_meti  Unknown  Unknown
libmkl_core.so.2   00002AAAB101BB4D  mkl_pds_lp64_pard  Unknown  Unknown
libmkl_core.so.2   00002AAAB13CFBD4  mkl_pds_lp64_pard  Unknown  Unknown

 

The traceback shows that the error occurs inside a pre-built function of the MKL library (mkl_serv_free), and at first glance it appears to be a memory-related issue.

I have tried rebuilding with various options (using heap arrays, increasing stack size etc.), but none of them works. I have also tried to invoke pardiso with iparm(27) = 1 (to see whether an error message is printed during the call to release the memory), but I do not get any message whatsoever. 

 

I also tried calling pardiso with phase = 0 instead of phase = -1. With that change the error vanishes and the Linux build runs without any issue. The problem with using phase = 0 is that, after some test runs with my Windows build, I noticed that the memory is not "properly" cleared: I see a continuous increase in the memory usage of my code, which does not occur when I use phase = -1.

 

I wanted to ask whether there is any general explanation on what may be causing the specific error to occur when I try to release the memory with pardiso, and whether there is a solution that I could try to avoid having the error, while also ensuring that there are no "memory leaks" from my use of pardiso.

 

Many thanks in advance for any help/advice.

ShanmukhS_Intel
Moderator

Hi Ioannis Koutromanos,


As the error message indicates a segmentation fault, it might be a memory issue in which the program is trying to access memory that it is not authorized to access.


Could you please make sure you are allocating and freeing memory as expected, and that you are not accessing memory beyond its allocated boundaries?


Update libraries: Ensure that you are using the latest version of the Intel MKL library. It's possible that the issue you're experiencing might have been fixed in a recent release. Could you please update to the latest oneAPI and let us know if the issue persists?


Segmentation faults can sometimes occur due to insufficient memory. Make sure that your system has enough available memory to run the program.


It is also possible that you are freeing memory that was already freed.


If you are still unable to resolve the issue, please share an isolated sample of your code so that we can try it on our end.


For your reference, the significance of phase 0 and phase -1 is mentioned below.


phase = 0 => Release internal memory for L and U of the matrix number mnum.

phase = -1 => Release all internal memory for all matrices.
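
As a rough illustration of the two release phases listed above, the sketch below wraps the release call in a small Fortran subroutine. It assumes the LP64 interface with default integers and the classic `pardiso` entry point with a 64-element `integer(8)` handle. The name `release_solver` and the `release_all` switch are hypothetical, and `pt`, `maxfct`, `mnum`, `mtype`, `n`, and `iparm` are assumed to be exactly the ones used in the earlier analysis and factorization calls.

```fortran
! Sketch only, not taken from the original poster's code.
! Assumes LP64 (default integers) and linking against MKL.
subroutine release_solver(pt, maxfct, mnum, mtype, n, iparm, release_all, error)
  implicit none
  integer(8), intent(inout) :: pt(64)       ! PARDISO handle; zeroed before its first use
  integer,    intent(in)    :: maxfct, mnum, mtype, n
  integer,    intent(inout) :: iparm(64)
  logical,    intent(in)    :: release_all  ! hypothetical switch for this sketch
  integer,    intent(out)   :: error
  external :: pardiso

  integer          :: phase, nrhs, msglvl
  integer          :: idum(1)   ! placeholder for ia, ja, perm (not read when releasing)
  double precision :: ddum(1)   ! placeholder for a, b, x (not read when releasing)

  nrhs   = 1
  msglvl = 0
  if (release_all) then
     phase = -1   ! release all internal memory for all matrices
  else
     phase = 0    ! release only the L and U memory of matrix number mnum
  end if

  call pardiso(pt, maxfct, mnum, mtype, phase, n, ddum, idum, idum, &
               idum, nrhs, iparm, msglvl, ddum, ddum, error)
end subroutine release_solver
```

A nonzero `error` on return would indicate that the release call itself reported a problem.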


Best Regards,

Shanmukh.SS


Ioannis_K_
New Contributor I

Thank you Shanmukh.

 

I think trying to free memory that was already freed may indeed be the cause. I will look into this. 
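
One way to rule out a double release is to route every phase = -1 call through a single guarded routine. The following is a minimal sketch under stated assumptions, not the actual code from this thread: the module `pardiso_guard`, the type `solver_state`, and the `active` flag are all hypothetical names, the default `mtype = 11` is only a placeholder, and the LP64 interface with default integers is assumed. The caller is expected to set `active = .true.` after a successful analysis or factorization call.

```fortran
! Sketch only: a guard that makes the phase = -1 release idempotent.
module pardiso_guard
  implicit none
  private
  public :: solver_state, release_once

  type :: solver_state
     integer(8) :: pt(64)    = 0_8    ! handle; must be all zeros before the first PARDISO call
     integer    :: iparm(64) = 0
     integer    :: maxfct = 1, mnum = 1, mtype = 11, n = 0  ! placeholder values
     logical    :: active = .false.   ! caller sets .true. once PARDISO has allocated memory
  end type solver_state

contains

  subroutine release_once(s, error)
    type(solver_state), intent(inout) :: s
    integer,            intent(out)   :: error
    external :: pardiso
    integer          :: phase, nrhs, msglvl, idum(1)
    double precision :: ddum(1)

    error = 0
    if (.not. s%active) return   ! nothing allocated, or already released: skip the call

    phase  = -1
    nrhs   = 1
    msglvl = 0
    call pardiso(s%pt, s%maxfct, s%mnum, s%mtype, phase, s%n, ddum, idum, idum, &
                 idum, nrhs, s%iparm, msglvl, ddum, ddum, error)

    s%pt     = 0_8       ! defensive reset; this sketch does not rely on MKL clearing the handle
    s%active = .false.
  end subroutine release_once

end module pardiso_guard
```

With this pattern, a second `call release_once(s, error)` from another region of the same subroutine returns immediately instead of handing a stale handle back to the library.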

 

I am absolutely certain that the problem is not due to unauthorized memory access or insufficient memory in my system, for the following reasons:

- My Linux system (where the error occurs) has 240 GB of memory, and the Windows system (where the code runs without any error) has 16GB.

- Also, the sample run is extremely small (I expect the required memory to be a fraction of the CPU cache memory).

- If the memory in the system were insufficient (or if I were trying to access unauthorized memory), then I would still get an error when I use phase = 0 instead of phase = -1. As I mentioned in my original message, the code runs without any problem when I call pardiso with phase = 0.

- Finally I would expect that errors due to insufficient system memory would probably manifest when pardiso is trying to ALLOCATE extra memory (for instance, when it tries to factorize the coefficient matrix), and certainly not when it tries to release memory. 

 

If the problem persists, I will send you a version of my code. If possible, please provide me with an email address to send the code to, as I would prefer not to share my code through the user forum.

 

Again, many thanks for all the help and guidance.

ShanmukhS_Intel
Moderator

Hi Ioannis,


"If the problem persists, I will send you a version of my code."

>>Sure, you can share your code if the issue persists. It would be a great help, as we could then try to reproduce the issue in our environment and assist you accordingly.


Best Regards,

Shanmukh.SS


Ioannis_K_
New Contributor I

Thank you Shanmukh.

The problem persists, so I would like to send you a sample version of my code so that you can reproduce it on your end. I would prefer not to share the code through the forum, so, if possible, please provide me with an email address that I can send it to (or feel free to email me directly with instructions).

 

Once again, thank you for all your help.

 

ShanmukhS_Intel
Moderator

Hi Ioannis,

 

If you require privacy and are unable to share the issue or sample with us publicly, and you are a licensed oneAPI product customer and/or a member of Intel's oneAPI Academic Program, please submit a ticket for Priority Support so that your application can be handled in line with the required data protection and privacy regulations.

 

Best Regards,

Shanmukh.SS

ShanmukhS_Intel
Moderator

Hi Ioannis,


A gentle reminder:

As mentioned earlier, if you require privacy and are unable to share the issue or sample with us publicly, and you are a licensed oneAPI product customer and/or a member of Intel's oneAPI Academic Program, please submit a ticket for Priority Support so that your application can be handled in line with the required data protection and privacy regulations.

 

Best Regards,

Shanmukh.SS

