"Insufficient virtual memory" Error

pourmatin85 · ‎04-08-2009

Dear guys;

I'm trying to run an FEM code of mine. The code works well up to a certain number of input elements (10,000). However, when I try to run it with 11,000 elements, this error appears:
forrtl: severe (41): insufficient virtual memory

My code is still very unoptimized, since I use a dense solver (DGETRS) instead of a sparse solver, but the cluster still has 4 GB of free memory when I run the code with 10,000 elements.

Is there any trick that I'm missing?
Regards
Hossein

ArturGuzik · ‎04-08-2009

Is this an x64 cluster?

A.

pourmatin85 · ‎04-08-2009

Yes, it is.

draceswbell_net · ‎04-09-2009

I have most commonly seen this when I run past the 2 GB limit of the standard Fortran memory model. Sometimes a simple cure is the -mcmodel=medium option. If you are using allocatable arrays, check that the sizes are correct. If you are using OpenMP and have large private arrays, these can also use up memory quickly, since each thread allocates its own copy of every private array.
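
To see how quickly a dense system approaches that 2 GB limit, here is a rough back-of-the-envelope sketch in Python. It assumes one double-precision (8-byte) entry per matrix element; the orders `n` used below are hypothetical, since the thread never states how many DOFs the 10,000-element model has.

```python
import math

TWO_GIB = 2 * 1024**3  # the 2 GB limit mentioned above, in bytes


def dense_matrix_bytes(n, bytes_per_entry=8):
    """Bytes needed to store an n x n dense matrix of doubles."""
    return n * n * bytes_per_entry


def largest_dense_order(limit_bytes=TWO_GIB, bytes_per_entry=8):
    """Largest n for which an n x n dense matrix still fits in limit_bytes."""
    return math.isqrt(limit_bytes // bytes_per_entry)


# Hypothetical matrix orders, for illustration only.
for n in (10_000, 16_384, 20_000):
    size = dense_matrix_bytes(n)
    note = "over 2 GiB" if size > TWO_GIB else "fits"
    print(f"n = {n:>6}: {size / 1024**3:.2f} GiB ({note})")
```

This counts only the matrix itself. Work arrays, or a second copy of the matrix kept for the factorization, push the total up further.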

pourmatin85 · ‎04-10-2009

Thanks for your reply. I checked "-mcmodel=" and it turns out that it's for Linux only, is that right? However, I'm working on Windows and I don't use OpenMP!

Any other suggestions?

Regards
Hossein

pourmatin85 · ‎04-10-2009

I finally managed to store the matrix in CSR (compressed sparse row) format, and I'm now using the MKL PARDISO sparse solver to solve my Ax=b system. However, the problem with big matrices still remains!
When I run the program on the x64 platform with 21,000 elements, execution stops at the first call to PARDISO (phase 11) with this error:
program exception - access violation

And when I run it on the win32 platform, execution just stops at the same place, but with no error!

Any ideas?

ArturGuzik · ‎04-10-2009

Most probably you have some allocatable component that has been deallocated or is not yet allocated. In any case, it looks like a programming issue.

A.

pourmatin85 · ‎04-11-2009

Then why does it work for a smaller number of input elements?

ArturGuzik · ‎04-12-2009

That doesn't prove the code is correct. Following all your posts, I have the impression that the code may have some issues with allocation or mismatched sizes. However, that's just an impression/guess. In all your posts you report that the code fails with 11,000, then 21,000, and so on. Is that only a coincidence, or does it have something to do with odd sizes (numbers of elements)?

The strange part is the difference in errors between x64 and IA32. Can you describe it in more detail? What are the link lines (libs) in both configurations? Sometimes this kind of error comes from linking mixed library interfaces.

A.

pourmatin85 · ‎04-13-2009

for x64: mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib
for IA32: mkl_solver.lib mkl_intel_c.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib

The numbers were just an example; I'm pretty sure it's not about being odd or even.

ArturGuzik · ‎04-13-2009

Well, it looks strange. The linker settings are fine, and you should have no problem with MKL itself (for the IA32 configuration you can omit mkl_solver.lib, as it's part of mkl_core.lib).
I saw your other post (on the OOC solver), and I believe you don't need the OOC version to handle a 20,000-element model (I assume that by a 20,000 x 20,000 matrix you mean a model with 20,000 elements or DOFs, correct?). I was using the DSS solver (from MKL) to solve my own FEM models with 40,000+ elements on a 2 GB RAM IA32 WinXP system without problems. The sparse CRS format eliminates the problems caused by a large (original) matrix bandwidth and the need for node reordering.

The things to do/check would be:

(1) if the code is complex, make sure no (original) matrix is left allocated unnecessarily
(2) check the input (say, by printing it, if debug mode is not an option) before the call to MKL, and make sure you can access all array elements
(3) insert IMPLICIT NONE statements in all routines
(4) use explicit interfaces
(5) set /check:uninit and /Qtrapuv to try to catch errors in the code.
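
Check (2) above can be partly automated. The following Python sketch is only an illustration of the idea, not part of MKL: `check_csr` is a hypothetical helper that looks for the inconsistencies in zero-based CSR arrays that tend to produce access violations inside a solver. It also assumes the solver expects strictly increasing column indices within each row.

```python
def check_csr(n, values, col_idx, row_ptr):
    """Return a list of problems found in zero-based CSR arrays (empty if none)."""
    problems = []
    if len(row_ptr) != n + 1:
        problems.append(f"row_ptr has {len(row_ptr)} entries, expected {n + 1}")
        return problems  # the per-row checks below need n + 1 entries
    if row_ptr[0] != 0:
        problems.append("row_ptr[0] is not 0")
    if len(values) != len(col_idx):
        problems.append("values and col_idx differ in length")
    if row_ptr[-1] != len(values):
        problems.append("row_ptr[-1] does not equal the number of stored values")
    for i in range(n):
        start, end = row_ptr[i], row_ptr[i + 1]
        if start > end:
            problems.append(f"row_ptr decreases at row {i}")
            continue
        cols = col_idx[start:end]
        if any(c < 0 or c >= n for c in cols):
            problems.append(f"row {i} has a column index outside 0..{n - 1}")
        if any(a >= b for a, b in zip(cols, cols[1:])):
            problems.append(f"row {i} columns are not strictly increasing")
    return problems


# A consistent 2 x 2 matrix, then the same arrays with one bad column index.
print(check_csr(2, [2.0, -1.0, -1.0, 2.0], [0, 1, 0, 1], [0, 2, 4]))
print(check_csr(2, [2.0, -1.0, -1.0, 2.0], [0, 5, 0, 1], [0, 2, 4]))
```

Running an equivalent check in the Fortran code just before the solver call is cheap compared with the factorization, and it turns a silent crash into a readable message.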

A.

pourmatin85 · ‎04-14-2009

Thanks a lot, A.
My problem is solved. Actually, my stiffness matrix had some zero diagonal elements!

ArturGuzik · ‎04-14-2009

Glad to hear that.

Keep in mind that the solver has an option (I believe) to report pivot information. You should also verify that your compressed matrix has non-zero diagonal elements. Then it would be apparent that your stiffness matrix is... not a stiffness matrix.
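
The diagonal check suggested above could look roughly like this in Python. This is a sketch under stated assumptions: zero-based CSR arrays, and a hypothetical helper named `rows_with_bad_diagonal`. It is not a replacement for the solver's own pivot diagnostics.

```python
def rows_with_bad_diagonal(n, values, col_idx, row_ptr, tol=0.0):
    """Return the rows whose diagonal entry is missing or has |value| <= tol."""
    bad = []
    for i in range(n):
        diag = None
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                diag = values[k]
                break
        if diag is None or abs(diag) <= tol:
            bad.append(i)
    return bad


# 3 x 3 example: row 1 has no stored diagonal, row 2 stores an explicit zero.
values = [4.0, -1.0, -1.0, -1.0, -1.0, 0.0]
col_idx = [0, 1, 0, 2, 1, 2]
row_ptr = [0, 2, 4, 6]
print(rows_with_bad_diagonal(3, values, col_idx, row_ptr))
```

In an FEM setting, a row flagged this way often points to a DOF that received no stiffness contribution, for example a node not attached to any element.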

A.