I have some old Fortran code that is starting to reach its limits. Until we are able to develop a new version, we need to keep using it. We can currently only use this program when it is compiled in 32-bit mode, and one particular program is now giving an "insufficient virtual memory" error. I've posted a minimal snippet from the program below. In it, I use the variable normal_sq to set up a matrix at four different sizes. Lines 11-16 test compiler32, since that is what we use to compile our software; with the 32-bit compilation the allocation first fails at 21674 x 21674. Lines 17-22 use the 64-bit compiler, where the allocation first fails at 92682 x 92682. I'm trying to extend the 32-bit compilation to handle a larger matrix size. Is there a way around this error with ifort in 32-bit mode? The system itself has plenty of memory (64 GB).
$ module -s load compiler32 mkl32
$ ifort -traceback allocate_test.f90 -qmkl -o allocate_test
$ ./allocate_test
Allocating matrix of size 21673 x 21673 (always works)
Allocating matrix of size 21674 x 21674 (fails with: -m32 or compiler32 mkl32)
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
allocate_test 0805198E Unknown Unknown Unknown
allocate_test 08051B36 Unknown Unknown Unknown
allocate_test 0804AB60 MAIN__ 15 allocate_test.f90
allocate_test 0804A90A Unknown Unknown Unknown
program allocate_test
implicit none
!Simple test to test allocation maximum
integer:: thread_id, nthreads
! Variables
double precision, allocatable :: normal_sq(:,:)
double precision constant
write(*,*) "Allocating matrix of size 21673 x 21673 (always works)"
allocate(normal_sq(21673,21673))
deallocate(normal_sq)
write(*,*) "Allocating matrix of size 21674 x 21674 (fails with: -m32 or compiler32 mkl32)"
allocate(normal_sq(21674,21674))
deallocate(normal_sq)
write(*,*) "Allocating matrix of size 92681 x 92681 (works with: compiler mkl)"
allocate(normal_sq(92681,92681))
deallocate(normal_sq)
write(*,*) "Allocating matrix of size 92682 x 92682 (fails with: compiler mkl)"
allocate(normal_sq(92682,92682))
deallocate(normal_sq)
stop
end program
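For scale, the byte counts behind the four matrix sizes can be worked out directly. The sketch below is only arithmetic, it allocates nothing; the program name and variable names are made up for illustration, and it assumes double precision occupies 8 bytes, which it checks with storage_size. The two failing sizes turn out to be the first to exceed 3.5 GiB and 64 GiB respectively, which suggests (but does not prove) that the 32-bit build is running out of contiguous address space while the 64-bit build is hitting the 64 GB of installed memory.

```fortran
program matrix_bytes
  use iso_fortran_env, only: int64, real64
  implicit none
  ! The four matrix orders tried in the snippet above
  integer(int64), parameter :: n(4) = [21673_int64, 21674_int64, 92681_int64, 92682_int64]
  integer(int64), parameter :: gib = 2_int64**30
  integer(int64) :: bytes_per_elem, bytes
  integer :: k

  ! storage_size returns bits; assumed to be 64 for double precision
  bytes_per_elem = storage_size(1.0d0) / 8

  do k = 1, size(n)
    ! 64-bit arithmetic so the product cannot overflow
    bytes = n(k) * n(k) * bytes_per_elem
    write(*, '(i5, " x ", i5, " -> ", i11, " bytes = ", f7.4, " GiB")') &
      n(k), n(k), bytes, real(bytes, real64) / real(gib, real64)
  end do
end program matrix_bytes
```

Because the sizes are held in 64-bit integers, this runs the same way whether it is compiled in 32-bit or 64-bit mode.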
Two things to investigate:
1) Allocate your largest arrays first, and then do not deallocate them. Keep them and re-use them. If necessary, use a smaller subsection of the array on subsequent iterations. The arrays can be given the TARGET attribute, and a pointer can then be associated with a smaller subsection.
2) Windows has a 3GB feature. This might provide for a bit more virtual address space. Note, this applies to 32-bit Windows. There may be an analog to this on 64-bit Windows for 32-bit applications.
OK - so let's quickly recap what we've discussed so far. The /3GB switch is not related to the amount of physical memory installed in a system. It is useful if you have an application that can take advantage of a larger address space. For a process to access the full 3GB address space, the image file must have the IMAGE_FILE_LARGE_ADDRESS_AWARE flag set in the image header.
If the flag is not set in the image header, then the OS reserves the third gigabyte so that the application won't see virtual addresses greater than 0x7FFFFFFF. You set this flag by specifying the linker flag /LARGEADDRESSAWARE when building the executable. This flag has no effect when running the application on a system with a 2-GB user address space. Therefore if you enable the /3GB switch, then applications that do not have this flag set can only use the standard 2GB of User mode memory, and the Kernel is still limited to the 1GB space - which means that 1GB of virtual memory is basically wasted!
However, this appears to apply to a 32-bit O/S. It might still be worth exploring for a 32-bit application running on a 64-bit O/S.
Jim Dempsey
Hi Jim, thanks for the quick response. I should have mentioned that this is being run on Red Hat Linux machines running RHEL 9.1. Are you aware of any similar solutions for Linux?
Regarding the matrix allocations: I see! When I swapped the matrix allocations so that the larger matrix was allocated first, the code completed successfully. So it sounds like the operational code might be re-using a matrix that was previously allocated and then requesting a larger version later on. As you can probably tell, I'm not much of a Fortran programmer. This might be tricky to implement, but I think I understand the problem now, if that's what's happening. I found an example here: https://fortran-lang.org/en/learn/best_practices/multidim_arrays/
I was reading this page: https://stackoverflow.com/questions/19781713/what-is-the-biggest-array-size-for-double-precision-in-fortran-90
I used the code there for some testing. I found that I hit the virtual memory error when I don't use the -qmkl flag, but hit an overflow error when I do use it. The code allocates and deallocates a larger and larger array on each iteration:
program allocate_test
implicit none
!Simple test to test allocation maximum
! Variables
double precision, allocatable :: a(:)
integer*4 i
do i=1,100
allocate(a(2**i))
a(size(a)) = 1
deallocate(a)
write(*,*) i
end do
end
$ ifort -m32 -traceback allocate_test.f90 -qmkl -check -o allocate_test
$ ./allocate_test
1
...
28
forrtl: severe (179): Cannot allocate array - overflow on array size calculation.
Image PC Routine Line Source
allocate_test 08060409 Unknown Unknown Unknown
allocate_test 080608E6 Unknown Unknown Unknown
allocate_test 0804AADC MAIN__ 11 allocate_test.f90
allocate_test 0804A90A Unknown Unknown Unknown
$ ifort -m32 -traceback allocate_test.f90 -check -o allocate_test
$ ./allocate_test
1
...
27
forrtl: severe (41): insufficient virtual memory
Image PC Routine Line Source
allocate_test 0806069E Unknown Unknown Unknown
allocate_test 08060846 Unknown Unknown Unknown
allocate_test 0804AA3C MAIN__ 11 allocate_test.f90
allocate_test 0804A86A Unknown Unknown Unknown
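To put numbers on where the two runs stop: the last size printed was 2**28 elements with -qmkl and 2**27 without. The sketch below (program and variable names are illustrative) prints the byte count for each step around that point and whether it fits in a 32-bit size. The request that fails with the overflow message, 2**29 doubles, is exactly 2**32 bytes, which cannot be represented in 32 bits; the request that fails with "insufficient virtual memory", 2**28 doubles, is 2 GiB. This is arithmetic only and does not explain why linking with -qmkl changes which failure appears.

```fortran
program doubling_sizes
  use iso_fortran_env, only: int64
  implicit none
  character(len=*), parameter :: fmt = &
    '("i = ", i2, ": ", i10, " bytes, fits signed 32-bit: ", l1, ", fits unsigned 32-bit: ", l1)'
  integer :: i
  integer(int64) :: bytes

  do i = 26, 29
    ! 8 bytes per double precision element, computed in 64-bit arithmetic
    bytes = 8_int64 * 2_int64**i
    write(*, fmt) i, bytes, bytes <= 2_int64**31 - 1, bytes <= 2_int64**32 - 1
  end do
end program doubling_sizes
```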
>>I'm not much of a Fortran programmer.
Become language agnostic. Use whatever works best.
>>When I swapped the matrix allocations to ensure the larger matrix was allocated first, the code was able to complete successfully.
When memory is tight, you must pay close attention to the order of allocations so that you do not fragment the heap to the point where later, larger allocations fail to find a node of sufficient size. This applies to all languages.
>> Multidimensional Arrays link
These are fine to use, but as with a single-dimension array, under tight memory constraints you must be careful not to fragment the heap.
Of particular interest for you from this link is the last example where they use a pointer (of different rank in this case) to point to an array. In the example, they pointed to the entire array. In your case, consider pointing to a (contiguous) subsection of an array. IOW you can make your initial allocation to the largest expected requirements for that named array, but under a different name, then use a pointer to point to the slice of the size you want. This in effect becomes a single node heap.
module blobs
  implicit none
  ! One long-lived allocation per large array, sized for the largest case
  double precision, allocatable, target :: normal_sq_blob(:)
  ! ... other blobs here
contains
  subroutine init_blobs
    integer size_normal_sq
    size_normal_sq = 1000*1000 ! call somewhere_to_get(size_normal_sq)
    allocate(normal_sq_blob(size_normal_sq))
    ! ... other get & allocate blobs here
  end subroutine init_blobs
end module blobs

program Console16
  use blobs
  implicit none
  call init_blobs()
  call doWork()
end program Console16

subroutine doWork
  use blobs
  implicit none
  ! Rank-2 view onto the rank-1 blob; no memory is allocated for it
  double precision, contiguous, pointer :: normal_sq(:,:)
  integer :: dim1Size, dim2Size
  dim1Size = 123; dim2Size = 456 ! The sizes you need
  ! replace allocate(normal_sq(dim1Size, dim2Size)) with
  if(dim1Size * dim2Size > size(normal_sq_blob)) STOP "allocation error"
  normal_sq(1:dim1Size, 1:dim2Size) => normal_sq_blob
  ! ...
end subroutine doWork
Jim Dempsey
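One consequence of the blob approach is worth keeping in mind: every pointer view onto the same blob shares the same storage, so a smaller view overwrites part of a larger one. The following is a minimal sketch with made-up names (blob, large, small) and tiny sizes, not code from the program under discussion.

```fortran
program reuse_blob
  implicit none
  double precision, allocatable, target :: blob(:)
  double precision, contiguous, pointer :: large(:,:), small(:,:)

  allocate(blob(6*6))            ! one allocation, sized for the largest case
  large(1:6, 1:6) => blob        ! full 6 x 6 view
  large = 0.0d0

  small(1:2, 1:3) => blob(1:6)   ! 2 x 3 view over the first six elements
  small = 1.0d0                  ! also overwrites the first column of large

  ! Six elements of the blob are now 1.0, the rest are 0.0
  if (sum(large) == 6.0d0) then
    print *, "views share storage"
  else
    print *, "views are independent"
  end if
end program reuse_blob
```

This is what makes the technique safe against fragmentation (nothing is allocated or freed after start-up), but it also means two views of one blob must not be live at the same time unless the overlap is intended.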
The problem with your code is that it assumes that whatever heap manager is linked into your program consolidates adjacent free nodes. This is not always the case.
IOW your code presumes (this is my presumption) that:
a) the heap has a single free node containing all of the remaining memory.
b) the first allocation extracts a node for the current allocation, leaving a free node holding the remainder.
c) the corresponding deallocation returns the memory, resulting in a single free node containing all of the remaining memory (the original maximum).
Assumption c) does not necessarily hold. Often, the deallocation results in two nodes in the heap, and you are not assured that the heap manager will consolidate them. Most systems provide ways to control how deallocation is handled. In Windows, it's called the Low-Fragmentation Heap. I haven't looked at the Linux manual in a while; that is something you can do to determine the default behavior and how you might override it with the behavior you seek. Try this and see what you get.
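program allocate_test
  use iso_fortran_env, only: int64
  implicit none
  ! Find the largest power-of-two allocation that succeeds, working downward
  double precision, allocatable :: a(:)
  integer(int64) :: n
  integer :: i, iStat
  ! 2**i overflows a default 32-bit integer for i >= 31, so the size is
  ! computed in 64-bit and the search starts at 2**40 elements (8 TiB)
  do i = 40, 1, -1
    n = 2_int64**i
    allocate(a(n), STAT=iStat)
    if (iStat == 0) then
      print *, "first largest allocation size = ", size(a, kind=int64), i
      a(n) = 1
      deallocate(a)
      exit
    end if
  end do
end program allocate_test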
Jim Dempsey
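The power-of-two probe above only brackets the limit to within a factor of two. A possible refinement is a binary search on the element count, sketched below. The names and the upper bound of 2**34 elements are assumptions chosen for illustration, and the sketch assumes the runtime reports a failed request through STAT= rather than aborting. On Linux the result reflects what the allocator will hand out in one piece, which with memory overcommit can exceed what the program could actually touch.

```fortran
program probe_max_alloc
  use iso_fortran_env, only: int64
  implicit none
  double precision, allocatable :: a(:)
  integer(int64) :: lo, hi, mid
  integer :: istat

  lo = 0_int64          ! largest element count known to succeed
  hi = 2_int64**34      ! assumed upper bound (128 GiB of doubles), expected to fail

  ! Invariant: a request of lo elements succeeds, a request of hi is treated as failing
  do while (hi - lo > 1_int64)
    mid = lo + (hi - lo) / 2_int64
    allocate(a(mid), stat=istat)
    if (istat == 0) then
      deallocate(a)
      lo = mid
    else
      hi = mid
    end if
  end do

  print *, "largest single allocation (elements): ", lo
  print *, "approximate size in MiB: ", lo * 8_int64 / 2_int64**20
end program probe_max_alloc
```

Running it once at program start and once after the usual allocate/deallocate sequence would give a rough indication of whether fragmentation is shrinking the largest available block.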
If this is running on the same server, also make sure there is enough space left on the disk for swap:
swapon -s
and as a sanity check, make sure the root disk is not out of space
df -k
The swap partition should be 8GB or thereabouts on a normal server.
Question: in this code, are the arrays allocatable OR are they in COMMON? Or statically declared at a fixed size in the main program?