No, the limit for a 32-bit compiler is 2 GB. As you can see, when I use the 64-bit compiler I can run a program which requires 4.2 GB of memory. With a 64-bit compiler you can have 128 GB, so why do I hit a limit at 4.2 GB when the computer has 6 GB of RAM?
Knut
That's not what I said. You are running out of stack space, not virtual memory. As far as I know, the stack on 64-bit Windows cannot exceed 2GB.
The amount of virtual memory is also limited by your pagefile space.
How can I run a program that requires more than 4.2 GB of memory (the size of the matrices in the program) on a 64-bit computer? I have Visual Studio 2005 Professional Edition. The paging file is 9.2 GB. The option /heap-arrays does not work; do you have another trick?
Knut
Knut,
1) Make the large arrays allocatable (either as allocatable or by pointer).
2) Code such that temporary arrays are not auto-created for operations on your large arrays.
Consider using:
module CommonData
  real, allocatable :: A(:,:)
  real, allocatable :: B(:,:)
  real, allocatable :: C(:,:)
  ...
end module CommonData
----------------------
program FOO
  use CommonData
  allocate(A(1234,123456))
  allocate(B(1234,123456))
  allocate(C(1234,123456))
  ...
  ! Avoid using statements creating array temporaries
  ! Change:
  !   call Sub(A+B)
  ! to:
  C = A+B
  call Sub(C)
  ...
end program FOO
Note, there are several forms of array expressions that will require the use of array temporaries. Avoid using those forms of expressions.
There is a compile-time option, /heap-arrays, and a run-time check, /check:arg_temp_created. Unfortunately, there is no compile-time diagnostic such as /warn:array_temp_created, which means you have to run the application until it crashes due to lack of memory. That doesn't help much if the offending expression is in seldom-used code, e.g. your customer hits the problem while you cannot reproduce it.
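As one illustration of a form that can create a temporary: passing a non-contiguous array section to an explicit-shape dummy argument usually makes the compiler copy the section into a contiguous temporary first. The sketch below uses an assumed-shape dummy instead, which normally lets the compiler pass the strided section through a descriptor. The names row_ops and row_sum are hypothetical and not from this thread, and whether a copy is actually avoided depends on the compiler and options.

```fortran
module row_ops
  implicit none
contains
  ! Assumed-shape dummy v(:): a strided section can be passed by descriptor,
  ! so a contiguous copy is normally not needed (compiler dependent).
  function row_sum(v) result(s)
    real, intent(in) :: v(:)
    real :: s
    integer :: i
    s = 0.0
    do i = 1, size(v)
      s = s + v(i)
    end do
  end function row_sum
end module row_ops

program temp_demo
  use row_ops
  implicit none
  real, allocatable :: a(:,:)
  integer :: i, j

  allocate(a(3,4))               ! small illustrative size
  do j = 1, 4
    do i = 1, 3
      a(i,j) = real(i + 10*j)
    end do
  end do

  ! a(2,:) is a strided section, because Fortran stores arrays column-major.
  write(*,*) 'sum of row 2: ', row_sum(a(2,:))

  deallocate(a)
end program temp_demo
```

Building with /check:arg_temp_created, as mentioned above, is one way to see at run time whether a particular call still creates an argument temporary.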
Jim Dempsey
I have an application which makes use of FFTs, therefore n=32768 would be fine for me.
program memory_test
  print 66
66 format('Size of matrices (m,n) :',$)
  read(*,*)m,n
  size=float(m)*n*8*2/1000000.
  write(*,*)'Size of matrices in Megabytes :',size
  call xmem(m,n)
end
subroutine xmem(m,n)
  complex aa(m,n),bb(m,n)
  write(*,*)'First loop starts'
  do j=1,m
    do i=1,n
      aa(j,i)=cmplx(sin(float(j)),cos(float(i)))
    enddo
  enddo
  write(*,*)aa(3,3)
  write(*,*)'First loop ready'
  do j=1,m
    do i=1,n
      bb(j,i)=cmplx(sin(float(j)),cos(float(i)))
    enddo
  enddo
  write(*,*)'Second loop ready'
  do j=1,m
    do i=1,n
      bb(j,i)=aa(j,i)*conjg(bb(j,i))
    enddo
  enddo
  write(*,*)'Third loop ready'
end
Regards Knut
Knut,
To use allocatable arrays:
subroutine xmem(m,n)
  complex, allocatable :: aa(:,:),bb(:,:)
  allocate(aa(m,n),bb(m,n))
  ! your code here
  deallocate(aa,bb)
end subroutine xmem
If you wish the arrays to persist outside the subroutine then place the declaration inside a module. Then USE the module in the subroutine that performs the allocation as well as in all the other subroutines that reference the arrays (after allocation and population). Or if you prefer, pass the array references (aa and/or bb) down the call levels.
You can also add an optional argument STAT=sv to allocate and deallocate, where STAT= is a keyword and sv is an integer variable that receives a status indication. A return of 0 means success.
Remember to deallocate on return. I think the current version of IVF will auto deallocate local allocatable arrays upon return. It won't hurt to explicitly deallocate as this avoids portability issues.
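A minimal sketch of the STAT= check described above. The module checked_alloc and the routine xmem_checked are hypothetical names for illustration, and the sizes are deliberately small.

```fortran
module checked_alloc
  implicit none
contains
  ! Hypothetical helper (not from the thread): allocate, use, and free two
  ! work arrays. sv receives the ALLOCATE/DEALLOCATE status; 0 means success.
  subroutine xmem_checked(m, n, sv)
    integer, intent(in)  :: m, n
    integer, intent(out) :: sv
    complex, allocatable :: aa(:,:), bb(:,:)

    allocate(aa(m,n), bb(m,n), stat=sv)
    if (sv /= 0) return          ! report the failure to the caller

    aa = cmplx(0.0, 0.0)
    bb = cmplx(0.0, 0.0)

    deallocate(aa, bb, stat=sv)  ! explicit deallocate, status checked by caller
  end subroutine xmem_checked
end module checked_alloc

program alloc_stat_demo
  use checked_alloc
  implicit none
  integer :: sv

  call xmem_checked(4, 8, sv)    ! small illustrative sizes
  if (sv == 0) then
    write(*,*) 'allocate/deallocate succeeded'
  else
    write(*,*) 'allocation failed, stat = ', sv
  end if
end program alloc_stat_demo
```

Without STAT=, a failed allocate terminates the program with a run-time error; with it, the program can report the requested size and exit cleanly.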
Jim Dempsey
Knut,
Look at the following modifications to your application.
Note that ALLOCATABLE is changed to POINTER for the allocatable arrays.
I had problems with OpenMP in sharing the arrays when declared as ALLOCATABLE but not when declared as POINTER. Bug, if you ask me.
My system has 4 cores but only 2GB. I could run your test using 8192,32768 (4GB) but it took forever as the array size was larger than physical memory. There was no memory fault. (when using POINTER)
Using 8192,8192 illustrates scalability.
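The sizes quoted here follow from the formula in the test program: m*n elements, 8 bytes per default complex value, two arrays. The sketch below repeats that arithmetic in 64-bit integers so the byte count does not overflow. The names matrix_bytes and total_bytes are hypothetical, and the 8 bytes per element is the same assumption the test program makes.

```fortran
module matrix_bytes
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! integer kind with >= 18 digits
contains
  ! Bytes needed for two default-complex m x n arrays (8 bytes per element assumed).
  function total_bytes(m, n) result(nbytes)
    integer, intent(in) :: m, n
    integer(i8) :: nbytes
    nbytes = int(m, i8) * int(n, i8) * 8_i8 * 2_i8
  end function total_bytes
end module matrix_bytes

program matrix_size
  use matrix_bytes
  implicit none
  write(*,*) '8192 x 8192  :', total_bytes(8192, 8192),  ' bytes'
  write(*,*) '8192 x 32768 :', total_bytes(8192, 32768), ' bytes'
end program matrix_size
```

For 8192 x 8192 this gives 1,073,741,824 bytes (the 1073.742 MB printed by the test), and for 8192 x 32768 it gives 4,294,967,296 bytes, which is the roughly 4.2 GB case discussed earlier in the thread.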
Jim Dempsey
program memory_test
  print 66
66 format('Size of matrices (m,n) :',$)
  read(*,*)m,n
  size=float(m)*n*8*2/1000000.
  write(*,*)'Size of matrices in Megabytes :',size
  call xmem(m,n)
end
subroutine xmem(m,n)
  use omp_lib
  complex, pointer :: aa(:,:),bb(:,:)
  real(8) :: StartTime, EndTime
  real :: ElapsTime   ! default real, as in the timings shown in the output
  integer :: NumberOfThreads
  allocate(aa(m,n),bb(m,n))
  write(*,*) 'Wipeing array (to commit Virtual Memory)'
  StartTime = OMP_GET_WTIME()
  aa = cmplx(0.0,0.0)
  bb = cmplx(0.0,0.0)
  EndTime = OMP_GET_WTIME()
  ElapsTime = EndTime - StartTime
  write(*,*) 'Run time ', ElapsTime
  ! Repeat the three loops with 1, 2, ... threads to show scaling
  do NumberOfThreads=1, OMP_GET_MAX_THREADS()
    call OMP_SET_NUM_THREADS(NumberOfThreads)
    write(*,*)
    write(*,*)'NumberOfThreads ', NumberOfThreads
    write(*,*)
    write(*,*)'First loop starts'
    StartTime = OMP_GET_WTIME()
    !$OMP PARALLEL DO PRIVATE(i,j)
    do j=1,m
      do i=1,n
        aa(j,i)=cmplx(sin(float(j)),cos(float(i)))
      enddo
    enddo
    !$OMP END PARALLEL DO
    EndTime = OMP_GET_WTIME()
    ElapsTime = EndTime - StartTime
    write(*,*)aa(3,3)
    write(*,*)'First loop ready'
    write(*,*) 'Run time ', ElapsTime
    write(*,*)
    StartTime = OMP_GET_WTIME()
    !$OMP PARALLEL DO PRIVATE(i,j)
    do j=1,m
      do i=1,n
        bb(j,i)=cmplx(sin(float(j)),cos(float(i)))
      enddo
    enddo
    !$OMP END PARALLEL DO
    EndTime = OMP_GET_WTIME()
    ElapsTime = EndTime - StartTime
    write(*,*)'Second loop ready'
    write(*,*) 'Run time ', ElapsTime
    write(*,*)
    StartTime = OMP_GET_WTIME()
    !$OMP PARALLEL DO PRIVATE(i,j)
    do j=1,m
      do i=1,n
        bb(j,i)=aa(j,i)*conjg(bb(j,i))
      enddo
    enddo
    !$OMP END PARALLEL DO
    EndTime = OMP_GET_WTIME()
    ElapsTime = EndTime - StartTime
    write(*,*)'Third loop ready'
    write(*,*) 'Run time ', ElapsTime
  end do
end
Size of matrices (m,n) :8192,8192
Size of matrices in Megabytes : 1073.742
Wipeing array (to commit Virtual Memory)
Run time 2.677683
NumberOfThreads 1
First loop starts
(0.1411200,-0.9899925)
First loop ready
Run time 12.08051
Second loop ready
Run time 12.17020
Third loop ready
Run time 29.14083
NumberOfThreads 2
First loop starts
(0.1411200,-0.9899925)
First loop ready
Run time 6.223104
Second loop ready
Run time 6.243929
Third loop ready
Run time 15.01080
NumberOfThreads 3
First loop starts
(0.1411200,-0.9899925)
First loop ready
Run time 4.348078
Second loop ready
Run time 4.384243
Third loop ready
Run time 10.78805
NumberOfThreads 4
First loop starts
(0.1411200,-0.9899925)
First loop ready
Run time 3.618164
Second loop ready
Run time 3.652418
Third loop ready
Run time 8.962400
----------
Summary            loop 1     loop 2     loop 3
NumberOfThreads 1  12.08051   12.17020   29.14083
NumberOfThreads 2  6.223104   6.243929   15.01080
NumberOfThreads 3  4.348078   4.384243   10.78805
NumberOfThreads 4  3.618164   3.652418   8.962400
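To read the summary as scaling figures, the speedup for each loop is the one-thread time divided by the p-thread time. The sketch below computes that from the timings listed above. The names scaling and speedup are hypothetical, and the numbers are simply copied from the summary.

```fortran
module scaling
  implicit none
contains
  ! Speedup of a run taking tp seconds relative to the one-thread time t1.
  function speedup(t1, tp) result(s)
    real, intent(in) :: t1, tp
    real :: s
    s = t1 / tp
  end function speedup
end module scaling

program speedup_table
  use scaling
  implicit none
  integer, parameter :: nloops = 3, nthreads = 4
  real :: t(nloops, nthreads)
  integer :: k, p

  ! Run times in seconds, one column per thread count (from the summary above)
  t(:,1) = (/ 12.08051, 12.17020, 29.14083 /)
  t(:,2) = (/ 6.223104, 6.243929, 15.01080 /)
  t(:,3) = (/ 4.348078, 4.384243, 10.78805 /)
  t(:,4) = (/ 3.618164, 3.652418, 8.962400 /)

  write(*,'(a)') ' threads    loop 1    loop 2    loop 3'
  do p = 1, nthreads
    write(*,'(i8,3f10.3)') p, (speedup(t(k,1), t(k,p)), k = 1, nloops)
  end do
end program speedup_table
```

On these numbers the four-thread runs come out at roughly 3.3 times the one-thread speed for each loop, so the scaling is good but not quite linear.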
Jim
I tried it. It works, and I can run the program even when it needs more than the physical memory on the computer. It does not crash as it did with my code. The code I used did work in Unix Fortran, and I have used it for many years, but I see that I have to rewrite it as you describe for Intel Fortran. I can also tell you (you may already know it) that I don't need the options /link and /stack:n any longer. I should have given you the code earlier...
Thank you very much
Knut
Jim
Thanks a lot. It may be that I also need to parallelize my application. I have not tried it yet.
Regards Knut
Knut,
You might as well parallelize the code. Virtually any workstation you purchase today will have at least two processing cores, and soon four cores. Same with notebooks.
Jim Dempsey
Jim
I have tried the simple test code which you parallelized for me. It works fine, and I can see that both CPUs on my computer are utilized. Now I can start to work on my SAR processing and simulation code. Thank you very much again.
Knut