Intel® Fortran Compiler

large memory usage with large file

Matt_C_2
Beginner
819 Views

Hello. I have been debugging a curious issue with an application consuming a large amount of memory when writing or reading a large file.

The program below writes about 1.5GB of data.

Task Manager shows memory usage slowly creeping up, but the process associated with the program does not appear to use the memory directly.

Instead, the memory is consumed by a memory-mapped file tied to the data file. (I used Microsoft's RAMMap to find it.)

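    program Console2
    implicit none
    INTEGER I

    ! Direct-access unformatted file. With Intel Fortran's default settings
    ! (no /assume:byterecl), RECL is in 4-byte units, so RECL=4 means 16-byte records.
    OPEN(UNIT=7,STATUS='NEW',ACCESS='DIRECT',RECL=4, FORM='UNFORMATTED',FILE='MYDAT')

    ! 100,000,000 records x 16 bytes is roughly 1.5GB on disk.
    DO I=1,100000000
        WRITE(7,REC=I) 'DATA'
    END DO

    CLOSE(7)
    end program Console2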

In the real program, the files are much larger and thus consume even more memory. This causes the program to crash with insufficient memory most, but not all, of the time.

It also affects both Win32 and x64, though I am predominantly using the latter.

Thanks for any suggestions or help you can provide.

-Matt

11 Replies
Roman1
New Contributor I

I think the problem is related to I/O buffering (/assume:buffered_io). Can you try enabling it and see what happens?

Matt_C_2
Beginner

That seems to improve things. With that option, the test program used only about 500MB of memory, rather than 1.5GB.

I then doubled the file size to 3GB, and with the /assume:buffered_io option memory usage stayed at the same limit of around 500MB.

(Previously, doubling the file size doubled the amount of memory consumed.)

Also, without the option, all of the file details listed in RAMMap are marked as 'Active'.

With the option set, most of the file details are listed as 'Standby', with a few marked as 'Modified', so the 500MB is probably used only because it is available.

I will try it with the original program and data file sizes.

Thanks!

-Matt

jimdempseyatthecove
Honored Contributor III

You might also want to experiment with STREAM writes. DIRECT may imply that you also intend to (immediately) re-read the file.

In a 32-bit environment, 500MB may be too much to deal with.

Jim Dempsey

Roman1
New Contributor I

Matt, I am glad that it improved things for you. However, I have read the Fortran documentation more carefully, and I don't understand why adding /assume:buffered_io made a difference. In your code, you open the file with ACCESS='DIRECT'. From the documentation:

https://software.intel.com/en-us/node/579519

If a file is opened for direct access, I/O buffering is ignored.

Matt_C_2
Beginner

I was previously unaware of the /assume:buffered_io option, but I had tried various combinations of BUFFERED, BLOCKSIZE, and BUFFERCOUNT. They did not seem to affect the real program.

In the real program, as with the test program, without the assume option, all of the details in the memory-mapped file are marked as 'ACTIVE'. With the option, some are marked ACTIVE, some are marked MODIFIED, and some are marked as STANDBY as expected.

Also, with the real program and the assume option, it will use whatever memory is available on this machine (8GB) in x64, but I am able to run many, many additional programs during the run without crashing, again as expected.

I will also take a look at the 'STREAM' option as I was unaware of it. The data files are however re-read, several times, and sometimes immediately.

jimdempseyatthecove
Honored Contributor III

>>The data files are however re-read, several times, and sometimes immediately

If they are re-read sequentially, then you do not need DIRECT...

... but if they are re-read randomly, then you need DIRECT. In that case, memory-mapped files may be useful, provided that the file can be partitioned or partially mapped based on available resources or specific tuning parameters.

Jim Dempsey

IanH
Honored Contributor II

Stream access lets you do random access too.  With unformatted stream you can reposition to any arbitrary file storage unit (byte) in the file.  With formatted stream you can reposition to any position that you previously remembered.
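
As a rough illustration of the unformatted stream case described above, the sketch below writes fixed-size 4-byte records sequentially and then re-reads one of them at an arbitrary position with the POS= specifier. The file name stream_demo.dat, the record layout, and the small record count are assumptions made for the example; it has not been measured against the memory behavior discussed in this thread.

```fortran
program stream_demo
    implicit none
    integer, parameter :: reclen = 4     ! bytes per record (hypothetical layout)
    integer, parameter :: nrec = 1000    ! small count, for illustration only
    integer :: u, i
    character(len=reclen) :: buf

    ! Unformatted stream: no record markers, the file is a plain byte sequence.
    open(newunit=u, file='stream_demo.dat', access='stream', &
         form='unformatted', status='replace', action='readwrite')

    ! Sequential write: each WRITE appends reclen bytes at the current position.
    do i = 1, nrec
        write(buf, '(I4.4)') i           ! record content is its own index
        write(u) buf
    end do

    ! Random read of record 500: POS is 1-based and counted in
    ! file storage units (normally bytes).
    read(u, pos=(500 - 1)*reclen + 1) buf
    print *, 'record 500 contains: ', buf

    close(u, status='delete')
end program stream_demo
```

For files as large as the ones in this thread, the POS= expression would need a 64-bit integer kind (for example int64 from iso_fortran_env), since byte offsets beyond 2GB overflow a default integer.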

Steven_L_Intel1
Employee

It looks to me as if the memory use is by Windows and not under the control of the Fortran I/O system.

Matt_C_2
Beginner

Yes, the memory appears to be used by Windows. But the example program doesn't memory-map the data file. It just writes a lot of data to an old, standard Fortran data file.

Yet, depending on the number of bytes written, the program/data file can consume all available memory and won't give any back until the program terminates!

The /assume:buffered_io option makes a huge difference for both my test and real programs, so I don't think it's just a Windows issue.

BTW, I am using Windows 7 and Intel Composer 2013.

jimdempseyatthecove
Honored Contributor III

These links may be a good place to start digging:

http://www.ghacks.net/2010/07/08/increase-the-filesystem-memory-cache-size-in-windows-7/
(follow instructions to decrease or turn off)

http://www.thewindowsclub.com/enable-disable-disk-write-caching-windows-7-8

https://winntfs.com/2012/12/01/windows-write-caching-part-3-an-overview-for-system-administrators/

If you are lucky, you can find the tuning parameter that sets the size.

Please report back your solution.

Jim Dempsey

Matt_C_2
Beginner

Thanks, Jim. I'll try to take a look. The solution that seems to work, though, is /assume:buffered_io.

According to the docs, it shouldn't. But it makes a major difference.