Intel® Fortran Compiler

Which is faster to write: a binary file or an ASCII file?

vahid_s_
Novice
3,641 Views

Hi,

I performed a very simple test and the results surprised me. First I wrote a very large number of values (double-precision numbers) to a text file. Then I wrote the same data to a binary file. I measured the process time for each case and, surprisingly, writing to the text file was faster. Is that correct, or am I doing something wrong?

Thanks!

Binary file: 

[fortran]

CALL CPU_TIME(Time1)
OPEN( 1, FILE='Cyrus_In.bin', STATUS='UNKNOWN', ACCESS='STREAM')  ! I want to use stream access.

DO 200 I=1,NDOF
WRITE (1) (Number(I))
200 CONTINUE

CLOSE(1)
CALL CPU_TIME(Time2)
Time3 = Time2 - Time1

[/fortran]

Text file:

[fortran]

CALL CPU_TIME(Time1)

OPEN (1,FILE='Cyrus_In.txt',STATUS='UNKNOWN')  
DO 270 I=1,NDOF 
WRITE (1,2760) (Number(I))      ! 2760  FORMAT(E25.15)
270 CONTINUE

CLOSE(1)

CALL CPU_TIME(Time2) 
Time3 = Time2 - Time1

[/fortran]



18 Replies
mecej4
Honored Contributor III
"I am doing something wrong?"
Yes: (i) do not use stream access unless you need it; (ii) write as much in a single WRITE statement as you can. In the example above, replace the DO loop with WRITE (1) Number(1:NDOF).

vahid_s_
Novice

mecej4, thanks for your quick and useful reply. I replaced the DO loop with a one-line WRITE statement and it reduced the processing time. But what if I have a more complicated DO loop like this:

[fortran]

DO 275 I=1,N1
DO 274 J=1,N2
IF (S(I,J).NE.0) THEN
WRITE (1) I
WRITE (1) J+I-1
WRITE (1) (S(I,J))
ENDIF

274 CONTINUE 

275 CONTINUE

[/fortran]

 How can I make this more efficient? 

John_Campbell
New Contributor II

You could try the following changes for the binary file. I'd expect that, unless NDOF is very large, Time3 = 0.

[fortran]
 CALL CPU_TIME (Time1)
!
 OPEN ( UNIT=11, FILE='Cyrus_In.bin', STATUS='UNKNOWN',    &
        FORM='UNFORMATTED', ACCESS='SEQUENTIAL', IOSTAT=iostat)
!
 WRITE (11) NDOF
 WRITE (11) Number(1:NDOF)
 CLOSE (11)
!
 CALL CPU_TIME (Time2)
 Time3 = Time2 - Time1
[/fortran]

John_Campbell
New Contributor II

Could I ask a question that has puzzled me for a long time:
This forum is "intel-visual-fortran-compiler-for-windows"; that's WINDOWS.

Why doesn't it support Windows file formats when using cut and paste?

Some of us are Windows users

TimP
Honored Contributor III

Does it really matter how much CPU time is consumed by these writes?  What about elapsed time, which might be much greater?  What device are you writing to? Did you use one of the methods to set buffered_io, as you would do for best performance?

John_Campbell
New Contributor II

Tim,

With Windows 7, default buffered I/O has been significantly improved in comparison to XP.
Do the available methods to set buffered_io still provide much of a performance improvement?

I agree with your comment on CPU time. SYSTEM_CLOCK would be a more relevant choice, although the precision might be a problem. Elapse_time might help.

John
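
The difference between CPU time and elapsed time discussed above can be seen by measuring both around the same write. The following is only a minimal sketch: the array size, the fill values, and the file name timing_test.bin are assumptions made for illustration, and 64-bit arguments are passed to SYSTEM_CLOCK in the hope of getting a finer clock resolution (this depends on the compiler).

[fortran]
PROGRAM time_write
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)
  INTEGER, PARAMETER :: i8 = SELECTED_INT_KIND(18)
  INTEGER, PARAMETER :: NDOF = 1000000          ! hypothetical problem size
  REAL(dp), ALLOCATABLE :: Number(:)
  INTEGER(i8) :: count0, count1, count_rate
  REAL(dp) :: cpu0, cpu1, elapsed
  INTEGER :: i

  ALLOCATE (Number(NDOF))
  DO i = 1, NDOF
     Number(i) = REAL(i, dp)                    ! hypothetical test data
  END DO

  CALL SYSTEM_CLOCK (count0, count_rate)        ! wall-clock ticks and ticks per second
  CALL CPU_TIME (cpu0)                          ! processor time used by this program

  OPEN (UNIT=11, FILE='timing_test.bin', STATUS='REPLACE', &
        FORM='UNFORMATTED', ACCESS='SEQUENTIAL')
  WRITE (11) NDOF
  WRITE (11) Number
  CLOSE (11)

  CALL CPU_TIME (cpu1)
  CALL SYSTEM_CLOCK (count1)

  ! Elapsed time = tick difference divided by ticks per second.
  elapsed = REAL(count1 - count0, dp) / REAL(count_rate, dp)
  PRINT '(A,F10.4,A)', 'Elapsed time: ', elapsed, ' s'
  PRINT '(A,F10.4,A)', 'CPU time:     ', cpu1 - cpu0, ' s'
END PROGRAM time_write
[/fortran]

Elapsed time includes time spent waiting on the operating system and the device, which CPU time does not count, so the two figures can differ considerably for I/O. Note that the operating system may still be holding data in its own cache when CLOSE returns, so even the elapsed figure is only an approximation of the time to reach the physical disk.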

Paul_Curtis
Valued Contributor I

It's much faster to ditch the fortran and use the WinAPI functions directly. As the example makes clear, there is zero intervening code or formatting or anything other than a block transfer directly from memory:

[fortran]
IF (WriteFile (ihandl,           &   ! file handle
               loc_pointer,      &   ! address of data
               nbytes,           &   ! byte count to write
               LOC(nact),        &   ! actual bytes written
               NULL_OVERLAPPED) == 0) THEN
    ! Error writing file
END IF
[/fortran]

vahid_s_
Novice

Thanks for comments

vahid_s_
Novice

"Does it really matter how much CPU time is consumed by these writes?  What about elapsed time"

TimP, this is the first time I am trying to measure process time and I have no experience with it. So you mean CPU time is not the process time and I have to use elapsed time? If the CPU time for the ASCII format is less than for the binary format, then I think the elapsed time must be less as well. Am I right? Can you give me an example for elapsed time?

vahid_s_
Novice

"Did you use one of the methods to set buffered_io, as you would do for best performance?"

How can I do that? Please give me an example. Thanks!

andrew_4619
Honored Contributor III

Paul Curtis wrote:
It's much faster to ditch the fortran and use the WinAPI functions directly....

Interesting; is it much faster? I guess I would also (having read a little) need ReadFile, CreateFile and CloseHandle. Do you have the correct Fortran interfaces for these routines? I checked the standard includes and didn't seem to find them.

jimdempseyatthecove
Honored Contributor III

Before you go the distance to implement WinAPI, I suggest you implement the other FORTRAN suggestions first. Make some test runs, then determine if you should change your focus from I/O time to compute time improvements. Your focus should be on making the overall program run in the shortest amount of time.

Jim Dempsey

TimP
Honored Contributor III

"With Windows 7, default buffered I/O has been significantly improved, in comparison to XP.
Do the available methods to set buffered_io still provide much of a performance improvement ?"

ifort flushes each record by default when performing record-oriented writes.  You will not get the advantage of Win7 buffering unless you set buffered_io.  So it's possible these options could make more difference than before Win7.  There are several such options, some applying to all file units (not to stdin/stdout), such as the compile option /assume:buffered_io or the equivalent environment variable, as well as the "buffered" keyword for OPEN and equivalent environment variables.  When these are set, data goes out to the physical device as the buffers fill, or when the unit is closed or flushed.
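
As a rough illustration of the "buffered" keyword for OPEN mentioned above, the sketch below requests buffering for a single unit. BUFFERED= is an Intel Fortran extension to OPEN and is not standard Fortran, so other compilers may reject it; the array size, data, and file name buffered_test.bin are assumptions made for illustration.

[fortran]
PROGRAM buffered_write
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)
  INTEGER, PARAMETER :: NDOF = 100000           ! hypothetical problem size
  REAL(dp), ALLOCATABLE :: Number(:)
  INTEGER :: i, ios

  ALLOCATE (Number(NDOF))
  Number = 1.0_dp                               ! hypothetical test data

  ! BUFFERED='YES' is an Intel Fortran extension (not standard Fortran).
  OPEN (UNIT=11, FILE='buffered_test.bin', STATUS='REPLACE', &
        FORM='UNFORMATTED', ACCESS='SEQUENTIAL', BUFFERED='YES', IOSTAT=ios)
  IF (ios /= 0) STOP 'OPEN failed'

  ! Many small records: the case where flushing after every record costs the most.
  DO i = 1, NDOF
     WRITE (11) Number(i)
  END DO

  CLOSE (11)                                    ! any data still in the buffer goes out here
END PROGRAM buffered_write
[/fortran]

Compiling with /assume:buffered_io instead should give similar behaviour for all units without changing the OPEN statements. Whether either setting helps on a given system is best checked by timing the program both ways.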

John_Campbell
New Contributor II

Paul,

I would be interested to see the comparison tests to support your claims. I did tests about 2 years ago of a range of options for direct access file structures. The file sizes were from 0.1 GB to 8 GB, with records of about 64 KB.
The elapsed time results showed no significant difference between the I/O library alternatives, with the big changes coming when changing from XP to Win 7 OS, increased installed memory and also from HDD to SSD.
I found that staying with standard conforming Fortran was the best approach.
I think that one of the reasons is that, given the optimisation that is available in most Fortran I/O, the dominant time usage comes from the O/S management of disk transfers and buffering, which is outside of the fortran libraries. 
I'm sure that the selection of file and record size could change the result.

I'd also expect that ifort's buffered I/O options are less effective now with Win 7 and Win 8 than they might have been with Win NT or XP, although I've never tested these alternatives. What has changed in the last 20 years is that CPU rates have increased at a faster rate than disk rates, although SSD has changed this somewhat.

If you have an example that shows differently, I would be interested to see it.

John

Paul_Curtis
Valued Contributor I
John, I have not bothered with test scenarios comparing standard fortran file i/o with the WinAPI version, although that would not be difficult. I started using this approach a long time ago and have found that it is not only faster, but provides much more versatility. REWIND and BACKSPACE made sense with magtape (and I'm way old enough to have been programming back then), but have become more than a bit antique (risible, in fact).

When IVF compiles for Windows, fortran statements relating to i/o and memory allocation will be realized as sets of WinAPI calls; that's as atomic as one can get in Windows programming. It's not too difficult to see how fortran's formatted and record-oriented syntax would be resolved into fundamental API calls. But skipping all that and using the API calls directly cannot fail to be more efficient than having the compiler do the job. And this approach enables fortran i/o as a direct block-move of memory to/from any file, port or pipe structure which has a Windows handle, and is completely independent of the format in which data is represented in that memory block.

A set of sample routines illustrating WinAPI file i/o is attached.

vahid_s_
Novice

The test I performed was on Windows 7. I did the same test on my XP computer and got completely different results! On XP the binary file was faster than the text file, while on Windows 7 the text file was faster.

I read the comments, but since I am not specialized in computers I cannot fully understand them. Is there any simple explanation for that? Is there any simple way to make binary faster on Windows 7?

I am now using elapsed time and have also made the suggested changes.

Steven_L_Intel1
Employee

Look at the BUFFERED option for OPEN. I will comment that writing lots of small unformatted records is going to be less efficient than writing fewer, larger records.
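
One way to apply the "fewer, larger records" advice to the nested loop with conditional writes posted earlier in the thread is to gather the nonzero entries in memory first and then write them in a single statement. This is only a sketch under stated assumptions: the matrix sizes, the test values, the helper arrays rows, cols and vals, and the file name sparse_test.bin are all hypothetical. It also changes the file layout (a count, then all row indices, all column indices, all values, instead of interleaved triples), so the reading code would have to be changed to match.

[fortran]
PROGRAM batch_nonzeros
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = KIND(1.0D0)
  INTEGER, PARAMETER :: N1 = 4, N2 = 3          ! hypothetical matrix sizes
  REAL(dp) :: S(N1,N2)
  INTEGER, ALLOCATABLE :: rows(:), cols(:)      ! hypothetical helper arrays
  REAL(dp), ALLOCATABLE :: vals(:)
  INTEGER :: i, j, n, nnz

  S = 0.0_dp                                    ! hypothetical test data
  S(1,1) = 2.0_dp
  S(3,2) = -1.5_dp
  S(4,3) = 7.0_dp

  nnz = COUNT(S /= 0.0_dp)
  ALLOCATE (rows(nnz), cols(nnz), vals(nnz))

  ! Gather the nonzero entries in the same I-then-J order as the original loops.
  n = 0
  DO i = 1, N1
     DO j = 1, N2
        IF (S(i,j) /= 0.0_dp) THEN
           n = n + 1
           rows(n) = i
           cols(n) = j + i - 1
           vals(n) = S(i,j)
        END IF
     END DO
  END DO

  OPEN (UNIT=11, FILE='sparse_test.bin', STATUS='REPLACE', &
        FORM='UNFORMATTED', ACCESS='SEQUENTIAL')
  WRITE (11) nnz                                ! count first, so a reader can allocate
  WRITE (11) rows, cols, vals                   ! one large record instead of 3*nnz small ones
  CLOSE (11)

  PRINT '(A,I0)', 'Nonzero entries written: ', nnz
END PROGRAM batch_nonzeros
[/fortran]

The trade-off is extra memory for the three temporary arrays in exchange for far fewer WRITE statements. For very large matrices the entries could be gathered and written in fixed-size chunks instead.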

John_Campbell
New Contributor II

Vahid,

This is the first case where I have heard that XP performs better than Windows 7 for fortran file I/O. This is not my experience.

John
