Intel® Fortran Compiler

I/O substantially slower today than yesterday

nooj
Hi, I have kind of a vague question. I use d-lines in my code for debugging writes, and optionally set or clear the -DD flag when I make. This is using ifort 11.1 20100401 on Mac and Linux.

Summary:
Is there a standard way to benchmark output to stdout via write(*,*)? My program suddenly got over 100x slower when doing writes (versus the same code with no writes), on multiple architectures.
By "suddenly" I mean, "sometime in the past two days, in the midst of a bunch of changes to the code and ifort compiler options." But then, I've been making such changes for years and not seen this behavior.
Background:
Recently, my executables compiled with -DD are UNBELIEVABLY slow! Over a hundred times slower than the non-DD executables! Usually, the difference in execution time is a factor of 2 or 3. The d-lines are always write(*,*)s. The I/O is so slow that the program is sitting at <5% cpu usage, and constantly reports "sleeping, wait_pipe" when I check the process status.
I pipe the output to disk: a.out > a.outfile &
It IS writing to disk: I can watch the file steadily increase in size as the program runs/sleeps.
I'm not doing all that many writes: about 25K lines of text, 76K words, and 850K chars. In 2011, it should not take thirty minutes to write 850K bytes to disk, no matter what language or piping scheme I'm using. And last week it didn't! So I don't know what's going on. (My phone can download a 1M file faster than this!)
Writes are as "write(*,*)", with 0-3 real*8 conversions (usually exactly one) and hardcoded string output, as in:
write(*,*) "P = ", P ! real*8 P
write(*,*) "vec1 = ", vec1(:) ! real*8 vec1(3)
The code always runs correctly with no errors; it's just sloooooooow.
- Nooj
1 Solution
jimdempseyatthecove
What happens when you run your app without redirection?
IOW output to the console.

Is this a multi-threaded application?
IOW does the slight delay in issuing the write(*,*) expose a multi-threaded programming problem?

Can you recall what changes were made?
IOW run a difference program and see if anything stands out.

Jim Dempsey


6 Replies
mecej4
Did you recently, for some reason, switch to a remote working directory, or are the writes redirected to a remote directory?

Can you disconnect from the network and reproduce the slow run speeds?

Changes to your virus checker?
nooj
> What happens when you run your app without redirection?
> IOW output to the console.
This was it. I recently changed the output processing script and accidentally slowed it down.
Surprising that a.out, if you run it via
./my.exe | tee my.outfile | my.output_process_script
will run very slowly if my.output_process_script is very slow. And by "surprising," I mean, "not surprising in retrospect." I guess the shell has a finite buffer for passing output from one program to the next. And the write(*,*) will only return if the shell says the output was successfully received.
- Nooj
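The stall described above can be reproduced without the Fortran program. What follows is only a minimal sketch, assuming a POSIX shell: `produce` and `slow_consumer` are hypothetical stand-ins for my.exe and my.output_process_script, and the 3-second delay is arbitrary. On Linux and macOS the finite buffer belongs to the kernel's pipe rather than to the shell; once it is full, the writer blocks until the reader drains some of it.

```shell
#!/bin/sh
# Hypothetical stand-in for the Fortran program: prints $1 short lines,
# then stores the whole seconds it spent doing so in the file named by $2.
produce() {
    start=$(date +%s)
    awk -v n="$1" 'BEGIN { for (i = 0; i < n; i++) print "P = ", i * 1.5 }'
    end=$(date +%s)
    echo $((end - start)) > "$2"
}

# Hypothetical slow post-processing script: ignores its input for 3 seconds,
# then drains it.
slow_consumer() {
    sleep 3
    cat > /dev/null
}

timing=$(mktemp)

# 25,000 lines is far more than a typical pipe buffer holds, so the
# producer's writes block until the consumer starts reading.
produce 25000 "$timing" | slow_consumer
piped_seconds=$(cat "$timing")

# Same producer with nothing slow downstream.
produce 25000 "$timing" > /dev/null
direct_seconds=$(cat "$timing")

rm -f "$timing"
echo "producer time with slow consumer:   ${piped_seconds}s"
echo "producer time writing to /dev/null: ${direct_seconds}s"
```

The producer itself does the same work in both runs; only the speed of whatever reads the pipe differs.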
jimdempseyatthecove
There are loosely two implementations of "|" (pipe):

MS-DOS-like where stdout of the complete run of the application is written to a temp file, then followed by a run of the receiver app with stdin redirected to the temp file [same for subsequent pipes]

And a "concurrent"-like pipe whereby the data flows through the pipe during the application run. Perhaps line-by-line, "print"-by-"print", or buffer-by-buffer as per implementation.

Apparently your system is doing the latter. Consider using two lines, one to produce the my.outfile, followed by your script using my.outfile
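The two-line approach might look like the sketch below. In the thread's terms it is `./my.exe > my.outfile` followed by `my.output_process_script < my.outfile`; here `produce` and `process` are hypothetical stand-ins so that the example runs on its own, and the temporary file plays the role of my.outfile.

```shell
#!/bin/sh
# Hypothetical stand-in for ./my.exe: prints 25,000 short lines.
produce() {
    awk 'BEGIN { for (i = 0; i < 25000; i++) print "P = ", i * 1.5 }'
}

# Hypothetical stand-in for the slow script: pauses, then counts lines.
process() {
    sleep 1
    wc -l | tr -d ' '
}

outfile=$(mktemp)

# Step 1: run the program on its own. No pipe is involved, so its writes
# are never held up by the script.
produce > "$outfile"

# Step 2: run the script afterwards on the finished file.
line_count=$(process < "$outfile")

rm -f "$outfile"
echo "script processed ${line_count} lines"
```

The trade-off is that the script sees nothing until the run has finished, and the full output has to fit on disk.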

Jim Dempsey
nooj
>And a "concurrent"-like pipe whereby
> the data flows through the pipe
> during the application run.
> Perhaps line-by-line, "print"-by-"print",
> or buffer-by-buffer as per implementation.
This one happens to be buffer-by-buffer: for very short test runs, my main code would finish nearly instantly, and the script would finish a few seconds later. I didn't understand the behavior then, but now I do.
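The short-run behavior can be seen in a small sketch, again assuming a POSIX shell with a hypothetical 3-second script on the reading end. The whole output fits in the pipe buffer, so the producer is done almost at once while the pipeline as a whole still takes as long as the script does.

```shell
#!/bin/sh
done_file=$(mktemp)
t0=$(date +%s)

# Left side: a very short "run" that prints three lines and then records
# how many whole seconds had passed when it finished writing.
# Right side: hypothetical slow script that waits 3 seconds before reading.
lines=$(
    {
        printf 'P = 1.0\nvec1 = 1.0 2.0 3.0\ndone\n'
        echo $(( $(date +%s) - t0 )) > "$done_file"
    } | {
        sleep 3
        wc -l | tr -d ' '
    }
)

total_seconds=$(( $(date +%s) - t0 ))
producer_seconds=$(cat "$done_file")
rm -f "$done_file"

echo "producer finished after ${producer_seconds}s"
echo "pipeline finished after ${total_seconds}s, script read ${lines} lines"
```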
>Consider using two lines,
> one to produce the my.outfile,
> followed by your script using my.outfile
Yes, exactly. I've left a note in the documentation in case the problem comes up again later, for me or for someone else. For now, the script appears to be fast enough to keep up.
Thanks for the suggestions, all.
- Nooj
nooj
>Did you recently switch to a remote working directory
> or are the writes redirected to a remote directory?
Good point, network latency is a likely culprit. My code is stored and compiled on network-mounted directories. But my run script goes out of its way to do all its I/O in sysadmin-designated local access work directories.
-n
