I have been comparing the performance of g77 and ifort on an
AMD Athlon MP2800+. ifort generally gives slightly faster code,
but is incredibly slow on I/O! To test this further
I wrote the following little Fortran code:
      PROGRAM IOTST
C     Time a list-directed (formatted) write of an N x N REAL*8 array.
C     Unit 1 is not opened explicitly, so the output goes to fort.1.
      IMPLICIT REAL*8(A-H,O-Z)
      INTEGER*4 TIME
      REAL*4 ETIME, TARRAY(2)
      PARAMETER(N=10000)
      DIMENSION A(N,N)
      CPU1 = ETIME(TARRAY)
      WALL1 = TIME()
      WRITE(1,*) A
      CPU2 = ETIME(TARRAY)
      WALL2 = TIME()
      CPUT = CPU2-CPU1
      WALLT = WALL2-WALL1
      WRITE(6,*) 'CPU: ',CPUT, 'Wall: ',WALLT
      END
and then I do
compute-0-1.local 66>g77 -O3 -ffast-math -fautomatic -fno-f2c -fno-globals -Wno-globals io.f
compute-0-1.local 67>time a.out
CPU: 16.5900002Wall: 17.
13.820u 4.030s 0:18.81 94.8% 0+0k 0+0io 137pf+0w
compute-0-1.local 68>ifort -O3 -ip -w -tpp6 io.f
compute-0-1.local 69>time a.out
CPU: 279.440000305176 Wall: 326.000000000000
61.430u 218.030s 5:26.16 85.6% 0+0k 0+0io 178pf+0w
There is clearly a dramatic difference in timings! What is the origin of this, and can it be fixed? How can I investigate this further?
Best regards,
Trond Saue
		
		
	
	
	
I found part of the answer to the problem that I posted. The code
writes formatted files, and comparing file sizes shows
that the file produced by g77 is 0.4GB whereas the file from ifort
is 2.3GB. The difference arises because ifort writes a large number of decimals by default, whereas g77 writes only 0.0. Can this be modified?
I next tested performance for unformatted output using the program:
      PROGRAM IOTST
C     Time an unformatted write of the same N x N REAL*8 array.
      IMPLICIT REAL*8(A-H,O-Z)
      INTEGER*4 TIME
      REAL*4 ETIME, TARRAY(2)
      PARAMETER(N=10000)
      DIMENSION A(N,N)
      OPEN(1,STATUS='UNKNOWN',FORM='UNFORMATTED',FILE='FILE')
      CPU1 = ETIME(TARRAY)
      WALL1 = TIME()
      WRITE(1) A
      CPU2 = ETIME(TARRAY)
      WALL2 = TIME()
      CPUT = CPU2-CPU1
      WALLT = WALL2-WALL1
      WRITE(6,*) 'CPU: ',CPUT, 'Wall: ',WALLT
      END
Now I get
compute-0-1.local 41>g77 -O3 -ffast-math -fautomatic -fno-f2c -fno-globals -Wno-globals io2.f
compute-0-1.local 42>time a.out
CPU: 5.21999979Wall: 8.
0.000u 5.240s 0:08.53 61.4% 0+0k 0+0io 137pf+0w
compute-0-1.local 45>ll FILE
-rw-rw-r-- 1 saue saue 800000008 Jul 7 15:27 FILE
compute-0-1.local 46>ifort -O3 -ip -w -tpp6 io2.f
compute-0-1.local 47>time a.out
CPU: 2.91000000000000 Wall: 8.00000000000000
0.000u 2.930s 0:08.20 35.7% 0+0k 0+0io 180pf+0w
compute-0-1.local 48>ll FILE
-rw-rw-r-- 1 saue saue 800000008 Jul 7 15:28 FILE
and the performance of ifort is quite satisfactory!
		
		
	
	
	
You are using unformatted I/O, which does not have the concept of "number of decimals".
I am confused by your second post, as the correct size of the file should be 0.8GB. Neither 0.4GB nor 2.3GB is correct, and this should not change due to the compiler.
Be aware that Linux caches file writes and the program may complete before all the writing is actually done.
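The 0.8GB figure follows from the array dimensions. The eight extra bytes in the 800000008-byte listings are consistent with one 4-byte record-length marker before and one after the single record; that record layout is an assumption here, not something stated in the thread.

```latex
\begin{align*}
N^2 \times 8\ \text{bytes} &= 10^4 \times 10^4 \times 8 = 800\,000\,000\ \text{bytes} \\
800\,000\,000 + 2 \times 4 &= 800\,000\,008\ \text{bytes}
\end{align*}
```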
					
				
			
			
				
			
			
			
			
			
			
			
The difference is that the array is uninitialized. When you ran with g77, you were apparently writing zeroes. With ifort, the data was not zero.
If you initialized the data, the file sizes should be comparable.
As you found, using unformatted I/O is a better approach for large volumes of data.
					
				
			
			
				
			
			
			
			
			
			
			
I tried your suggestion, that is, initializing the matrix, using the code:
      PROGRAM IOTST
C     Same list-directed write as before, but with A set to zero first.
      IMPLICIT REAL*8(A-H,O-Z)
      PARAMETER (D0=0.0D0)
      INTEGER*4 TIME
      REAL*4 ETIME, TARRAY(2)
      PARAMETER(N=10000)
      DIMENSION A(N,N)
C     Initialize in column-major order (first index runs fastest).
      DO J = 1,N
        DO I = 1,N
          A(I,J)=D0
        ENDDO
      ENDDO
      CPU1 = ETIME(TARRAY)
      WALL1 = TIME()
      WRITE(1,*) A
      CPU2 = ETIME(TARRAY)
      WALL2 = TIME()
      CPUT = CPU2-CPU1
      WALLT = WALL2-WALL1
      WRITE(6,*) 'CPU: ',CPUT, 'Wall: ',WALLT
      END
However, I get the same result as before:
compute-0-11.local 26>g77 -O3 -ffast-math -fautomatic -fno-f2c -fno-globals -Wno-globals io.f
compute-0-11.local 27>time a.out
CPU: 17.2200003Wall: 18.
15.600u 4.200s 0:20.38 97.1% 0+0k 0+0io 136pf+0w
compute-0-11.local 28>ll fort.1
-rw-rw-r-- 1 saue saue 405263158 Jul 8 13:01 fort.1
compute-0-11.local 29>ifort -O3 -ip -w -tpp6 io.f
compute-0-11.local 30>time a.out
CPU: 245.780001306534 Wall: 295.000000000000
63.720u 184.500s 4:57.09 83.5% 0+0k 0+0io 177pf+0w
compute-0-11.local 31>ll fort.1
-rw-rw-r-- 1 saue saue 2433333334 Jul 8 13:07 fort.1
g77 writes 0.0, ifort writes 0.000000000000000E+000
All the best,
Trond Saue
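The two fort.1 sizes can be reproduced exactly from a fixed line layout. The layouts used below are inferred from the byte counts, not taken from either compiler's documentation: 19 values of 4 characters per line for g77, and 3 values of 24 characters per line for ifort, with one newline per line and a shorter final line holding the leftover values.

```latex
\begin{align*}
\text{g77:}\quad & 10^8 = 5\,263\,157 \times 19 + 17 \\
& 5\,263\,157 \times (19 \times 4 + 1) + (17 \times 4 + 1) = 405\,263\,089 + 69 = 405\,263\,158 \\[4pt]
\text{ifort:}\quad & 10^8 = 33\,333\,333 \times 3 + 1 \\
& 33\,333\,333 \times (3 \times 24 + 1) + (1 \times 24 + 1) = 2\,433\,333\,309 + 25 = 2\,433\,333\,334
\end{align*}
```

Under this reading ifort writes about six times as many bytes per value as g77, which accounts for the file-size ratio, though not necessarily for the full difference in run time.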
		
		
	
	
	
Interesting. g77 is violating the Fortran 77 standard, which requires use of an E format for a value of zero. (The Fortran 95 standard's wording in this area is unchanged.)
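On the earlier question of whether the number of decimals can be modified: list-directed output (`*`) leaves the field width to the compiler, but an explicit format puts it under the program's control. The following is a minimal sketch, not a tested fix for the timings above; the program name FMTTST, the file name FILE.TXT, the smaller N, and the choice of 5E12.4 are all illustrative assumptions.

```fortran
      PROGRAM FMTTST
C     Sketch: write a REAL*8 array with an explicit edit descriptor
C     instead of list-directed output, so that the field width is
C     fixed by the program rather than by compiler defaults.
C     N, the file name and the format are illustrative choices.
      IMPLICIT REAL*8(A-H,O-Z)
      PARAMETER(N=1000)
      DIMENSION A(N,N)
      DO J = 1,N
        DO I = 1,N
          A(I,J) = 0.0D0
        ENDDO
      ENDDO
      OPEN(1,STATUS='UNKNOWN',FORM='FORMATTED',FILE='FILE.TXT')
C     Five values per line, 12 characters each, 4 decimals.
C     The format is reused until the whole array has been written.
      WRITE(1,'(5E12.4)') A
      CLOSE(1)
      END
```

With this format each value occupies 12 characters with either compiler, half of the 24 characters per value seen in the ifort list-directed file. How many decimals are acceptable depends on what the file is later read for.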
					
				
			
			
				
			
			
			
			
			
			
			
		 
					
				
				
			
		
					
