Hi,
I'm trying to create a custom CSHIFT function in order to easily maintain some strange boundary conditions. Playing around I made the following test program:
[fortran]PROGRAM cshifttest
USE IFPORT
IMPLICIT NONE
INTEGER, PARAMETER :: numtests = 10000
DOUBLE PRECISION, DIMENSION(300,300) :: a, b
REAL :: starttime, endtime
INTEGER :: i, j
777 FORMAT("=== ",A," took ",F16.8," seconds to run (",E16.8," seconds per shift)")
call srand(34533)
PRINT*,"Initializing"
DO i=1,SIZE(a,1)
DO j=1,SIZE(a,2)
a(i,j) = rand() * 100
END DO
END DO
PRINT*,"Runing cshift"
CALL cpu_time(starttime)
DO i=1,INT(numtests/2)
b = cshift(a, i, 1)
END DO
DO i=1,INT(numtests/2)
b = cshift(a, i, 2)
END DO
CALL cpu_time(endtime)
WRITE(*,777) "cshift", (endtime-starttime), (endtime-starttime)/numtests
PRINT*,"Running mshift"
CALL cpu_time(starttime)
DO i=1,INT(numtests/2)
b = mshift(a, i, 1)
END DO
DO i=1,INT(numtests/2)
b = mshift(a, i, 2)
END DO
CALL cpu_time(endtime)
WRITE(*,777) "mshift", (endtime-starttime), (endtime-starttime)/numtests
CONTAINS
FUNCTION mshift(array, shift, axis) result(shifted)
IMPLICIT NONE
DOUBLE PRECISION, DIMENSION(:,:) :: array
DOUBLE PRECISION, DIMENSION(SIZE(array,1),SIZE(array,2)) :: shifted
INTEGER :: shift, axis
shifted = CSHIFT(array, shift, axis)
shifted(1,:) = array(2,:)
shifted(SIZE(array,1),:) = array(SIZE(array,1)-1,:)
shifted(:,1) = array(:,2)
shifted(:,SIZE(array,2)) = array(:,SIZE(array,2)-1)
return
END FUNCTION
END PROGRAM[/fortran]
When I run this with ifort (compiled with `ifort -O3 cshifttest.f90`) I get the following output:
[bash]$ ifort -O3 cshifttest.f90 && ./a.out
Initializing
Runing cshift
=== cshift took 0.3439480 seconds to run ( 0.3439480E-04 seconds per shift)
Running mshift
=== mshift took 39.9719238 seconds to run ( 0.3997192E-02 seconds per shift)
[/bash]
On the other hand, gfortran (compiled with `gfortran -O3 cshifttest.f90`; note that you must comment out the USE statement on line 2) gives:
[bash]$ gfortran -O3 cshifttest.f90 && ./a.out
Initializing
Runing cshift
=== cshift took 3.08553004 seconds to run ( 0.30855299E-03 seconds per shift)
Running mshift
=== mshift took 3.12652516 seconds to run ( 0.31265253E-03 seconds per shift)
[/bash]
I have the following versions installed:
[bash]$ ifort --version
ifort (IFORT) 11.0 20090318
Copyright (C) 1985-2009 Intel Corporation. All rights reserved.
$ gfortran --version
GNU Fortran (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44)
Copyright (C) 2007 Free Software Foundation, Inc.
[/bash]
Furthermore, some information about the machine:
[bash]$ uname -a
Linux xxxxxxxxxx 2.6.18-128.1.14.el5 #1 SMP Wed Jun 17 06:38:05 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
$ free -m
total used free shared buffers cached
Mem: 32168 10655 21513 0 848 8436
-/+ buffers/cache: 1370 30798
Swap: 16002 82 15920
$ head -n 22 /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel Xeon CPU 5160 @ 3.00GHz
stepping : 11
cpu MHz : 2992.509
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc
pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips : 5989.08
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual[/bash]
Does anyone have an explanation for why the timings are so vastly different? I can understand why the intrinsic CSHIFT is faster with ifort (simply because of all the optimizations and the fact that this machine has an Intel chip), but I don't get why MSHIFT is SO much slower with ifort. Can anyone recommend ways of implementing MSHIFT in a more optimized way? Note that the boundary conditions set in the current implementation are simply for testing; in the actual code they are bound to change and be much more intricate.
2 Replies
If you want run-time tests to discover when a pair of plain vector moves will do the job, you might as well write those in to your function, rather than depending on CSHIFT being implemented that way.
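As a rough illustration of that suggestion, here is a minimal sketch of a circular shift written as two plain array-section copies per call. The module name `mshift_sketch` and the subroutine name `shift2d` are made up for this example and do not come from the original post. It is written as a subroutine with an `INTENT(OUT)` argument on the assumption that this may spare the compiler an array temporary for a function result; whether it actually runs faster with a given compiler has to be measured. It assumes a non-empty array and that `shifted` has the same shape as `array`.

```fortran
MODULE mshift_sketch
  IMPLICIT NONE
CONTAINS
  ! Circular shift of a 2-D array along axis 1 or 2, done with two
  ! array-section copies instead of a call to the CSHIFT intrinsic.
  ! Intended to give the same result as CSHIFT(array, shift, axis).
  ! Assumption: SIZE(array, axis) > 0 and shifted has the shape of array.
  SUBROUTINE shift2d(array, shift, axis, shifted)
    DOUBLE PRECISION, INTENT(IN)  :: array(:,:)
    INTEGER,          INTENT(IN)  :: shift, axis
    DOUBLE PRECISION, INTENT(OUT) :: shifted(:,:)
    INTEGER :: n, s

    n = SIZE(array, axis)
    s = MODULO(shift, n)   ! reduce to 0 <= s < n, also for negative shifts
    IF (axis == 1) THEN
      shifted(1:n-s, :)   = array(s+1:n, :)   ! tail of array moves to the front
      shifted(n-s+1:n, :) = array(1:s, :)     ! head of array wraps to the back
    ELSE
      shifted(:, 1:n-s)   = array(:, s+1:n)
      shifted(:, n-s+1:n) = array(:, 1:s)
    END IF
  END SUBROUTINE shift2d
END MODULE mshift_sketch
```

A caller would `USE mshift_sketch`, replace `b = mshift(a, i, 1)` with `CALL shift2d(a, i, 1, b)`, and then overwrite the boundary rows and columns of `b` with whatever boundary conditions the real code needs.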
You are looking at measurements that are not meaningful. In fact, IFort 12.0 on a 3 GHz C2D E8400 gives:
[bash]Initializing
Runing cshift
=== cshift took 0.00000000 seconds to run ( 0.00000000E+00 seconds per shift)
Running mshift
=== mshift took 5.84375000 seconds to run ( 0.58437500E-03 seconds per shift)
[/bash]
This indicates that the optimization did away with the "cshift run". The same could have been done with the "mshift run" as well if the compiler could figure out that the function is PURE.
To get meaningful measurements, it would be necessary to do something more with the returned values from the function calls.
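One way to "do something more" with the results is to fold a value from every shifted array into a running checksum and print it at the end. The sketch below does that for the intrinsic CSHIFT. The names `bench_sketch` and `time_cshift` are hypothetical, and sampling a single element per iteration is only one possible choice. This is not a guarantee against dead-code elimination, since an aggressive optimizer could in principle compute only the sampled element.

```fortran
MODULE bench_sketch
  IMPLICIT NONE
CONTAINS
  ! Time numtests calls to CSHIFT along the given axis. One element of each
  ! result is added to checksum so that every call feeds an observable value.
  SUBROUTINE time_cshift(a, axis, numtests, seconds, checksum)
    DOUBLE PRECISION, INTENT(IN)  :: a(:,:)
    INTEGER,          INTENT(IN)  :: axis, numtests
    REAL,             INTENT(OUT) :: seconds
    DOUBLE PRECISION, INTENT(OUT) :: checksum
    DOUBLE PRECISION :: b(SIZE(a,1), SIZE(a,2))
    REAL    :: t0, t1
    INTEGER :: i, row, col

    checksum = 0.0d0
    CALL CPU_TIME(t0)
    DO i = 1, numtests
      b = CSHIFT(a, i, axis)
      ! Sample a position that changes with i, so the sampled element
      ! is not the same one on every iteration.
      row = 1 + MODULO(i, SIZE(b,1))
      col = 1 + MODULO(i, SIZE(b,2))
      checksum = checksum + b(row, col)
    END DO
    CALL CPU_TIME(t1)
    seconds = t1 - t0
  END SUBROUTINE time_cshift
END MODULE bench_sketch
```

The caller should print `checksum` next to the timing, for example `CALL time_cshift(a, 1, numtests/2, secs, chk)` followed by a `PRINT*` of both values. The same pattern applies to the `mshift` loops.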