I posted a topic earlier with questions about "OOP and efficiency" and got a lot of valuable advice!
I have now run some tests, and there are some results I am not sure about.
So I am pasting the brief test code here, thanks.
Cases 1-4 perform the same calculation in a loop in different ways, yet their time costs are quite different.
The code was compiled with IVF using the default Release optimization options (a sample compile line for producing a vectorization report is sketched after the code).
The cases are run one at a time; that is, while running case 1, the code of cases 2-4 is commented out.
Approximate average time costs on my laptop:
case 1: 0.72 s
case 2: 0.41 s
case 3: 0.57 s
case 4: 0.72 s
I think the difference should be attributed to vectorization. My questions are:
1) It seems that when a derived type is used, the loop is not vectorized. Why? If so, doesn't OOP lose a lot of efficiency?
2) Why does case 3 cost less time than cases 1 and 4?
module mdl1
   type :: typ1
      real(8) :: m, mm
   contains
      procedure, pass   :: p
      procedure, nopass :: pp
   end type
   type(typ1), allocatable :: t1(:)
contains
   subroutine p(t1)
      class(typ1) :: t1
      t1%m  = t1%m * t1%m + t1%m * 4
      t1%mm = t1%mm ** 5
      t1%mm = t1%mm * t1%mm
      t1%m  = t1%m + t1%mm
   end subroutine
   subroutine pp(i)
      integer :: i
      t1(i)%m  = t1(i)%m * t1(i)%m + t1(i)%m * 4
      t1(i)%mm = t1(i)%mm ** 5
      t1(i)%mm = t1(i)%mm * t1(i)%mm
      t1(i)%m  = t1(i)%m + t1(i)%mm
   end subroutine
end module

program console1
   use mdl1
   integer :: i, j
   integer, parameter :: N = 50000000, K = 100
   real :: time1, time2, time(K)
   real(8) :: m(N), mm(N)

   allocate (t1(N))
   time3 = 0
   do j = 1, K
      do i = 1, N
         m(i)     = j
         mm(i)    = j
         t1(i)%m  = j
         t1(i)%mm = j
      end do
      call CPU_TIME(time1)

      ! case 1: direct component access in the loop   (0.7193207 s)
      do i = 1, N
         t1(i)%m  = t1(i)%m * t1(i)%m + t1(i)%m * 4
         t1(i)%mm = t1(i)%mm ** 5
         t1(i)%mm = t1(i)%mm * t1(i)%mm
         t1(i)%m  = t1(i)%m + t1(i)%mm
      end do

      ! case 2: plain arrays                          (0.4073185 s)
      do i = 1, N
         m(i)  = m(i) * m(i) + m(i) * 4
         mm(i) = mm(i) ** 5
         mm(i) = mm(i) * mm(i)
         m(i)  = m(i) + mm(i)
      end do

      ! case 3: type-bound procedure, passed object   (0.5687795 s)
      do i = 1, N
         call t1(i)%p
      end do

      ! case 4: nopass type-bound procedure           (0.7176050 s)
      do i = 1, N
         call t1%pp(i)
      end do

      call CPU_TIME(time2)
      time(j) = time2 - time1
      print *, m(1), mm(10000), t1(10000)%m, t1(1)%mm
      write(10,*) time(j)
   end do
   write(10,*) sum(time)/size(time)
end program
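For anyone who wants to reproduce the vectorization report: I used the default Release options, and a compile line along these lines should request the report (check the flag spelling for your ifort version):

   ifort /O2 /Qopt-report:5 /Qopt-report-phase:vec console1.f90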
In your example, with the derived type you are processing data with stride 2, so there may be twice as much data moved as in your stride-1 comparison case. How efficiently this is handled may depend on your /arch: selection. If the more difficult case can be vectorized, the opt-report should help show the differences.
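To illustrate the stride point, here is a minimal sketch (my own illustration, with hypothetical names typ2 and p_all, not code from this thread): with the array-of-structures layout above, m and mm of each element are interleaved in memory, so a loop that touches only one component walks memory with stride 2. A structure-of-arrays layout keeps each component contiguous, and the loops stay unit stride:

module mdl2
   implicit none
   ! hypothetical structure-of-arrays counterpart of typ1:
   ! each component is its own contiguous array
   type :: typ2
      real(8), allocatable :: m(:), mm(:)
   end type
contains
   subroutine p_all(t)
      ! same arithmetic as subroutine p, but over unit-stride arrays
      type(typ2), intent(inout) :: t
      integer :: i
      do i = 1, size(t%m)
         t%m(i)  = t%m(i) * t%m(i) + t%m(i) * 4
         t%mm(i) = t%mm(i) ** 5
         t%mm(i) = t%mm(i) * t%mm(i)
         t%m(i)  = t%m(i) + t%mm(i)
      end do
   end subroutine
end module

The arithmetic is unchanged; only the memory layout differs, and that layout is usually what decides whether the compiler judges vectorization profitable.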
I have looked at the optimization report: the loops calling type-bound procedures are all not vectorized; it says 'vectorization is possible but seems inefficient'.
I also tried adding directives such as 'vector always' and 'forceinline' to the loops and to the subroutines contained in the type (see the sketch below).
The ordering of the time costs stays the same, and 'vector always' does reduce the time cost, but what I want to know is:
Why is vectorization not performed by default when calling type-bound procedures? And if the calculation is intensive, will this slow the program down much? I don't think adding 'vector always' manually is good practice.
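For reference, the directives can be applied roughly like this (placement is illustrative; the spellings are Intel Fortran's !DIR$ VECTOR ALWAYS and !DIR$ ATTRIBUTES FORCEINLINE):

   ! in module mdl1, inside subroutine p:
   subroutine p(t1)
   !DIR$ ATTRIBUTES FORCEINLINE :: p
      class(typ1) :: t1
      t1%m  = t1%m * t1%m + t1%m * 4
      t1%mm = t1%mm ** 5
      t1%mm = t1%mm * t1%mm
      t1%m  = t1%m + t1%mm
   end subroutine

   ! in the main program, immediately before the case 3 loop:
   !DIR$ VECTOR ALWAYS
   do i = 1, N
      call t1(i)%p
   end do

As I understand it, the loop can only vectorize if the compiler also inlines the call, which is why relying on manual directives feels fragile to me.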
