Dear All,
I wrote two nearly identical Fortran routines, test_1 and test_2. Only line 29 differs:
test_1 line 29 is : a(i1) = sum(v(1:n)*b((j-1)*n +1:j*n))
test_2 line 29 is : c = sum(v(1:n)*b((j-1)*n +1:j*n))
But the computation times are very different. Here are the CPU times in seconds for different values of n.
           test_1 time (s)   test_2 time (s)
n =  500         0.905             0.016
n = 1000         7.207             0.016
n = 1500        24.523             0.047
n = 2000        58.641             0.109
I need to keep the array a. How can I change test_1 to make it as fast as test_2?
Best regards,
Didace
PS: see the source code below
---------------------------------------------------------------------
subroutine test_1 (a,b,n)
  use const_m
  implicit none
  integer, intent(in) :: n
  integer :: i, j, p
  complex*16 :: c
  complex*16, dimension(n), intent(inout) :: a
  complex*16, dimension(n), intent(in ) :: b
  complex*16, dimension(:), allocatable :: v
  allocate(v(n))
  do i=1,n
    v(:) = a(i:(n-i1)*n +i:n)
    p = i -n
    do j=1,n
      p = p +n
      a(i1) = sum(v(1:n)*b((j-1)*n +1:j*n))
    enddo
  enddo
  deallocate(v)
end subroutine test_1
---------------------------------------------------------------------
subroutine test_2 (a,b,n)
  use const_m
  implicit none
  integer, intent(in) :: n
  integer :: i, j, p
  complex*16 :: c
  complex*16, dimension(n), intent(inout) :: a
  complex*16, dimension(n), intent(in ) :: b
  complex*16, dimension(:), allocatable :: v
  allocate(v(n))
  do i=1,n
    v(:) = a(i:(n-i1)*n +i:n)
    p = i -n
    do j=1,n
      p = p +n
      c = sum(v(1:n)*b((j-1)*n +1:j*n))
    enddo
  enddo
  deallocate(v)
end subroutine test_2
---------------------------------------------------------------------
5 Replies
The "problem" you are seeing might not be a problem at all.
When you perform "c = sum(...)" in the loop, the compiler's optimizer will notice that c is not referenced in the loop, so all iterations of the loop except the last may be eliminated.
Try inserting the following after the "c = sum(...)" line:
if(c .eq. b) write(*,*) 'eq' ! use for timing test only
where you expect c never to equal b,
i.e. you want to insert an if statement using c that will never succeed. This forces the optimizer to keep every iteration of your loop.
Then run the timing test and compare the results.
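As a rough illustration of this kind of timing test, here is a minimal, self-contained sketch. The program name, the fill values, the size n = 200 and the sentinel value (-1, 0) are all assumptions made for the example; a scalar sentinel is used instead of b because b is an array here. An aggressive optimizer might still simplify parts of the loop, so treat the timing as indicative only.

```fortran
program dce_timing
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 200          ! assumed problem size for the sketch
  complex(dp), allocatable :: v(:), b(:)
  complex(dp) :: c
  real(dp) :: t0, t1
  integer :: i, j

  allocate(v(n), b(n*n))
  v = (1.0_dp, 0.0_dp)                   ! arbitrary fill values
  b = (0.5_dp, 0.25_dp)

  call cpu_time(t0)
  do i = 1, n
     do j = 1, n
        c = sum(v(1:n) * b((j-1)*n+1 : j*n))
        ! The test below never succeeds for these fill values, but
        ! because c is used, each iteration's result is needed.
        if (c == (-1.0_dp, 0.0_dp)) write(*,*) 'eq'
     end do
  end do
  call cpu_time(t1)

  write(*,'(a,f8.3,a)') 'elapsed: ', t1 - t0, ' s'
  deallocate(v, b)
end program dce_timing
```

Timing the same loop with the if line commented out should show whether the compiler was discarding the unused result.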
Jim Dempsey
Thank you, Jim Dempsey,
I have run the test. I got the same results as previously.
Didace
Then try using the temp:
c = sum(v(1:n)*b((j-1)*n +1:j*n))
a(i1) = c
Jim Dempsey
Sorry, it's me again. I tried a new test, and you are right: with if(c.eq.b)..., the test_2 timing is equivalent to test_1.
Best regards,
Didace
So the "problem" was that optimization made the two examples not comparable.
For a speed-up, try replacing the = sum(...) with an equivalent loop.
The purpose is to see whether the compiler generates better vectorized code.
The next improvement (when n is large) would be to use OpenMP on the loop.
That is:
make sure vectorization is used where possible,
then use parallelization where appropriate.
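The two suggestions could be sketched roughly as follows. This is not the original poster's code: it assumes that a and b hold n*n elements (flattened n-by-n matrices), that the garbled index a(i1) was meant to be a(p) with p = i + (j-1)*n, and the names test_1_loop and loop_demo are hypothetical. Under those assumptions each i iteration reads and writes only row i of a, so the outer loop can be shared among threads as long as each thread has its own v and c.

```fortran
program loop_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4            ! small size, just for the check
  complex(dp) :: a(n*n), b(n*n), ref(n,n)
  integer :: k

  do k = 1, n*n                          ! arbitrary test data
     a(k) = cmplx(k, -k, dp)
     b(k) = cmplx(1, k, dp)
  end do
  ! Reference result: under the stated assumptions the routine forms A*B
  ref = matmul(reshape(a, [n, n]), reshape(b, [n, n]))

  call test_1_loop(a, b, n)
  print *, 'max abs difference:', maxval(abs(a - reshape(ref, [n*n])))

contains

  subroutine test_1_loop(a, b, n)
    integer, intent(in) :: n
    complex(dp), intent(inout) :: a(n*n)   ! assumed size n*n
    complex(dp), intent(in)    :: b(n*n)   ! assumed size n*n
    complex(dp), allocatable :: v(:)
    complex(dp) :: c
    integer :: i, j, k

    !$omp parallel private(i, j, k, c, v)
    allocate(v(n))                         ! one work array per thread
    !$omp do
    do i = 1, n
       v(:) = a(i : (n-1)*n + i : n)       ! copy row i (stride n)
       do j = 1, n
          c = (0.0_dp, 0.0_dp)
          do k = 1, n                      ! explicit loop instead of sum()
             c = c + v(k) * b((j-1)*n + k)
          end do
          a(i + (j-1)*n) = c               ! assumed target index p
       end do
    end do
    !$omp end do
    deallocate(v)
    !$omp end parallel
  end subroutine test_1_loop

end program loop_demo
```

Compiled without an OpenMP option, the directives are plain comments and the routine runs serially, which makes it easy to check the explicit loop's vectorization first and enable threading afterwards.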
Jim Dempsey
