- 신규로 표시
- 북마크
- 구독
- 소거
- RSS 피드 구독
- 강조
- 인쇄
- 부적절한 컨텐트 신고
Hi,
I've written this code in fortran to compute the sum of 1/(k*(k+1)) for k from 1 to 10**9 with a double precision accuracy.
===========================
program main
implicit none
integer, parameter :: ip = 8
integer, parameter :: rp = kind(1.0d0)
integer, parameter :: size = 10**9
integer(kind=ip) :: i
real(kind=rp) :: sum
sum =0.0_rp
do i = size,1,-1
sum = sum + 1.0_rp/real(i*(i+1),rp)
end do
write (*,*) "The result is : ",sum
end program main
===========================
I've realised that with the option -parallel, the program is 2 times faster than without it. I am quite amazed that the compiler can parallelize this program.Of course, one core can compute the sum for i from 1 to 5*(10**8), another one can compute the sum for i from 5*(10**8)+1 to 10**9 and the 2 sums can be added. But for that, the compiler has to know that :
((((a_1+a_2)+a_3)+a_4)+...)+a_(10**9)
is the same as
((((a_1+a_2)+a_3)+a_4)+...)+a_(5*(10**8)) + ((((a_(5*(10**8)+1)+a_(5*(10**8)+2))+...)+a_(10**9))
which is not "obvious". Is it what the compiler does ?
Thanks for your answers,
Francois
링크가 복사됨
7 응답
- 신규로 표시
- 북마크
- 구독
- 소거
- RSS 피드 구독
- 강조
- 인쇄
- 부적절한 컨텐트 신고
In fact, the compiler only sees that the terms do not depend on each other. Therefore, the loop could be arbitrary reordered in exact arithmetic. However, for floating point arithmetic, the compiler just assumes that reordering does not produce significant round-off errors. So you can never be sure in which order the terms are actually added up. This may depend on the system, number of threads etc.
If you want to be sure, in which order terms are added up, you need to parallelize the program manually and split up the loop _by hand_ (using OpenMP reductions on loops are again not deterministic)
If you want to be sure, in which order terms are added up, you need to parallelize the program manually and split up the loop _by hand_ (using OpenMP reductions on loops are again not deterministic)
- 신규로 표시
- 북마크
- 구독
- 소거
- RSS 피드 구독
- 강조
- 인쇄
- 부적절한 컨텐트 신고
Quoting mecej4
I doubt that any compiler is "smart" enough to see that the sum from k=1:n is exactly 1 - 1 / ( n + 1 ).
It is interesting that, in this example, you get a wrong result when you sum from 1 to 10**9.
- 신규로 표시
- 북마크
- 구독
- 소거
- RSS 피드 구독
- 강조
- 인쇄
- 부적절한 컨텐트 신고
Quoting TimP (Intel)
In case you wished to sum to 10**10, you would need to take care to promote such expressions to 64-bit integers.
As a matter of fact, I already have to do that as I compute i*(i+1) as an integer expression for i = 10**9.
- 신규로 표시
- 북마크
- 구독
- 소거
- RSS 피드 구독
- 강조
- 인쇄
- 부적절한 컨텐트 신고
Quoting TimP (Intel)
in the code you posted, you didn't use 64-bit integers in your parameter calculation, and you have an undefined variable in your write. So, it's hard to guess what you actually tested.
I am not using a 64 bit integer for size but I am using it for i.
