Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Questions about parallelization.

velvia
초급자
953 조회수
Hi,
I've written this code in fortran to compute the sum of 1/(k*(k+1)) for k from 1 to 10**9 with a double precision accuracy.
===========================
program main
implicit none
integer, parameter :: ip = 8
integer, parameter :: rp = kind(1.0d0)
integer, parameter :: size = 10**9
integer(kind=ip) :: i
real(kind=rp) :: sum
sum =0.0_rp
do i = size,1,-1
sum = sum + 1.0_rp/real(i*(i+1),rp)
end do
write (*,*) "The result is : ",sum
end program main
===========================
I've realised that with the option -parallel, the program is 2 times faster than without it. I am quite amazed that the compiler can parallelize this program.
Of course, one core can compute the sum for i from 1 to 5*(10**8), another one can compute the sum for i from 5*(10**8)+1 to 10**9 and the 2 sums can be added. But for that, the compiler has to know that :
((((a_1+a_2)+a_3)+a_4)+...)+a_(10**9)
is the same as
((((a_1+a_2)+a_3)+a_4)+...)+a_(5*(10**8)) + ((((a_(5*(10**8)+1)+a_(5*(10**8)+2))+...)+a_(10**9))
which is not "obvious". Is it what the compiler does ?
Thanks for your answers,
Francois
0 포인트
7 응답
fah10
새로운 기여자 I
953 조회수
In fact, the compiler only sees that the terms do not depend on each other. Therefore, the loop could be arbitrary reordered in exact arithmetic. However, for floating point arithmetic, the compiler just assumes that reordering does not produce significant round-off errors. So you can never be sure in which order the terms are actually added up. This may depend on the system, number of threads etc.

If you want to be sure, in which order terms are added up, you need to parallelize the program manually and split up the loop _by hand_ (using OpenMP reductions on loops are again not deterministic)
0 포인트
mecej4
명예로운 기여자 III
953 조회수
I doubt that any compiler is "smart" enough to see that the sum from k=1:n is exactly 1 - 1 / ( n + 1 ).
0 포인트
velvia
초급자
953 조회수
Quoting mecej4
I doubt that any compiler is "smart" enough to see that the sum from k=1:n is exactly 1 - 1 / ( n + 1 ).

It has been done on purpose, in order to check the result ;-)
It is interesting that, in this example, you get a wrong result when you sum from 1 to 10**9.
0 포인트
TimP
명예로운 기여자 III
953 조회수
In case you wished to sum to 10**10, you would need to take care to promote such expressions to 64-bit integers.
0 포인트
velvia
초급자
953 조회수
Quoting TimP (Intel)
In case you wished to sum to 10**10, you would need to take care to promote such expressions to 64-bit integers.

As a matter of fact, I already have to do that as I compute i*(i+1) as an integer expression for i = 10**9.

0 포인트
TimP
명예로운 기여자 III
953 조회수
in the code you posted, you didn't use 64-bit integers in your parameter calculation, and you have an undefined variable in your write. So, it's hard to guess what you actually tested.
0 포인트
velvia
초급자
953 조회수
Quoting TimP (Intel)
in the code you posted, you didn't use 64-bit integers in your parameter calculation, and you have an undefined variable in your write. So, it's hard to guess what you actually tested.

Sorry. It was my mistake for the variable undefined (I've now corrected it in my first post).

I am not using a 64 bit integer for size but I am using it for i.
0 포인트
응답