I'm building a vector-parallelized Fortran code. Looking at hot spots and waits, I found that
x_sum = SUM(X)
and
x_sum = zero
Do i = 1, n
   x_sum = x_sum + X(i)
End Do
perform differently, where X is a real array. In my code the do-loop is ~30% faster. Note that a number of summations are involved, so that figure reflects the overall result for a test case.
This must be my error, but please point it out to me.
You did not post an entire program, so this is a guess. What is the size of X? Is it possible that SIZE(X) > n, and that when you do x_sum = SUM(X) you are summing over more elements than the loop does?
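
To make that guess concrete, here is a minimal sketch of the situation I have in mind. The program name, the sizes n and n_alloc, and the fill value are all hypothetical, since I have not seen your declarations. It only shows that SUM(X) covers every element of X, while the loop and SUM(X(1:n)) cover just the first n.

```fortran
program sum_vs_loop
  implicit none
  ! Hypothetical sizes: the array is allocated larger than the range in use.
  integer, parameter :: n = 1000
  integer, parameter :: n_alloc = 1500
  real :: X(n_alloc)
  real :: x_loop, x_full, x_slice
  integer :: i

  X = 1.0   ! every element set to one so the sums are easy to check by eye

  ! Explicit loop: touches only X(1) .. X(n)
  x_loop = 0.0
  do i = 1, n
     x_loop = x_loop + X(i)
  end do

  ! Intrinsic on the whole array: touches all SIZE(X) elements
  x_full = sum(X)

  ! Intrinsic on a section: touches the same n elements as the loop
  x_slice = sum(X(1:n))

  print *, 'size(X)     =', size(X), '  n =', n
  print *, 'loop        =', x_loop
  print *, 'sum(X)      =', x_full
  print *, 'sum(X(1:n)) =', x_slice
end program sum_vs_loop
```

With these made-up sizes the loop and SUM(X(1:n)) both give 1000, while SUM(X) gives 1500 and does 50% more work. If your X really is larger than n, timing SUM(X(1:n)) against the loop would be the fairer comparison. If SIZE(X) equals n, then this is not the explanation and the cause is somewhere else.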