Hi,
I have a code that produces incorrect results when compiled at any optimisation level above -O0.
More specifically, the piece of code that appears to be responsible looks something like this:
    do i=1,nx
      do j=1,ny
        flux(i,j) = "calculate x derivative of matrix field at (i,j)"
      enddo
    enddo
    do i=1,nx
      do j=1,ny
        flux(i,j) = flux(i,j) + &
                    "calculate y derivative of matrix field at (i,j)"
      enddo
    enddo
The problem, found after testing on suitable matrix fields, is that the two blocks of code are being swapped at execution time, which should not happen and does not yield the right answer. (I know that calculating both derivatives in the same loop would avoid this problem, but I am not free to modify the code as I wish.)
Is this expected behaviour on the compiler's part?
If so, is there a compilation flag that lets me apply optimisation without code "rearrangement"?
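For anyone who wants to test this in isolation, here is a minimal, self-checking sketch of the pattern described above. Everything beyond the two loop nests is an assumption made for illustration: the array name `field`, the grid spacings `hx` and `hy`, the use of central differences, and the quadratic test field are not taken from the real code.

```fortran
program flux_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nx = 64, ny = 48
  ! Hypothetical grid spacings, chosen only for this test.
  real(dp), parameter :: hx = 0.1_dp, hy = 0.2_dp
  real(dp) :: field(0:nx+1, 0:ny+1)   ! one halo cell on each side
  real(dp) :: flux(nx, ny), ref(nx, ny)
  integer :: i, j

  ! Assumed test field f(x,y) = x**2 + 3*y**2; central differences
  ! reproduce its derivatives exactly up to rounding.
  do j = 0, ny + 1
    do i = 0, nx + 1
      field(i, j) = (i*hx)**2 + 3.0_dp*(j*hy)**2
    end do
  end do

  ! First loop nest: x derivative (same loop order as in the post).
  do i = 1, nx
    do j = 1, ny
      flux(i, j) = (field(i+1, j) - field(i-1, j)) / (2.0_dp*hx)
    end do
  end do

  ! Second loop nest: add the y derivative.
  do i = 1, nx
    do j = 1, ny
      flux(i, j) = flux(i, j) + &
                   (field(i, j+1) - field(i, j-1)) / (2.0_dp*hy)
    end do
  end do

  ! Analytic reference: df/dx + df/dy = 2*x + 6*y.
  do j = 1, ny
    do i = 1, nx
      ref(i, j) = 2.0_dp*(i*hx) + 6.0_dp*(j*hy)
    end do
  end do

  print *, 'max abs error =', maxval(abs(flux - ref))
  if (maxval(abs(flux - ref)) > 1.0e-10_dp) error stop 'flux mismatch'
end program flux_check
```

Building this once at -O0 and once at a higher level and comparing the printed error should show whether the loop structure alone triggers the problem. If this sketch passes at every level, the cause probably lies in something the simplified listing leaves out.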
1 Reply
Intel compilers should attempt loop interchange and fusion only at -O3. From what you have presented, both would be needed to make the code at all efficient.
I would not be surprised, in numerical differentiation, to find that the situation is more complicated than we can see from what you have presented. If you are using a current Intel compiler, could you file an example of the problem on your premier.intel.com account?
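To illustrate what interchange and fusion mean for loops of this shape, here is a hand-written sketch that applies both and compares the result with the two-nest form. It is not a claim about what the compiler actually generates. The array `field`, the spacings `hx` and `hy`, the central-difference formulas and the test field are all assumptions made for the example.

```fortran
program interchange_fusion
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nx = 32, ny = 24
  ! Hypothetical grid spacings, chosen only for this example.
  real(dp), parameter :: hx = 0.1_dp, hy = 0.2_dp
  real(dp) :: field(0:nx+1, 0:ny+1)
  real(dp) :: flux_two(nx, ny), flux_one(nx, ny)
  integer :: i, j

  ! Assumed smooth test field.
  do j = 0, ny + 1
    do i = 0, nx + 1
      field(i, j) = sin(i*hx) * cos(j*hy)
    end do
  end do

  ! Form as posted: two nests, i outermost (strided access, since
  ! Fortran arrays are stored column-major).
  do i = 1, nx
    do j = 1, ny
      flux_two(i, j) = (field(i+1, j) - field(i-1, j)) / (2.0_dp*hx)
    end do
  end do
  do i = 1, nx
    do j = 1, ny
      flux_two(i, j) = flux_two(i, j) + &
                       (field(i, j+1) - field(i, j-1)) / (2.0_dp*hy)
    end do
  end do

  ! Interchanged (j outermost, i innermost) and fused (one nest).
  do j = 1, ny
    do i = 1, nx
      flux_one(i, j) = (field(i+1, j) - field(i-1, j)) / (2.0_dp*hx) &
                     + (field(i, j+1) - field(i, j-1)) / (2.0_dp*hy)
    end do
  end do

  print *, 'max difference =', maxval(abs(flux_one - flux_two))
  if (maxval(abs(flux_one - flux_two)) > 1.0e-12_dp) &
    error stop 'transformed loops disagree'
end program interchange_fusion
```

Both transformations are legal here because each flux(i,j) depends only on field and on its own earlier value, so a correct compiler can apply them without changing the result beyond rounding.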