How can precomputing the scalars used in an expression improve Fortran performance?

ravi_0602 · ‎12-16-2024

I've noticed a significant performance improvement in my Fortran code after restructuring an expression. Originally, the expression was:

result = (ev(1)**0.5) * u1 + (ev(2)**0.5) * u2 + (ev(3)**0.5) * u3

Here ev(1), ev(2), and ev(3) are updated in each loop iteration, and u1, u2, and u3 are matrices that also change in each iteration.

This expression is in a function that is called in every iteration.

I then moved the square root calculations out of the main expression:

a = ev(1)**0.5
b = ev(2)**0.5
c = ev(3)**0.5

result = a * u1 + b * u2 + c * u3

The improvement in run time from this rearrangement is large. To give an idea, my code took 103 seconds to run with the original version and only 75 seconds with the restructured version, which is roughly 27% less.

Can anyone help me understand the possible reason for this? It would also help greatly if you could point me to any literature or documentation that explains it, as I could not find anything when I searched the internet.

Thank you very much in advance!

Steve_Lionel · ‎12-16-2024

I think you'll need to provide a test case people can build and run. In my experience, code snippets and "looks like" code often don't accurately represent the real code. I would not expect a simple rearrangement of the expression to result in such an improvement - there may be some inlining issues as well.
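A minimal sketch of the kind of buildable test case meant here, not the actual UMAT code: it times the two forms of the expression from the first post and checks that they give the same result. The program name, the function names `combine_inline` and `combine_hoisted`, the 3x3 matrix size, the iteration count, and the way `ev` is varied are all assumptions made for illustration. For simplicity the matrices are kept fixed, whereas in the real code they change in every iteration.

```fortran
program sqrt_hoist_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: n = 3             ! assumed matrix size
  integer, parameter :: niter = 2000000   ! assumed iteration count
  real(dp) :: ev(3), u1(n,n), u2(n,n), u3(n,n), r(n,n)
  real(dp) :: sum_inline, sum_hoisted
  integer :: i
  integer(i8) :: t0, t1, rate

  call random_number(u1)
  call random_number(u2)
  call random_number(u3)
  call system_clock(count_rate=rate)

  ! Variant 1: square roots written inside the matrix expression
  sum_inline = 0.0_dp
  call system_clock(t0)
  do i = 1, niter
    ev = [1.0_dp, 2.0_dp, 3.0_dp] + 1.0e-6_dp * real(i, dp)
    r = combine_inline(ev, u1, u2, u3)
    sum_inline = sum_inline + r(1,1)   ! use the result so the loop is not removed
  end do
  call system_clock(t1)
  print '(a, f8.3, a)', 'inline  : ', real(t1 - t0, dp) / real(rate, dp), ' s'

  ! Variant 2: square roots precomputed into scalars a, b, c
  sum_hoisted = 0.0_dp
  call system_clock(t0)
  do i = 1, niter
    ev = [1.0_dp, 2.0_dp, 3.0_dp] + 1.0e-6_dp * real(i, dp)
    r = combine_hoisted(ev, u1, u2, u3)
    sum_hoisted = sum_hoisted + r(1,1)
  end do
  call system_clock(t1)
  print '(a, f8.3, a)', 'hoisted : ', real(t1 - t0, dp) / real(rate, dp), ' s'

  ! Both variants should agree up to rounding
  print '(a, es12.4)', 'abs difference of checksums: ', abs(sum_inline - sum_hoisted)

contains

  function combine_inline(ev, u1, u2, u3) result(res)
    real(dp), intent(in) :: ev(3), u1(:,:), u2(:,:), u3(:,:)
    real(dp) :: res(size(u1,1), size(u1,2))
    res = (ev(1)**0.5) * u1 + (ev(2)**0.5) * u2 + (ev(3)**0.5) * u3
  end function combine_inline

  function combine_hoisted(ev, u1, u2, u3) result(res)
    real(dp), intent(in) :: ev(3), u1(:,:), u2(:,:), u3(:,:)
    real(dp) :: res(size(u1,1), size(u1,2))
    real(dp) :: a, b, c
    a = ev(1)**0.5
    b = ev(2)**0.5
    c = ev(3)**0.5
    res = a * u1 + b * u2 + c * u3
  end function combine_hoisted

end program sqrt_hoist_check
```

Building this with the same compiler and options as the real code would show whether the difference reproduces in isolation. If both variants take the same time here, the cause is more likely in the surrounding code (inlining, array temporaries, data types) than in the expression itself.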

ravi_0602 · ‎12-16-2024

Hi @Steve_Lionel, thanks for the reply.

I am not in a position to provide a test case, but I can assure you that this was the only change I made to my code.

I am working on improving the efficiency of a UMAT (written in Fortran) in LS-DYNA. I made the change described above by accident; it was the only change, and it produced this large improvement.

The simulation results are the same for both variants.

It would be helpful if you could suggest a possible reason that I can build on. I am relatively new to Fortran.

Thanks in advance!

Steve_Lionel · ‎12-16-2024

Sorry, there's nothing more I can do with the limited information you provided. The compiler is generally very good at rearranging expressions for more efficient computation, but there's something else going on not apparent from what you have provided so far. It could be that your original code failed to vectorize. There is an optimization report available that may give some clues. Perhaps others in this forum have some thoughts.

ravi_0602 · ‎12-16-2024

@Steve_Lionel

Does this help?

This is the original code; only the last line was modified, as described earlier in the discussion.

Thanks in advance!

Ron_Green · ‎12-16-2024

Optimization report. That is all we can recommend.

https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2025-0/optimization-reports.html

ravi_0602 · ‎12-16-2024

@Ron_Green

Thank you very much for this. I will have a look at it!
