Hi,

sun__lei · ‎04-03-2020

I compiled the same program with two different sets of compiler options:

1. icc -qopt-report -g -O2 MD.c util.c control.c coord.h -c -lm

2. icc -qopt-report -g -O2 -ipo MD.c util.c control.c coord.h -c -lm // adds "-ipo"

Then I compared each corresponding .optrpt file from the two builds. The contents are identical except for two lines. The second build's reports contain

-inline-max-per-routine: disabled

-inline-max-per-compile: disabled

while the first build's reports contain

-inline-max-per-routine: 10000

-inline-max-per-compile: 500000

Based on the reports, I expected the two builds to perform the same. But the amazing result is that the second is three times faster than the first!

So what is the reason? Can anyone help me explain it?

RahulV_intel · ‎04-06-2020

Hi,

Could you provide the ICC compiler version, the OS, and a sample test case that reproduces this?

--Rahul

RahulV_intel · ‎04-27-2020

Hi,

Quick reminder to provide a sample test case.

--Rahul

jimdempseyatthecove · ‎04-28-2020

>>But the amazing result is that the second is three times faster than the first!

If I were to make a guess....

The first compilation was "inline everything that can be inlined".

The second compilation placed upper limits on the degree of inlining.

New programmers who discover that inlining helps in one case often naively assume that inlining to the max must be better.

There are a few issues with overly aggressive inlining (and loop unrolling):

1) The level 1 instruction cache has a limited size. A loop with several calls to the same function can, when inlined, produce a loop that spills out of the L1 instruction cache, whereas the same loop with the function calls not inlined can produce a loop + function that fits within the L1 instruction cache. In this example, the non-inlined case will run faster than the inlined case.
2) Overuse of inlining can at times result in over-subscription of the available registers.

You often need to be more judicious (less aggressive) in where you perform inlining and/or how/where you perform ipo.
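As a rough illustration of being selective about inlining, here is a minimal sketch. The function names (square, pair_energy, total_energy) are hypothetical and are not taken from the MD.c sources in this thread. It assumes a compiler that accepts the GCC-style `__attribute__((noinline))`, which icc on Linux does. The small helper is left for the compiler to inline, while the larger helper is kept as a single out-of-line copy so that the hot loop body stays small.

```c
#include <stddef.h>

/* Hypothetical tiny helper: cheap enough that inlining it is usually a win. */
static inline double square(double x)
{
    return x * x;
}

/* Hypothetical larger helper called from a hot loop.
   noinline (GCC-style attribute, an assumption about the toolchain)
   asks the compiler to keep one out-of-line copy instead of
   duplicating the body at every call site. */
__attribute__((noinline))
static double pair_energy(double r2, double eps, double sigma2)
{
    double s2 = sigma2 / r2;      /* (sigma / r)^2  */
    double s6 = s2 * s2 * s2;     /* (sigma / r)^6  */
    return 4.0 * eps * (s6 * s6 - s6);
}

/* Sums the pair energy over n squared distances. */
double total_energy(const double *r2, size_t n, double eps, double sigma)
{
    double sigma2 = square(sigma);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += pair_energy(r2[i], eps, sigma2);
    return sum;
}
```

Whether the out-of-line version is actually faster depends on the code and the CPU, so it is worth timing both variants rather than assuming either one wins.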

Jim Dempsey

RahulV_intel · ‎05-11-2020

Hi Lei,

Kindly confirm whether your query is resolved, or else provide a sample test case so that we can get back to you with an actual explanation for this behavior.

--Rahul

RahulV_intel · ‎05-18-2020

We are closing this thread. Feel free to post a new question if your issue still persists.

--Rahul

Why are the two runs' qopt-reports the same when the two runs' performance is very different?