Hi,
I am wondering why a function's GFLOPS value changes in the roofline chart when I select the "With Callstacks" feature.
Here is the screenshot without "With Callstacks" selected.
We see that the red dot, which is the bgk function, shows 2.39 GFLOPS.
Now, after selecting "With Callstacks", I find that the bgk function drops to 0.003 GFLOPS.
What does this mean?
OK, I see: with callstacks it shows the Total GFLOPS, where Total GFLOPS = Total GFLOP / Total Elapsed Time.
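To make the formula concrete, here is a minimal Python sketch. The numbers are made up for illustration and are not read from the screenshots. It also assumes that "total" means the function plus everything it calls, while the value shown without callstacks covers only the function's own body; that reading is my assumption, not something confirmed by the tool's documentation here.

```python
def gflops(gflop: float, seconds: float) -> float:
    """Return the rate in GFLOPS: work in GFLOP divided by time in seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return gflop / seconds


# Hypothetical figures for one function (NOT measured values).
self_gflop = 2.0        # floating-point work done in the function's own body
self_seconds = 1.0      # time spent in the function's own body
callee_gflop = 0.5      # work done in the functions it calls
callee_seconds = 499.0  # time spent in the functions it calls

# Self rate: only the function's own work and time.
self_rate = gflops(self_gflop, self_seconds)

# Total rate (assumed meaning): work and time of the function plus its callees.
total_rate = gflops(self_gflop + callee_gflop, self_seconds + callee_seconds)

print(f"self GFLOPS:  {self_rate:.3f}")
print(f"total GFLOPS: {total_rate:.3f}")
```

With these made-up inputs the total rate comes out far lower than the self rate, because the callees add a lot of elapsed time but very little floating-point work. That would be one way a function's dot could move so far down once callstacks are included.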