Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

documentation regarding optimization reports?

Alexis_R_
New Contributor I
959 Views
I am trying to track down a bug that occurs with -O3 and -O2 but not with -O1. I thought I would try to narrow things down by using !DEC$ OPTIMIZE directives in the code, which I haven't really done before, and then checking whether the bug is still there.
I have a few basic questions, and I would appreciate pointers to an overview of the topic, if there is such a thing.
- How do I parse the optimization reports? (for example, in the inlining report, how do I know which routines were inlined?)
- Can I easily confirm from the optimization report whether my in-source directives had the effect I was looking for? (i.e. modules, routines and loops were optimized at the level I think my directives indicate)
- In general, is there some knowledge base article or documentation page with advice on tracking issues with optimization?
I apologize if these are FAQs.


Edit: also, this particular bug seems to go away with -g, even in combination with -O3. Any advice would be appreciated. I have already run Inspector XE on my code (not the exact same reproducer, because that would take too long), which didn't turn up any memory issues.
7 Replies
Steven_L_Intel1
Employee
The optimization reports don't go into the level of detail you'd need for such a thing. I'd encourage you to submit a problem report, either by describing the problem more completely here (and attaching an example), or through Intel Premier Support. Internally we have additional diagnostic tools we can use to narrow things down.
Alexis_R_
New Contributor I
Thanks Steve,
I would have liked to narrow things down myself a little bit more before I try to submit a report to you guys. At the moment, I would have to send you a whole lot of input data to reproduce it (~200MB), my whole code base, and it would still take ~6 minutes of run time to reproduce. If that's acceptable to you, and it can be kept private, let me know.
In any case, I would love to find a workaround that doesn't involve compiling _everything_ at -O1. If I could just use directives to keep most things at -O3, and the problematic part of the code at -O1, that'd be great. Any pointers?
Steven_L_Intel1
Employee
Please use Intel Premier Support for this. The 200MB of data may be a sticking point, though. Really, at some point you'll need to provide us the code so we can reproduce the problem and fix it if it is our bug.

For now what I suggest is using "divide and conquer" to determine which sources need to be compiled at -O1 in order to avoid the problem. This might instead lead you to discover a programming error that makes invalid assumptions.
Alexis_R_
New Contributor I
It is quite possible this is a bug on my end.

The "divide and conquer" approach sounds great. I had thought I could do this by adding "!DEC$ OPTIMIZE 1" statements at the top of my modules, say, rather than changing the compiler option from -O3 to -O1. Is that not correct? What takes precedence, the command-line option or the in-source directive? I had imagined the advantage of directives is that the division & conquest can become more fine-grained...
TimP
Honored Contributor III
You should be able to use the in-source directives to control the -O level. It's often easier to build a full set of .o files for each optimization level and perform a bisection search linking groups of each, but the tactic is your choice.
You might also consider setting
-assume protect_parens -prec-div -prec-sqrt
in case one of those can improve your numerics at all -O levels.
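The bisection search over object files can be sketched roughly as follows. This is only an illustration in Python, not an Intel tool: the function `find_culprit`, the predicate `bug_appears`, and the file names are all hypothetical, and the sketch assumes that exactly one object file triggers the problem when it is built at the higher optimization level. In real use, `bug_appears` would link the -O3 objects for the given set with the -O1 objects for everything else and run the reproducer.

```python
from typing import Callable, Sequence, Set


def find_culprit(units: Sequence[str],
                 bug_appears: Callable[[Set[str]], bool]) -> str:
    """Bisect for the one unit whose high-optimization build triggers the bug.

    bug_appears(high_opt) must report whether the bug reproduces when the
    units in `high_opt` come from the -O3 build and all others from -O1.
    Assumes a single culprit (a simplification for this sketch).
    """
    candidates = list(units)
    if not candidates or not bug_appears(set(candidates)):
        raise ValueError("bug does not reproduce with every unit at high optimization")
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Only the first half is taken from the -O3 build in this trial.
        if bug_appears(set(first_half)):
            candidates = first_half
        else:
            candidates = candidates[mid:]
    return candidates[0]


# Stand-in predicate for demonstration: pretend "solver.o" is the culprit.
def fake_bug_appears(high_opt: Set[str]) -> bool:
    return "solver.o" in high_opt


sources = ["io.o", "grid.o", "solver.o", "fft.o", "main.o"]  # hypothetical names
culprit = find_culprit(sources, fake_bug_appears)
print(culprit)
```

With N object files this needs roughly log2(N) link-and-run cycles, which matters when each run of the reproducer takes several minutes. If more than one file is involved, the single-culprit assumption breaks and the search would have to be repeated or widened.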

Steven_L_Intel1
Employee
Directives override command options.
Alexis_R_
New Contributor I
Turns out -assume protect_parens -prec-div -prec-sqrt in combination with -O3 appears to give the same behaviour as -O1.
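One possible reason these options can matter is that floating-point addition is not associative, so an optimizer that regroups a sum can change the result, and -assume protect_parens asks the compiler to honor the parentheses as written. The following is only a small Python illustration of that effect with made-up values; it is not a diagnosis of this particular code.

```python
# Illustrative values only: two large terms that cancel, plus a small one.
a, b, c = 1.0e20, -1.0e20, 1.0

as_written = (a + b) + c    # large terms cancel first, small term survives
reassociated = a + (b + c)  # small term is absorbed into b before cancelling

print(as_written, reassociated)
```

The two groupings are algebraically identical but round differently in double precision, which is the kind of difference that can show up at -O2 and -O3 and vanish at -O1.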

I have submitted an issue to Premier Support with a full reproducer. I am really not sure whether this is an issue with the optimization, but if that could be clarified, it'd be great.
Thanks for your advice