What can be expected with regard to inlining of type-bound procedures, and in particular what are the differences between single-file and multi-file IPO?
For instance, does single-file IPO do any inlining outside of the module where the procedure is implemented? Or is multi-file IPO needed for that?
With regard to polymorphic objects, can the compiler inline their type-bound procedures? I realize it may be difficult, but I guess that if the compiler knows the dynamic type of an object (e.g. because it was allocated a few lines earlier), it may be able to figure out exactly which type-bound procedure to inline. In this context, does NON_OVERRIDABLE make any difference? I would have expected that, in a long class hierarchy, if a type-bound procedure is NON_OVERRIDABLE in the base class, the compiler should be able to inline it. This may be especially relevant for implementing getter procedures.
Type-bound procedures are optimized just like any other procedure if the compiler can determine at compile time which procedure to call. If the called procedure is in the same source file as the caller, then the default inline optimization is available. If it's in a separately compiled source file, you will need multi-file IPO.
For polymorphic references where the selection of routine is done at run-time, no inlining is possible. I haven't done the experiments to see if NON_OVERRIDABLE helps at all here. If you see a direct call to the procedure in the generated code, then inlining is possible, otherwise not.
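To make the cases concrete, here is a minimal sketch of the three kinds of call being discussed. All names (shapes_mod, shape_t, circle_t, get_id, area) are made up for illustration, and the comments describe what the language rules allow the compiler to know, not what any particular compiler version actually does. Check the optimization report or the generated code to see what really happens.

```fortran
module shapes_mod
  implicit none
  private
  public :: shape_t, circle_t

  ! Hypothetical base type with one getter that extensions cannot override.
  type :: shape_t
    private
    integer :: id = 0
  contains
    procedure, non_overridable :: get_id  ! same target for every extension
    procedure :: area                     ! may be overridden by extensions
  end type shape_t

  type, extends(shape_t) :: circle_t
    real :: radius = 1.0
  contains
    procedure :: area => circle_area      ! overrides shape_t's area
  end type circle_t

contains

  pure integer function get_id(self)
    class(shape_t), intent(in) :: self
    get_id = self%id
  end function get_id

  pure real function area(self)
    class(shape_t), intent(in) :: self
    area = 0.0
  end function area

  pure real function circle_area(self)
    class(circle_t), intent(in) :: self
    circle_area = 3.14159265 * self%radius**2
  end function circle_area

end module shapes_mod

program demo
  use shapes_mod, only: shape_t, circle_t
  implicit none
  type(circle_t) :: c               ! non-polymorphic: type fixed at compile time
  class(shape_t), allocatable :: p  ! polymorphic: dynamic type set at run time

  c%radius = 2.0
  allocate(circle_t :: p)

  ! Case 1: target is known at compile time, so this is an ordinary direct call.
  print *, c%area()

  ! Case 2: target depends on the dynamic type of p. A direct call (and so
  ! inlining) is only possible if the compiler can prove p is a circle_t here.
  print *, p%area()

  ! Case 3: the binding is NON_OVERRIDABLE, so there is only one possible
  ! target. Whether the compiler exploits this is an assumption to verify.
  print *, p%get_id()
end program demo
```

Since the module and the program are in the same source file here, this only exercises the single-file case. Moving shapes_mod into its own separately compiled file gives the situation where multi-file IPO would be needed.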
Interesting. I did a few tests, and the optimization report claims to be doing inlining of all type-bound procedures, including those called with a polymorphic reference. Of course the test code is very simple and it is possible that the compiler can figure out the actual dynamic type pretty easily.
On my actual code base, which is very large and complex, this is not the case. On this code it also seems that NON_OVERRIDABLE makes no difference. I may still use it, since it is good documentation of a class interface, in the hope that the Intel compiler may eventually be able to pick up this information for optimization.
Unfortunately I cannot use multi-file IPO on my actual code: there is always an internal compiler error (segmentation violation). The problem is that so far I have failed to create a simple test reproducing it... Is there any way to pinpoint where the problem may be (such as a stack trace)? Commenting out a *lot* of code allows me to compile with multi-file IPO, but as I remove the comments there is always a point at which the internal error comes back. Could having smaller source files or submodules help the compiler?
If you can't easily reduce the test case, please submit the original program (sources and build instructions) through Intel Premier Support and we'll take it from there. An internal compiler error is always a compiler bug, and the sooner we can reproduce it in-house, the sooner we can make the fix available to everyone. We have internal tools that can help us identify the problem.