I'm currently using icc version 12.0.4 20110427. Loop blocking in icc is controlled by the flag -opt-block-factor=n. When n is not specified, default heuristics are used. What are the default heuristics in that case, i.e., what is the default loop blocking factor?
a) There is no default loop blocking factor. The compiler's heuristics decide case by case what is best for optimization, so the blocking factor may differ for each loop, even within the same compilation unit. If you don't specify the -opt-block-factor=n option at all, the heuristics are used.
b) Using the blocking factor option, on the other hand, reveals some problems. After running some tests, I don't see a manual blocking factor taking effect. There is also no way to switch back to the default heuristics explicitly, e.g.:
icpc -opt-block-factor=123 ...... -opt-block-factor=back_to_default_heuristics
According to the documentation, the compiler may ignore the blocking factor for the values 0 and 1. This does not re-enable the default heuristics; it might instead turn off loop blocking entirely.
A heads-up: if you specify "-opt-block-factor=" or "-opt-block-factor" (without a value) on Linux*, it gets interpreted as "-o pt-block-factor=" or "-o pt-block-factor" respectively, i.e. as an output file name. This is wrong and not what you'd expect.
Anyway, I've filed a defect (DPD200272929) to clarify the issues in b). I'll come back to you once I have more information.
Thanks for your reply. In GCC, as of GCC 4.6, the blocking factor is hardcoded in the params.def file in the gcc folder. The default blocking factor is 51 as of GCC 4.6.2, and GCC does not use any heuristics to determine the optimal blocking factor. Strange!
Similarly, I wanted to know how icc handles this and, if it uses heuristics, whether it is possible to find out what they are. I need this purely for academic research, as I work in the area of compiler code optimization.
The blocking factor is very important for loop tiling, as it determines cache utilization. I would like to know exactly how blocking is implemented and which loop blocking heuristics are used, for a comparative analysis.
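To make the role of the blocking factor concrete, here is a minimal hand-written sketch of loop tiling in C. It is not what icc or GCC generates; the function names and the matrix-multiply kernel are chosen purely for illustration. The `block` parameter plays the role of the blocking factor: it sets the tile edge length, and therefore how much of each matrix is touched before moving on.

```c
#include <assert.h>
#include <stddef.h>

/* Untiled reference: c += a * b for n x n row-major matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t k = 0; k < n; ++k)
            for (size_t j = 0; j < n; ++j)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/* Tiled version: same arithmetic, but the iteration space is walked in
 * block x block x block tiles. `block` is the (illustrative) blocking
 * factor; partial tiles at the edges are handled by clamping to n. */
void matmul_tiled(size_t n, size_t block,
                  const double *a, const double *b, double *c)
{
    assert(block > 0);
    for (size_t ii = 0; ii < n; ii += block)
        for (size_t kk = 0; kk < n; kk += block)
            for (size_t jj = 0; jj < n; jj += block) {
                size_t i_end = min_size(ii + block, n);
                size_t k_end = min_size(kk + block, n);
                size_t j_end = min_size(jj + block, n);
                for (size_t i = ii; i < i_end; ++i)
                    for (size_t k = kk; k < k_end; ++k)
                        for (size_t j = jj; j < j_end; ++j)
                            c[i * n + j] += a[i * n + k] * b[k * n + j];
            }
}
```

A block size whose three tiles fit in the targeted cache level tends to work well, which is why a fixed value cannot be optimal for every loop and every machine.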
It would seem to me that a good way to implement heuristic loop blocking would be via an optional #pragma
#pragma opt heuristic_loop_blocking
for(...
Here, at the programmer's choice, additional overhead is inserted into the code to measure the effects of varying the blocking factor for the loop. In subsequent runs of that section of code, the prior information would be used to make adjustments. The tuning process could be set to turn off after n iterations or once some small delta is reached. There would remain the issue of tuning parameters versus the size of the iteration space(s) of the following loop(s). These could be factored into the heuristic tuning code.
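A rough sketch of what such tuning code might look like, written by hand in C since no such pragma exists. Everything here is hypothetical: `measure_fn`, `tune_result`, `tune_block_factor` and `toy_cost` are names invented for this illustration. The tuner tries candidate blocking factors, keeps the cheapest one, and stops after `max_trials` measurements or once two consecutive measurements differ by less than `min_delta`.

```c
#include <stddef.h>

/* Hypothetical callback: returns the measured cost (e.g. seconds) of
 * running the loop with the given blocking factor. */
typedef double (*measure_fn)(size_t block, void *ctx);

typedef struct {
    size_t best_block; /* cheapest blocking factor seen so far */
    double best_cost;  /* its measured cost */
    size_t trials;     /* number of measurements taken */
} tune_result;

/* Try candidates in order; stop after max_trials measurements or when two
 * consecutive costs differ by less than min_delta. */
tune_result tune_block_factor(const size_t *candidates, size_t count,
                              size_t max_trials, double min_delta,
                              measure_fn measure, void *ctx)
{
    tune_result r = {0, 0.0, 0};
    double prev_cost = 0.0;
    for (size_t t = 0; t < count && t < max_trials; ++t) {
        double cost = measure(candidates[t], ctx);
        r.trials = t + 1;
        if (t == 0 || cost < r.best_cost) {
            r.best_cost = cost;
            r.best_block = candidates[t];
        }
        if (t > 0) {
            double diff = cost > prev_cost ? cost - prev_cost
                                           : prev_cost - cost;
            if (diff < min_delta)
                break; /* measurements have flattened out */
        }
        prev_cost = cost;
    }
    return r;
}

/* Toy stand-in for a real timer: cost grows with the distance from a
 * "preferred" block size passed through ctx. A real measure_fn would
 * time the tiled loop instead. */
double toy_cost(size_t block, void *ctx)
{
    size_t preferred = *(const size_t *)ctx;
    return block > preferred ? (double)(block - preferred)
                             : (double)(preferred - block);
}
```

In real use the callback would run and time the blocked loop, and the iteration-space size could be passed through `ctx` so that the chosen factor is tied to the problem size it was measured on.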
Note also that, alongside the #pragma, a compile-time option could be used to compile two variants of the code: a) use run-time heuristics to determine set points (saved to a file on program exit); b) use the previously saved set points, from a file produced by a run built with a), to generate the blocking factors for that specific loop.
The developer would compile the code in Release mode (full optimizations) with /heuristic_loop_blocking:generate
Make one or more test runs
Then compile in Release mode with /heuristic_loop_blocking:use
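The generate/use steps above can be approximated at run time without compiler support. The following is only a sketch under that assumption: the helper names and the one-number file format are made up for illustration, and in the proposal itself the "use" step would feed the compiler rather than the running program.

```c
#include <stddef.h>
#include <stdio.h>

/* "generate" step: persist the tuned blocking factor (e.g. at program
 * exit). Returns 0 on success, -1 on failure. */
int save_block_factor(const char *path, size_t block)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    int ok = fprintf(f, "%zu\n", block) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* "use" step: read the saved factor back; fall back to a caller-supplied
 * default if the file is missing or does not hold a positive number. */
size_t load_block_factor(const char *path, size_t fallback)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return fallback;
    size_t block = 0;
    int n = fscanf(f, "%zu", &block);
    fclose(f);
    return (n == 1 && block > 0) ? block : fallback;
}
```

A test run would call `save_block_factor` with the tuned value, and later runs would call `load_block_factor` once at startup and pass the result to the blocked loop.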
Something like this might be beneficial to the programmer.
It would be another programmer productivity feature that could be added to IVF to distinguish it.