I'm trying to isolate a ~20% performance degradation that I see when compiling my program with O2 vs O1. So far I've tried setting -unroll0 and -ob2, but I'm having trouble locating the rest of the flags implied by O2 in the compiler documentation for the Linux IA32 platform.
If anyone can point me to a reference or fill me in, it would be appreciated!
I assume you mean some version of icc or icpc. If -O1 consistently performs better than -O2 for your application, you probably want -O1 anyway. -O2 sets /Qvec (-vec on Linux), which is likely to be counter-productive if your program spends no time in vectorizable for loops with trip counts above 30 or so. It also sets /Ob2 in recent versions; the default inlining level has varied among releases. Perhaps you want to turn off inlining entirely, or set it down to /Ob1 (inline only where the inline keyword is set).
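To make the trip-count point concrete, here is a minimal sketch in C of the kind of loop the vectorizer targets. The function name scale_add and its parameters are made up for illustration and are not taken from your program. If loops like this only ever run with a small n in your code, the vectorized version's setup and remainder handling may cost more than it saves, though whether that actually happens depends on the compiler release and the hardware.

```c
#include <stddef.h>

/* Hypothetical hot loop: a simple candidate for auto-vectorization.
 * The restrict qualifiers tell the compiler that dst and src do not
 * overlap, which makes vectorizing the loop straightforward. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    /* With a large n, vector code usually pays off. With only a
     * handful of iterations, the extra setup and remainder code the
     * vectorizer generates may outweigh the gain. */
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

One way to check whether vectorization accounts for the slowdown is to build at -O2 with vectorization turned off and time that against plain -O2 and -O1. I believe the Linux spelling is -no-vec, but check the documentation for your release, since option names have changed between versions.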