meldaproduction
Beginner
65 Views

/Qipo seems to disable automatic vectorization

Hi,

I depend heavily on SSE/AVX auto-vectorization, and it seems that /Qipo disables it. These are the relevant options I'm using:

/arch:SSE2 /QxSSE2 /Qvec-report /QaxAVX /Qftz

The compiler reports lots of loops being vectorized. But if I add /Qipo, it states that the messages will be generated by the linker (makes sense), yet the linker reports nothing. (I'm not passing /Qvec-report to the linker, though; that doesn't seem logical anymore.)

Thanks!

6 Replies
TimP
Black Belt

You would still need vecreport to see the messages. In past compiler versions, IPO might still have suppressed them.

I have concerns about issuing redundant or conflicting arch options, but that doesn't appear to be the problem.

meldaproduction
Beginner

I'm using the newest Intel compiler version. Should I add the /Qvec-report option to the linker, or what is the idea?

Kittur_G_Intel
Employee

Hi,
If you specify both the /Qax and /arch options, the compiler will not generate Intel-specific instructions. Try using just /QxSSE2 (which is the default, BTW) and add "/Qopt-report-phase:vec /Qopt-report-file:stdout" to see the vectorization output.

_Kittur

Kittur_G_Intel
Employee

Hi,
Could you let me know if you're able to see the vectorization report? Thanks
_Kittur

meldaproduction
Beginner

Hi,

Actually, I didn't check, because I gave up on this feature. I'm compiling huge source files anyway (a kind of alternative approach: having most things included in a single source file usually makes compilation faster and provides these optimizations automatically). Anyway, /Qipo brought no improvement in performance. I also tried profile-guided optimization, and that actually made the performance worse...

Kittur_G_Intel
Employee

Hi,
With IPO, compilation is a two-step process. First, the compiler generates intermediate language (IL) in the object files (mock objects). Then, at link time, the compiler is invoked again: it works out which options were used earlier, merges the IL from all the object files, and analyzes the result for IPO opportunities. This means the link step will take a while because the entire program is being examined, so the overall build time can increase. To avoid long build times, try using IPO only on performance-critical files/libs rather than on all files.
If the code offers a lot of opportunities for inlining and better register usage, IPO might boost performance, at some cost in compile time.

 

With PGO, the benefit really depends on whether the application has many small, performance-critical sections of code that are executed very frequently, which the compiler then tries to optimize accordingly.

Also, with newer processors offering a rich set of vector extensions in the instruction set, you should try vectorizing with AVX/AVX2 etc. (depending on whether the system you're running on supports them) to exploit data parallelism, which should increase performance. You can also generate the optimization reports for those phases (ipo, vec, hlo, etc.) and check which optimizations, if any, were made.

_Kittur
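The link between inlining and vectorization mentioned in this thread can be sketched with a loop that calls a small helper. The names `apply_gain` and `process` are hypothetical, and the description is a general rule of thumb rather than a statement about what any particular compiler version does.

```cpp
#include <cstddef>

// Hypothetical helper. Here it sits in the same source file as the loop,
// so the compiler can usually inline it without any IPO. If it were
// defined in a separate .cpp file instead, the call would typically have
// to be inlined at link time (the job of /Qipo) before the loop below
// could be vectorized.
float apply_gain(float sample, float amount) {
    return sample * amount + 0.5f * sample;
}

// Applies the helper to every element of the buffer in place.
void process(float* buf, std::size_t n, float amount) {
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = apply_gain(buf[i], amount);
    }
}
```

This may also be why a build that already includes most code in a single source file sees little change from /Qipo: the inlining opportunities are visible to the compiler either way. The ipo and vec report phases are the place to confirm whether the call was inlined and the loop vectorized.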
