Itanium: Is it possible to sample INST_RETIRED excluding nops?
Another question occurred to me:
I want to measure the throughput of *real* instructions in my program (that is, I don't want to count nop instructions). From that I could estimate how well the compiler was able to fill the IA64 bundles, that is, how much is done in parallel. (I know that stalls may distort the result, but I plan to figure that out later :)
So, does anybody know if there's a way to count retired instructions without nops on IA64 with VTune?
Hello Andreas, There is no single counter that lets you sample retired instructions excluding nops. However, check whether you have an event called NOPS_RETIRED. If your IA64 processor supports it, you can sample both IA64_INST_RETIRED and NOPS_RETIRED; the difference between the two is your "useful" instructions retired. For more information, see the Intel Itanium 2 Processor Reference Manual. Thanks, Shannon
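For what it's worth, the subtraction described in the reply can be sketched in a few lines of Python. This is only an illustration: the function names and the counts in the usage note are made up, and it assumes you have already read the event totals for IA64_INST_RETIRED and NOPS_RETIRED out of your VTune results by hand. It does not call any VTune API.

```python
def useful_instructions(inst_retired: int, nops_retired: int) -> int:
    """Retired instructions minus retired nops.

    Both arguments are event totals (hypothetical inputs here), e.g. the
    totals reported for IA64_INST_RETIRED and NOPS_RETIRED over one run.
    """
    if inst_retired < 0 or nops_retired < 0:
        raise ValueError("event counts must be non-negative")
    if nops_retired > inst_retired:
        raise ValueError("NOPS_RETIRED should not exceed IA64_INST_RETIRED")
    return inst_retired - nops_retired


def useful_fraction(inst_retired: int, nops_retired: int) -> float:
    """Share of retired instructions that were not nops (0.0 to 1.0)."""
    if inst_retired == 0:
        raise ValueError("no instructions retired, fraction is undefined")
    return useful_instructions(inst_retired, nops_retired) / inst_retired
```

With made-up totals of 1000 instructions retired and 250 nops retired, `useful_instructions(1000, 250)` gives 750 and `useful_fraction(1000, 250)` gives 0.75. Since the totals come from sampling, treat the result as an estimate rather than an exact count.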