For some of my workloads I'm seeing high ILD_STALL.ANY counter values: ~20% of total unhalted CPU cycles (.THREAD). About 95% of these stalls come from ILD_STALL.IQ_FULL, whereas ILD_STALL.LCP is all zeroes. I can't find any info on how harmful ILD stalls are when the instruction queue is full (it does not sound half as harmful as "instruction queue is empty" would :) ), and I'm curious how to interpret these high counter values: what performance problem are they signaling, and which other counters can confirm the diagnosis? I'm running on a NHM processor. Thanks!
It seems this issue is caused by the instruction decode queue being full. I don't know exactly why in your case; usually it is due to decoding long instructions, where a complex instruction has to be split into multiple simple instructions. You might review the associated code to investigate.
The idea is to use the Intel® C/C++ compiler with advanced optimization options to generate more efficient code, then use VTune™ Amplifier to compare the result with the old one.
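When comparing the old and new builds, it may help to reduce the raw event counts to the same ratios quoted in the question. The following is only a minimal sketch: the function name `ild_stall_breakdown` and the sample counts are hypothetical, chosen to mirror the ~20% and ~95% figures above, and the real counts would have to be copied from the profiler's event summary.

```python
def ild_stall_breakdown(cycles, ild_any, ild_iq_full, ild_lcp):
    """Turn raw event counts into the ratios discussed in the thread.

    cycles       -- CPU_CLK_UNHALTED.THREAD
    ild_any      -- ILD_STALL.ANY
    ild_iq_full  -- ILD_STALL.IQ_FULL
    ild_lcp      -- ILD_STALL.LCP
    """
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return {
        # Share of all unhalted cycles spent in any ILD stall
        "stall_share_of_cycles": ild_any / cycles,
        # How much of the ILD stall time is attributed to each cause
        "iq_full_share_of_stalls": ild_iq_full / ild_any if ild_any else 0.0,
        "lcp_share_of_stalls": ild_lcp / ild_any if ild_any else 0.0,
    }


# Hypothetical counts (not measured data), shaped like the question's numbers
baseline = ild_stall_breakdown(
    cycles=1_000_000_000,
    ild_any=200_000_000,
    ild_iq_full=190_000_000,
    ild_lcp=0,
)

for name, value in baseline.items():
    print(f"{name}: {value:.1%}")
```

Running the same function on the counts from the recompiled binary gives a like-for-like comparison of the stall share before and after.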