After running a Nehalem General Exploration analysis of my code, the results show several "LLC Load Misses Serviced By Remote DRAM" events. I measured this on a Xeon X5650 platform with only one CPU (6 cores, 12 logical processors with Hyper-Threading), so my question is: how can there be a "remote" DRAM if there is only one CPU? As far as I know, "remote DRAM" means memory that is attached to another CPU and thus requires a QPI transfer.
Yes, remote DRAM accesses, if there were a significant number of them, should indicate accesses to memory banks attached to another CPU. Not knowing the details of how VTune counts these, I have to fall back on the general observation that VTune isn't very useful for diagnosing events which have less than 5% performance impact.
If your VTune analysis shows a major impact from false events, I would look for a problem in the setup, including the possibility that your search paths aren't picking up the same build that is being run.
Search paths are OK.
How can I determine whether the impact of those events is high? VTune displays blue bars for them, but doesn't show me a cycle count for the "LLC Load Misses Serviced By Remote DRAM" measure.
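One rough way to put a number on it, independent of how VTune draws the bars, is to multiply the event count by an assumed per-miss penalty and divide by the total cycle count, then compare the result with the 5% rule of thumb mentioned above. The sketch below does only that arithmetic. The function name, the 300-cycle penalty, and all sample counts are hypothetical placeholders, not values from VTune or from this platform; substitute the sample-scaled counts from your own run. Because overlapping misses hide part of the latency, treat the result as an upper bound.

```python
def estimated_impact(event_count, penalty_cycles, total_cycles):
    """Return the fraction of total cycles attributed to an event,
    assuming every occurrence costs a flat penalty (an upper bound)."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return event_count * penalty_cycles / total_cycles


# Hypothetical numbers for illustration only.
remote_dram_loads = 2_000_000      # assumed event count from the profile
assumed_penalty = 300              # assumed cycles per remote DRAM load
unhalted_cycles = 40_000_000_000   # assumed total unhalted core cycles

impact = estimated_impact(remote_dram_loads, assumed_penalty, unhalted_cycles)
print(f"Estimated upper bound on impact: {impact:.2%}")
print("Worth investigating" if impact >= 0.05 else "Below the 5% rule of thumb")
```

If the estimate stays well below 5% even with a pessimistic penalty, the events are probably not worth chasing.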