First of all, I recommend this article for your reference.
If NUMA is enabled in the BIOS, the associated performance events can be used:
The event above counts memory accesses for all offcore cacheline traffic. Similar events can also be used:
Additionally, the article provides latency information (penalties) for offcore memory accesses.
To evaluate the Data Latency Analysis ratio caused by "Remote DRAM", the formula is:
"LLC Load Driven Misses - Remote DRAM" = 275 * MEM_UNCORE_RETIRED.REMOTE_DRAM / CPU_CLK_UNHALTED.THREAD
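As a rough illustration, the formula above can be evaluated from two raw event counts with a few lines of Python. This is only a sketch: the function name `remote_dram_ratio` and the sample counts are hypothetical, and reading the constant 275 as an approximate per-access penalty in cycles is my assumption about what the weight means.

```python
# Weight taken from the formula above; interpreting it as an approximate
# per-access penalty in cycles is an assumption.
REMOTE_DRAM_PENALTY = 275


def remote_dram_ratio(remote_dram_count, clk_unhalted_thread,
                      penalty=REMOTE_DRAM_PENALTY):
    """Return penalty * MEM_UNCORE_RETIRED.REMOTE_DRAM / CPU_CLK_UNHALTED.THREAD.

    Hypothetical helper: both arguments are raw event counts collected
    over the same measurement interval.
    """
    if clk_unhalted_thread <= 0:
        raise ValueError("CPU_CLK_UNHALTED.THREAD count must be positive")
    return penalty * remote_dram_count / clk_unhalted_thread


# Made-up counts, for illustration only (not measured data).
ratio = remote_dram_ratio(remote_dram_count=1_000_000,
                          clk_unhalted_thread=2_750_000_000)
print(f"LLC Load Driven Misses - Remote DRAM: {ratio:.1%}")
```

The result is the estimated fraction of unhalted thread cycles attributed to remote DRAM accesses, so a larger value suggests the workload is more affected by remote-node memory traffic.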
To use performance counters directly from the command line in VTune Amplifier XE 2011 Update, please refer to this article.