
NUMA_MISS numbers not changing in numastat


I am running my application on a Xeon Phi processor configured in SNC-4 + Flat mode. The application tries to measure local and far memory latency. I run my C program as "numactl --membind 7 --cpubind 0 ./myperf". I expected this to change the numa_miss numbers reported by the numastat utility, but numa_miss does not change. I am accessing memory that is not in the same node, so why am I not getting any numa_miss?



1 Reply

The statistics from numastat don't mean what you might think they mean.

In particular, "numa_miss" means that the operating system was unable to allocate a page in the domain where it was requested.   You requested data placement in NUMA domain 7 and the operating system was able to allocate the page there, so the "numa_hit" statistic was incremented, not "numa_miss".
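
To check this on your own system, you can snapshot the per-node counters before and after a run and look at the difference. The following is only a minimal sketch: it assumes a Linux kernel that exposes the counters as "name value" lines in /sys/devices/system/node/nodeN/numastat, and the helper names (parse_numastat, read_node_numastat, counter_delta) are made up for this illustration.

```python
import os


def parse_numastat(text):
    """Parse 'name value' lines (e.g. 'numa_hit 123') into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def read_node_numastat(node):
    """Read the counters for one NUMA node from sysfs (path is an assumption
    about the running kernel; it is the usual location on Linux)."""
    path = f"/sys/devices/system/node/node{node}/numastat"
    with open(path) as f:
        return parse_numastat(f.read())


def counter_delta(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


if __name__ == "__main__":
    node = 7  # the node used with --membind in the question
    path = f"/sys/devices/system/node/node{node}/numastat"
    if os.path.exists(path):
        before = read_node_numastat(node)
        # ... run the workload here, e.g. via subprocess ...
        after = read_node_numastat(node)
        print(counter_delta(before, after))
    else:
        print(f"node {node} not present on this machine")
```

With "--membind 7" and enough free memory in node 7, you would expect the delta to show growth in numa_hit and none in numa_miss, which matches what you observed.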

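One way to see "numa_miss" on your system is to attempt to place more than 4 GiB on one of the four MCDRAM domains, using the "--preferred" option to numactl. If you allocate and touch 5 GiB, for example, and the full 4 GiB of that domain is free, you should get 4 GiB of "numa_hit" (1,048,576 increments with 4 KiB pages or 2048 increments with 2 MiB pages), plus 1 GiB of "numa_miss" (262,144 increments with 4 KiB pages or 512 increments with 2 MiB pages). The "numa_miss" increments are counted on the node that ends up supplying the overflow pages, so look at all nodes in the numastat output, not just the preferred one.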
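
The page counts above are just the sizes divided by the page size. A small sketch of that arithmetic follows; the function name expected_increments is hypothetical, and it assumes the whole domain is free and that all sizes are multiples of the page size, which will not be exactly true on a live system.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def expected_increments(request_bytes, free_bytes_in_domain, page_bytes):
    """Return (numa_hit, numa_miss) page counts for a --preferred allocation.

    Pages that fit in the preferred domain count as hits; the remainder
    spills to other nodes and counts as misses. Idealized: ignores memory
    already in use in the domain beyond what free_bytes_in_domain states.
    """
    placed = min(request_bytes, free_bytes_in_domain)
    spilled = request_bytes - placed
    return placed // page_bytes, spilled // page_bytes


if __name__ == "__main__":
    for label, page in (("4 KiB pages", 4 * KIB), ("2 MiB pages", 2 * MIB)):
        hits, misses = expected_increments(5 * GIB, 4 * GIB, page)
        print(f"{label}: numa_hit +{hits}, numa_miss +{misses}")
```

In practice the domain is rarely completely free, so the observed numa_hit will be somewhat lower and numa_miss correspondingly higher than these ideal numbers.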

If you want to measure cross-domain bandwidth, then you need a tool that measures bandwidth, not page allocations. Intel's VTune and APS (Application Performance Snapshot) should be able to do this. It may also be possible with "perf stat", but it is a lot of work to figure out how to set up all the counters with this tool.