Does Memory Bandwidth Monitoring (MBM) collect bandwidth usage per NUMA node, or does it only report local bandwidth and the combined bandwidth of all remote nodes?
I don't think you can measure bandwidth per NUMA node using MBM, because it exposes only two bandwidth events: Event ID 0x02 for the total external bandwidth and Event ID 0x03 for the local memory bandwidth. Subtracting local from total gives the remote bandwidth, but only as an aggregate over all remote nodes. However, there is one PMON box per QPI/UPI link, so you can use the data flit events to measure the QPI data bandwidth between any two specific sockets as follows:
Incoming data bandwidth = (RxL_FLITS_G1.DRS_DATA + RxL_FLITS_G2.NCB_DATA) * 8 / time
Outgoing data bandwidth = TxL_FLITS_G0.DATA * 8 / time
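
As a rough illustration, the two formulas above can be applied to a pair of counter snapshots taken `seconds` apart. This is only a sketch: the function name `qpi_data_bandwidth`, the dictionary layout, and the sample values are made up for the example, and it assumes the factor of 8 means 8 bytes of payload per data flit, so the results are in bytes per second. How you actually read the counters (perf, PCM, raw MSR/PCI access) is up to you, and counter wraparound is not handled here.

```python
BYTES_PER_DATA_FLIT = 8  # assumption: each data flit carries 8 bytes of payload


def qpi_data_bandwidth(start, end, seconds):
    """Return (incoming, outgoing) data bandwidth in bytes/s for one QPI link.

    `start` and `end` are hypothetical snapshots: dicts mapping event names
    to raw counter values read from the link's PMON box.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")

    # Counter deltas over the measurement interval (no wraparound handling).
    delta = {name: end[name] - start[name] for name in end}

    incoming = (
        delta["RxL_FLITS_G1.DRS_DATA"] + delta["RxL_FLITS_G2.NCB_DATA"]
    ) * BYTES_PER_DATA_FLIT / seconds
    outgoing = delta["TxL_FLITS_G0.DATA"] * BYTES_PER_DATA_FLIT / seconds
    return incoming, outgoing


if __name__ == "__main__":
    # Made-up counter values, purely for illustration.
    before = {
        "RxL_FLITS_G1.DRS_DATA": 0,
        "RxL_FLITS_G2.NCB_DATA": 0,
        "TxL_FLITS_G0.DATA": 0,
    }
    after = {
        "RxL_FLITS_G1.DRS_DATA": 1000,
        "RxL_FLITS_G2.NCB_DATA": 500,
        "TxL_FLITS_G0.DATA": 2000,
    }
    rx, tx = qpi_data_bandwidth(before, after, seconds=2.0)
    print(f"incoming: {rx:.0f} B/s, outgoing: {tx:.0f} B/s")
```

Running this once per link gives you a per-link figure, which you can then attribute to the pair of sockets that link connects.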