Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

How could I read msr BUS_TRANS_MEM, Windows

G__T
Beginner
1,059 Views

Hello,

I have been reading quite a lot of the Intel documentation and this forum, but I am quite stuck.

As a small personal side project, I need to check the BUS_TRANS_MEM counter. I am currently running on a Skylake processor.

I am very new to this kind of thing, but I have a small driver ready that can read core temperatures (for testing) through rdmsr. When it came to BUS_TRANS_MEM, though, I realized it was more complex.

So I jumped into WinDbg and, as a first step, tried to work out the MSR address I need, but I can't seem to determine it.

1. Depending on the document, I find event select 0x6F with a umask of either 0x40 or 0xE0 (all agents), or 0xC0 in another document.

2. It seems I have to combine this information with IA32_QM_EVTSEL, but I really can't find the link between them. It looks like I have to use a counter register (0x700?) to set up counting of BUS_TRANS_MEM?

So far I have come up with something like:

wrmsr 0xE01 0x0
wrmsr 0x700 0x50E06f (Why 0x50 ?)
wrmsr 0xE01 0x2000000f
wrmsr 0xE01 0x0
rdmsr 0x706

But that doesn't seem to be right at all; I am a bit lost.

In the end my question is 'quite simple': how do I access the memory bus transaction counter?

Thanks for reading my issue :-)

5 Replies
Thomas_G_4
New Contributor II
1,059 Views

I don't know which documentation you checked, but I cannot find an event named BUS_TRANS_MEM for hardware performance monitoring, neither in the Intel SDM (January 2019) nor in the JSON document at https://download.01.org/perfmon/SKL/ (nor in the Skylake SP JSON documents). So it does not seem to be a hardware performance event.

Instead of the global performance monitoring control register (0xE01) and CBo0 event select register (0x700), you have to use IA32_QM_EVTSEL (0xC8D) and IA32_QM_CTR (0xC8E). The event id 0x6F has to be in bits 0-7 of IA32_QM_EVTSEL. Moreover, you have to specify the "Resource Monitoring ID" in bits N+31:32 where N = Ceil(Log2(CPUID.(EAX= 0FH, ECX=0H).EBX[31:0] +1)). Afterwards you can get the results from bits 0-61 from IA32_QM_CTR. There are some status bits in IA32_QM_CTR:
Bit 62 == 1 -> indicates data for this RMID is not available or not monitored for this resource or RMID.
Bit 63 == 1 -> indicates an unsupported RMID or event type was written to IA32_PQR_QM_EVTSEL

It seems there is no "start counting now" flag in any register. Check section 17.18.7 of SDM (Jan 2019) for further information.

So, in rdmsr/wrmsr calls:
wrmsr 0xC8D 0x0
wrmsr 0xC8E 0x0
wrmsr 0xC8D 0x10000006f  (assuming RMID = 1)
<sleep>
wrmsr 0xC8D 0x0
rdmsr 0xC8E
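
As a rough illustration of the register layout described above, here is a minimal Python sketch that builds the value written to IA32_QM_EVTSEL and splits a raw IA32_QM_CTR reading into its data and status bits. It only does the bit arithmetic and does not touch any MSR. The function names and the `max_rmid` default of 255 are assumptions made for the example; the real maximum RMID comes from CPUID.(EAX=0FH, ECX=0H).EBX. The event ID 0x6F is simply the one used in the listing above.

```python
RMID_SHIFT = 32                    # RMID field starts at bit 32 of IA32_QM_EVTSEL
QM_CTR_DATA_MASK = (1 << 62) - 1   # bits 0-61 of IA32_QM_CTR hold the count
QM_CTR_UNAVAILABLE = 1 << 62       # bit 62: data not available / not monitored
QM_CTR_ERROR = 1 << 63             # bit 63: unsupported RMID or event type


def rmid_field_width(max_rmid: int) -> int:
    """Return N = ceil(log2(max_rmid + 1)) using exact integer arithmetic."""
    return max_rmid.bit_length()


def pack_qm_evtsel(event_id: int, rmid: int, max_rmid: int = 255) -> int:
    """Build an IA32_QM_EVTSEL value: event ID in bits 0-7, RMID from bit 32 up.

    max_rmid=255 is a placeholder; query CPUID on the real machine.
    """
    if not 0 <= event_id <= 0xFF:
        raise ValueError("event_id must fit in bits 0-7")
    if not 0 <= rmid <= max_rmid:
        raise ValueError("rmid exceeds the maximum reported by CPUID")
    return (rmid << RMID_SHIFT) | event_id


def decode_qm_ctr(raw: int) -> dict:
    """Split a raw IA32_QM_CTR reading into data and the two status flags."""
    return {
        "data": raw & QM_CTR_DATA_MASK,
        "unavailable": bool(raw & QM_CTR_UNAVAILABLE),
        "error": bool(raw & QM_CTR_ERROR),
    }


if __name__ == "__main__":
    print(hex(pack_qm_evtsel(0x6F, 1)))      # same value as in the listing above
    print(decode_qm_ctr((1 << 63) | 5))      # a reading with the error bit set
```

Checking the two status flags before using the data field avoids mistaking an error reading for a real count.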

 

Just for completeness:
wrmsr 0x700 0x50E06f (Why 0x50 ?)
This configures event 0x6F with umask 0xE0. The 0x50 sets bits 20 and 22 of the register. Bit 22 is "enable counter"; bit 20 is, as far as I can tell, the overflow/interrupt enable and not a user-space filter (that would be bit 16 in the IA32_PERFEVTSELx layout). Whether the wrmsr starts counting in 0x706 directly or only configures it as "ready to count" depends on the value in 0xE01: for counting, the corresponding bit in the global config register (0xE01) needs to be set.
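
To see how the value 0x50E06F splits up, here is a small Python sketch that extracts the event select (bits 0-7), the umask (bits 8-15) and the positions of the set bits above them. The function name is made up for the example, and the sketch deliberately does not name the control bits; check the layout of the specific register in the SDM for that.

```python
def decode_evtsel(value: int):
    """Split an event-select value into event code, umask and set control bits."""
    event = value & 0xFF              # bits 0-7
    umask = (value >> 8) & 0xFF       # bits 8-15
    # Positions of the set bits in the control field (bits 16-31).
    control_bits = [bit for bit in range(16, 32) if (value >> bit) & 1]
    return event, umask, control_bits


if __name__ == "__main__":
    event, umask, control_bits = decode_evtsel(0x50E06F)
    print(hex(event), hex(umask), control_bits)
```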

Best regards,
Thomas

G__T
Beginner
1,059 Views

Hello Thomas, thanks a lot for this precise detailed answer. It's helping me to understand how it works.

I found the event I am interested in (BUS_MEM_TRANS) in the "Intel 64 and IA-32 Architectures Software Developer’s Manual", Volume 3B, PERFORMANCE-MONITORING EVENTS (https://www3.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developers-manual.pdf). I also found it in the more recent "Intel 64 and IA-32 Architectures Performance Monitoring Events" documentation (https://software.intel.com/sites/default/files/managed/8b/6e/335279_performance_monitoring_events_guide.pdf?ref=hvper.com).

So if I understood correctly, the event I would like to observe is not accessible through the IA32_QM_EVTSEL and IA32_QM_CTR registers?

Does the fact that BUS_MEM_TRANS can't be found in the JSON documents mean that it no longer exists on current platforms, or that it has to be read through another method?

 

Thomas_G_4
New Contributor II
1,059 Views

A quick search in both supplied documents returned zero results for BUS_MEM_TRANS.

You can only use the BUS_MEM_TRANS with the IA32_QM_EVTSEL and IA32_QM_CTR registers.

The event BUS_MEM_TRANS is not in any JSON document for any Intel architecture, so I assume it is not a performance event. It might be undocumented or simply named differently.

HadiBrais
New Contributor III
1,059 Views

There seems to be some confusion here between the BUS_TRANS_MEM event and the Memory Bandwidth Monitoring (MBM) feature. The BUS_TRANS_MEM event is only officially supported on the Intel Core 2 microarchitecture. The event code is 0x6F and the umask specifies which memory transactions to count:

  • umask = 0xE0: Counts memory transactions of all types from all agents. Note that all the cores that share the same L2 cache constitute a single bus agent. In addition, the south bridge is a memory bus agent. So the total number of agents is equal to the total number of L2 caches plus the total number of south bridges.
  • umask = 0x40: Counts memory transactions of all types from the same core on which the counter is being read.
  • umask = 0xC0: Counts memory transactions of all types from the same agent on which the counter is being read.

Memory transactions from hardware prefetchers can also be filtered. For more information on the possible umask values for the BUS_TRANS_MEM event, refer to Section 18.6.1 of the Intel SDM Volume 3. Note that this event is also supported on the Bonnell microarchitecture.

The intended use of the BUS_TRANS_MEM event is to measure memory bandwidth, which I think is what you want to do. This can be achieved by multiplying the BUS_TRANS_MEM event count by the size of a cache line, i.e., 64 bytes, and dividing by execution time. Note that this method counts partial reads or writes as full 64-byte transactions. Note also that BUS_TRANS_MEM does count L2 cache writebacks (i.e., dirty evictions).
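
The bandwidth estimate described above is a one-line calculation. Here is a minimal Python sketch of it; the function name and the sample numbers are made up for illustration, and the 64-byte cache line size is the one stated above.

```python
CACHE_LINE_BYTES = 64  # each counted transaction is treated as one full cache line


def memory_bandwidth_bytes_per_s(transaction_count: int, elapsed_seconds: float) -> float:
    """Estimate memory bandwidth from a BUS_TRANS_MEM-style transaction count."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return transaction_count * CACHE_LINE_BYTES / elapsed_seconds


if __name__ == "__main__":
    # Hypothetical sample: 1,000,000 transactions counted over 0.5 seconds.
    print(memory_bandwidth_bytes_per_s(1_000_000, 0.5))
```

Because partial reads and writes are counted as full 64-byte transactions, the result is an upper estimate of the bytes actually transferred.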

On later microarchitectures (Nehalem and later and Silvermont and later), the offcore response counting facility has mostly replaced BUS_TRANS_MEM, which I think is why BUS_TRANS_MEM was removed on later microarchitectures.

MBM is a better way to measure memory bandwidth because it gives more control over the entity (e.g., a core) whose memory bandwidth you want to measure. There is no event 0x6F in MBM. Apart from the fact that both MBM and BUS_TRANS_MEM measure memory bandwidth, they have nothing to do with each other. These are completely different features that are supported on different microarchitectures. The IA32_QM_EVTSEL register is used to select an MBM event to monitor and has nothing to do with BUS_TRANS_MEM.

MBM is part of the Resource Director Technology (RDT) monitoring features and is supported on Skylake-SP and Skylake-X. However, according to SKX4 and SKZ4, MBM does not count LLC writebacks on these processors. Therefore, it is disabled by default on Linux (all of the RDT features are buggy on these processors and are disabled by default). You can enable MBM using the rdt kernel parameter. Note that MBM works fine on Broadwell Xeon and Cascade Lake SP.

G__T
Beginner
1,059 Views

Oops, yes indeed I made a typographical error. I meant BUS_TRANS_MEM.

Thomas G. wrote:

A quick search in both supplied documents returned zero results for BUS_MEM_TRANS.

You can only use the BUS_MEM_TRANS with the IA32_QM_EVTSEL and IA32_QM_CTR registers.

The event BUS_MEM_TRANS is not in any JSON document for any Intel architecture, so I assume it is not a performance event. It might be undocumented or simply named differently.
