I'm trying to monitor the SoC uncore event groups detailed in:
https://software.intel.com/en-us/articles/baytrail-uncore-performance-monitoring-events
VTune Amplifier XE 2015 update 1 doesn't seem to list those event groups or the individual events shown within them. The VTune events do match those detailed in section 18.6.2 of the Intel 64 and IA-32 Architectures Software Developer’s Manual (volume 3B, part 2), but they don't provide the same monitoring capabilities as the SoC groups.
How can I monitor the Baytrail SoC group events as shown in the link above?
I'm also using the Intel Performance Counter Monitor (pcm-* tools) on an embedded system running Yocto Linux. However, even the latest v2.8 only supports monitoring uncore events on Jaketown/Ivytown/Haswell processors.
Is there an update in the pipeline, or are the additional uncore MSR details available somewhere so that I may add them to my version? Or is there an alternative tool that I should be using instead?
Thanks,
Simon
VTune Amplifier 2015 Update 2 should show you a tab for SoC Bandwidth that counts requests from all agents (processors, graphics, IO). This version also includes SEP/EMON, which is probably the tool you want to use for counting uncore events on Baytrail. I would recommend running EMON with the -? option to list all available events; it should list several "UNC_SOC" events. If you see the event you want to sample, you can count it with "EMON -t 1 -C EVENT_NAME".
Thanks for the details, and the new VTune update. I'll give EMON a try first to confirm it's showing the expected details.
I'm not sure if I'll need it yet, but is System Studio due a matching update 2 with the additional events?
Most of the module-based SOC event groups are working well with EMON, but the memory counters are always zero. The following groups are not working at all for me:
UNC_VISA_DDR_Self_Refresh
UNC_VISA_Memory_DDR0_BW
UNC_VISA_Memory_DDR1_BW
UNC_VISA_Memory_DDR_BW
The Bandwidth analysis test within VTune (update 2 added support for Silvermont) is also showing zero bytes/s for all tests.
I noticed there is a UNC_VISA_LowSpeedPF_BW group for low-speed fabric, but shouldn't there also be a UNC_VISA_HighSpeedPF_BW group for high-speed (PCI Express)?
Finally, are the register values for the SOC event group counters available, so I can monitor them from my own code?
Good! I'm glad to hear you were able to get some counts from some of the events with EMON.
Regarding the memory event counts being zero: this means that the performance counters are powered off by your BIOS/firmware settings. There is likely no way to change this, but there might be a BIOS menu option to set the debug option to "PerfMode". You can get a very good idea of total memory bandwidth with the event UNC_VISA_All_Reqs, which gives the total request count from each agent (CPUs, IO, GFX). Sum the counts, multiply by 64 bytes, and divide by the number of seconds sampled to get bytes per second.
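The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the sample counts are made up for the example and are not EMON output. The keys reuse the sub-event names mentioned elsewhere in this thread, and the 64-byte request size is taken from the post above.

```python
# Each request is assumed to move 64 bytes, as stated in the post above.
BYTES_PER_REQUEST = 64


def bandwidth_bytes_per_sec(request_counts, seconds_sampled):
    """Estimate total memory bandwidth from per-agent request counts.

    request_counts: mapping of agent/sub-event name -> requests counted
    seconds_sampled: length of the sampling interval in seconds
    """
    if seconds_sampled <= 0:
        raise ValueError("seconds_sampled must be positive")
    total_requests = sum(request_counts.values())
    return total_requests * BYTES_PER_REQUEST / seconds_sampled


# Hypothetical counts for a 1-second sample (not measured values).
sample = {
    "Mod0_Reqs": 1_500_000,
    "Mod1_Reqs": 500_000,
    "GFX_Reqs": 250_000,
    "LowSpeedPF_Reqs": 50_000,
}

print(bandwidth_bytes_per_sec(sample, 1.0))  # 147200000.0 bytes/s
```

With a 1-second interval ("-t 1"), the divide step is a no-op, but keeping it explicit makes the helper usable for longer sampling intervals.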
Regarding LowSpeed vs HighSpeed: Baytrail only has a low-speed peripheral fabric and no high-speed fabric. All IO traffic goes through the low-speed fabric, including USB3, PCIe, SATA, GbE and so on. Only the micro-server product Rangeley has PCIe connected via a high-speed fabric.
If your BIOS does expose the PerfMode option, then it will likely be under a "Debug Configuration" menu and the specific option named "PDM/DFX Setting".
Thanks for the updates, and for confirming that PCIe is part of the low-speed fabric on this BayTrail-I. I'd seen the Rangeley diagrams, and assumed that it would be the same if it was present.
I have found UNC_VISA_All_Reqs to be useful for CPU-related measurements, though the Disp_Reqs / Imaging_Reqs / VED_Reqs sub-event counters within it are still always zero for me. I certainly expected to see something in VED_Reqs when I was actively decoding video through VAAPI. The other sub-event counters (Mod0_Reqs / Mod1_Reqs / GFX_Reqs / LowSpeedPF_Reqs) are working normally.
I do already have the "PDM/Dfx" BIOS setting set to "Perf", as without it I couldn't access any MSR values. Do you know if there are different levels to the perf mode that might need to be enabled, or should it be all or nothing? I can chase up the BIOS vendor (Insyde) if it's possible that additional bits need to be set somewhere.
There are different levels to the perf mode with regard to the uncore events, but "perf mode" is the correct setting to enable both the system agent events and the memory controller events. Are you still seeing zero bandwidth on DDR? If so, you could try setting PDM/DFx to "on".
For UNC_VISA_All_Reqs I would expect you to see counts for Disp_Reqs if your monitor/screen is turned on and it should increase if you attach additional monitors. Imaging_Reqs should be zero unless you have the camera active. VED can be hard to enable and requires a driver. If the correct driver is not enabled then the encode/decode will be done by the CPUs.
Thanks for the update. Simon is on holiday this week.
We have tried the different permutations of PDM/DFx in the BIOS ("PDM On", "Perf Mode" and "Debug Reserved") and get the same results in all cases. Using EMON, the majority of the counters are working, but none of the UNC_VISA_DDR* or UNC_VISA_MEMORY_DDR* groups are working.
VED is also not working. We know that we are using the video decode engine to decode, using open-source Linux graphics drivers and VAAPI. Interestingly the GFX_Reqs / GFX_Read64B / GFX_Write64B counters do show activity when we are decoding, even if we are not displaying the decoded images or using the GPU to transfer them in any way.
Fundamentally, we would like to know how the performance counters are enabled and what we can read to verify that they have been enabled correctly, so that we can present our findings to our BIOS supplier and get the BIOS changed if required. We have an RSNDA in place.
On a related note, we are interested in performance counters related to the PCIe interface. Do such counters exist?
Is there a document that lists the Uncore counters in more detail?
Hi Richard,
For any NDA and BIOS setting specific conversations we would need to move to a private email or phone conversation.
Regarding PCIe interface bandwidth, there are no publicly available counters for specific blocks in the south cluster. The closest metric is the aggregate bandwidth of all south cluster traffic.
On the VED topic, it sounds like the driver is offloading the encode/decode to the GFX unit rather than the VED unit. My understanding is that the Baytrail VED unit only supports VP8. Is that the format you are testing?
Hello Perry,
Thanks for getting back to us so promptly.
Our FAE has opened up a Premier Support issue (ID 6000090586) now that we have an RSNDA. Is it possible for you to reply via that or do we need to open up another one?
We understand about the VED now. We were looking for some metrics that differentiated between the GPU "proper" and the MFX (H.264 video decoder). We thought that the VED was another acronym for that, rather than just the VP8-specific decode unit. We can see that there are registers internal to the MFX that give some performance metrics but it would be great to get a list of all of the UNCORE Performance Counters.
Ok, I am not in the team that handles the Premier issues but I'll contact that team to help support the request.
Thanks,
Perry