I would like to better understand the impact of DDIO hits and misses on the latency of data delivery from a network card over the PCIe bus. My understanding is that the PCIeItoM event can be used to track occurrences of data being placed directly in L3. I downloaded Intel PCM v2.11 and built it on RHEL 6.5. However, it seems that the event selection for Haswell CPUs has changed, and in this version of pcm-pcie.x the PCIeItoM event is not included:
"Detected Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz "Intel(r) microarchitecture codename Haswell-EP/EN/EX"
Update every 1 seconds
Skt | PCIeRdCur | RFO | CRd | DRd | ItoM | PRd | WiL
I modified the code to enforce what I believe was the former behaviour, but the counters don't seem to tick (run on the same CPU) even though the NIC is receiving data:
Skt | PCIeRdCur | PCIeNSRd | PCIeWiLF | PCIeItoM | PCIeNSWr | PCIeNSWrF
0 806 K 0 0 0 0 0
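For anyone comparing runs, here is a minimal Python sketch that maps a pasted pcm-pcie.x header onto its counter row and lists the events that stayed at zero. The helper names (parse_row, silent_events) are hypothetical and not part of PCM, and treating the K/M/G suffixes as decimal multipliers is an assumption about how the tool formats its output.

```python
import re

# Assumed decimal multipliers for the unit suffixes seen in the pasted output.
SUFFIX = {"": 1, "K": 10**3, "M": 10**6, "G": 10**9}

# Header and row copied from the output quoted in this post.
HEADER = "Skt | PCIeRdCur | PCIeNSRd | PCIeWiLF | PCIeItoM | PCIeNSWr | PCIeNSWrF"
ROW = "0 806 K 0 0 0 0 0"


def parse_row(header, row):
    """Pair each column name in the header with the matching value in the row."""
    names = [name.strip() for name in header.split("|")]
    # A value is a number optionally followed by whitespace and a unit suffix.
    tokens = re.findall(r"(\d+(?:\.\d+)?)(?:\s*([KMG])\b)?", row)
    values = [round(float(num) * SUFFIX[suffix]) for num, suffix in tokens]
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    return dict(zip(names, values))


def silent_events(counts):
    """Return the event names (socket column excluded) whose count is zero."""
    return [name for name, value in counts.items() if name != "Skt" and value == 0]


if __name__ == "__main__":
    counts = parse_row(HEADER, ROW)
    print(counts)
    print("not ticking:", silent_events(counts))
```

On the row above this reports PCIeRdCur as the only event that ticked.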
The same behaviour is observed on a machine with this CPU:
"Detected Intel(R) Xeon(R) CPU E5-1680 v3 @ 3.20GHz "Intel(r) microarchitecture codename Haswell-EP/EN/EX"
Also, the event description (from pcm-pcie.x) for 'ItoM' says 'ItoM - PCIe write full cache line'. Is this correct? The Intel Xeon E5 and E7 v3 Family Uncore Performance Monitoring Reference Manual says:
"Request Invalidate Line
On three different machines with Xeon E5 Haswell CPUs I didn't observe the PCIeItoM event ticking. Could you please advise on whether PCIeItoM is the right event to monitor? And why is it no longer included in the output of pcm-pcie when a Haswell CPU is detected?
Thank you in advance!
I have the same question about pcm-pcie.x from PCM (Intel® Performance Counter Monitor).
The tool's documentation mentions events such as PCIeItoM, PCIeNSWr and PCIeNSWrF, which are missing from the output when running on Haswell servers.
I can see in the code that these events and others are disabled on Haswell and Broadwell, but I can't understand why, since the events are documented in the uncore documentation for these CPUs.
From E5-2600v2 (Ivytown) to E5-2600v3/v4 (Haswell/Broadwell) there were some internal flow changes, so the performance counter events also map to different opcodes. To summarize the most commonly used events (i.e. with BIOS default settings), here is a list of useful metrics.
Inbound read (PCIe devices read from system memory):
Inbound write (PCIe devices write to system memory):
E5-2600v3/v4: ItoM (when write is full cache line), RFO (when write is less than a full cache line)
Outbound read (CPU reads from device memory (MMIO read)):
Outbound write (CPU writes to device memory (MMIO write)):
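As a rough illustration of how the inbound write events above could be turned into a traffic estimate, here is a minimal Python sketch. It assumes each counted ItoM or RFO event corresponds to one 64-byte cache line; for RFO (partial writes) the actual payload is smaller, so that figure is only an upper bound. The function name and the returned field names are hypothetical, not something pcm-pcie.x provides.

```python
CACHE_LINE_BYTES = 64  # cache line size on these Xeon parts


def inbound_write_estimate(itom, rfo, interval_s=1.0):
    """Estimate inbound write traffic from ItoM and RFO event counts.

    itom: full-cache-line write events counted during the interval
    rfo: partial-cache-line write events counted during the interval
    interval_s: length of the sampling interval in seconds
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    total = itom + rfo
    return {
        # Full-line writes carry exactly one cache line each.
        "full_line_bytes_per_s": itom * CACHE_LINE_BYTES / interval_s,
        # Partial writes touch one line each but carry fewer than 64 bytes.
        "partial_line_upper_bytes_per_s": rfo * CACHE_LINE_BYTES / interval_s,
        # Share of write events that were full-line; None if nothing was counted.
        "full_line_fraction": itom / total if total else None,
    }


if __name__ == "__main__":
    # Made-up counts, purely to show the call.
    print(inbound_write_estimate(itom=1000, rfo=250, interval_s=1.0))
```

The counts in the usage line are invented for illustration; real values would come from the ItoM and RFO columns of pcm-pcie.x with the update interval passed as interval_s.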
Thanks for the reply.
I'm specifically interested in non-allocating writes from PCIe devices to main memory; these are PCIeNSWr and PCIeNSWrF.
And my question is why are they missing from the 'pcm-pcie.x' tool when running on Haswell/Broadwell?
Based on your reply, it seems that monitoring events were added to the newer CPUs, not removed.
How do you suggest I monitor these?