Is the DID of Device 5 Function 0 on v4 wrong?

GHui
Novice

I took the DID of Device 5 Function 0 on v4 from xeon-e7-v4-datasheet-vol-2.pdf, which lists it as 0x2F28. But when I test on CPU model 0x4F, the DID I read is 0x6F28.

Thomas_G_4
New Contributor II

I'm not sure about device 5 function 0, but all PCI-based performance monitoring units have a device ID of the form 0x6Fxx on model 0x4F instead of the documented 0x2Fxx, so I assume your finding is right.
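As a minimal sketch of the pattern described above, the documented 0x2Fxx value can be translated to the 0x6Fxx value seen on model 0x4F. The function name `observed_did` is hypothetical, and the rule itself is only the observation from this thread for the PCI-based uncore PMU devices, not something taken from the datasheet.

```python
def observed_did(documented_did: int, cpu_model: int) -> int:
    """Map a documented 0x2Fxx device ID to the 0x6Fxx value seen on model 0x4F.

    Assumption: only the high byte differs, as reported in this thread.
    Any other model or DID range is returned unchanged.
    """
    if cpu_model == 0x4F and (documented_did & 0xFF00) == 0x2F00:
        return 0x6F00 | (documented_did & 0x00FF)
    return documented_did


# Device 5 Function 0 from the original question.
print(hex(observed_did(0x2F28, 0x4F)))  # 0x6f28
```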

McCalpinJohn
Honored Contributor III

The DID values for some of the IMC units are also incorrect in the documentation for the Xeon E5 v3 processors.  

Here is what I found for Xeon E5 v3 processors with 2 Home Agents and for Xeon E5 v3 processors with 1 Home Agent (8-core and below).

Mem Ctrl  Chan  PCI      Expected DID  Actual DID (2 HA)  Actual DID (1 HA)
0         0     7f:14.0  0x2fb4        0x2fb0             0x2fb0
0         1     7f:14.1  0x2fb5        0x2fb1             0x2fb1
0         2     7f:15.0  0x2fb0        -N/A-              0x2fb4
0         3     7f:15.1  0x2fb1        -N/A-              0x2fb5
1         0     7f:17.0  0x2fd4        0x2fd0             0x2fd0
1         1     7f:17.1  0x2fd5        0x2fd1             -N/A-
1         2     7f:18.0  0x2fd0        -N/A-              -N/A-
1         3     7f:18.1  0x2fd1        -N/A-              -N/A-

"-N/A-" means that the PCI device was not present.
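A comparison like the one in the table can be sketched by parsing the output of `lspci -n` and checking each IMC device against the documented DID. The helper `compare_dids` and the sample text are hypothetical; the sample lines reuse the 2-HA DIDs from the table, while the class code and revision shown in them are made up for illustration.

```python
import re

# Documented DIDs from the table (Xeon E5 v3), keyed by device.function.
EXPECTED = {
    "14.0": 0x2FB4, "14.1": 0x2FB5, "15.0": 0x2FB0, "15.1": 0x2FB1,
    "17.0": 0x2FD4, "17.1": 0x2FD5, "18.0": 0x2FD0, "18.1": 0x2FD1,
}

# Matches lines such as "7f:14.0 0880: 8086:2fb0 (rev 02)".
LINE = re.compile(
    r"^([0-9a-f]{2}):([0-9a-f]{2}\.[0-7])\s+[0-9a-f]{4}:\s+8086:([0-9a-f]{4})",
    re.IGNORECASE,
)


def compare_dids(lspci_text, bus="7f"):
    """Return {dev.fn: (expected DID, actual DID or None if absent)}."""
    found = {}
    for line in lspci_text.splitlines():
        m = LINE.match(line.strip())
        if m and m.group(1).lower() == bus and m.group(2) in EXPECTED:
            found[m.group(2)] = int(m.group(3), 16)
    return {slot: (exp, found.get(slot)) for slot, exp in EXPECTED.items()}


# Hypothetical sample; class code and revision are placeholders.
SAMPLE = """\
7f:14.0 0880: 8086:2fb0 (rev 02)
7f:14.1 0880: 8086:2fb1 (rev 02)
7f:17.0 0880: 8086:2fd0 (rev 02)
7f:17.1 0880: 8086:2fd1 (rev 02)
"""

for slot, (expected, actual) in compare_dids(SAMPLE).items():
    shown = "-N/A-" if actual is None else hex(actual)
    print(f"7f:{slot}  expected {hex(expected)}  actual {shown}")
```

On a real system the text would come from running `lspci -n`. As noted elsewhere in the thread, a device appearing in that output does not by itself prove the hardware behind it is present.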

I am not sure why the single-HA device has a PCI device for MC1 -- I did not look to see if it did anything....

GHui
Novice

Does that mean I need to sum Ctrl 0 Chan 0, 1, 2, 3 and Ctrl 1 Chan 0 to calculate memory bandwidth with 1 HA?

Thomas_G_4
New Contributor II

No, the HAs don't know about the channels of the memory controllers, so the HA events UNC_H_IMC_READS.NORMAL and UNC_H_IMC_WRITES.ALL cover all memory channels of the attached controller(s). It might be that you additionally have to measure the HA event UNC_H_BYPASS_IMC.TAKEN to get accurate numbers because bypasses are not counted by the other two events.
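A rough sketch of that HA-based calculation follows. It rests on two assumptions that are not stated in this reply: each counted event corresponds to one 64-byte cache line, and bypassed requests are reads. The function name `ha_bandwidth` and the counter values are hypothetical; check the uncore performance monitoring guide before relying on either assumption.

```python
CACHE_LINE_BYTES = 64  # assumption: one event == one 64-byte cache line


def ha_bandwidth(imc_reads_normal, imc_writes_all, bypass_taken, seconds):
    """Return (read, write) bandwidth in bytes/s from Home Agent counter deltas.

    imc_reads_normal : delta of UNC_H_IMC_READS.NORMAL
    imc_writes_all   : delta of UNC_H_IMC_WRITES.ALL
    bypass_taken     : delta of UNC_H_BYPASS_IMC.TAKEN
                       (assumption: added to reads, since bypasses are
                       not counted by the other two events)
    seconds          : length of the measurement interval
    """
    read_bytes = (imc_reads_normal + bypass_taken) * CACHE_LINE_BYTES
    write_bytes = imc_writes_all * CACHE_LINE_BYTES
    return read_bytes / seconds, write_bytes / seconds


# Made-up counter deltas over a one-second interval.
reads, writes = ha_bandwidth(1000, 500, 100, 1.0)
print(reads, writes)  # 70400.0 32000.0
```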

GHui
Novice

I use the IMC performance monitors to get memory bandwidth, with this formula:

MEM_BW_TOTAL = MEM_BW_READS + MEM_BW_WRITES
             = (CAS_COUNT.RD * 64) + (CAS_COUNT.WR * 64)
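The formula can be sketched across several channels as follows. The function name `imc_bandwidth`, the per-channel dictionary layout, and the division by the interval length are assumptions added for illustration; the formula as written gives bytes, so a rate needs the elapsed time as well.

```python
BYTES_PER_CAS = 64  # from the formula: each CAS moves one 64-byte line


def imc_bandwidth(channels, seconds):
    """Sum CAS_COUNT deltas over all channels and convert to bytes/s.

    channels : list of dicts with "CAS_COUNT.RD" and "CAS_COUNT.WR" deltas,
               one dict per memory controller channel that really exists
    seconds  : length of the measurement interval
    """
    rd = sum(ch["CAS_COUNT.RD"] for ch in channels) * BYTES_PER_CAS
    wr = sum(ch["CAS_COUNT.WR"] for ch in channels) * BYTES_PER_CAS
    return {
        "MEM_BW_READS": rd / seconds,
        "MEM_BW_WRITES": wr / seconds,
        "MEM_BW_TOTAL": (rd + wr) / seconds,
    }


# Made-up counter deltas for two channels over one second.
sample = [
    {"CAS_COUNT.RD": 1000, "CAS_COUNT.WR": 400},
    {"CAS_COUNT.RD": 3000, "CAS_COUNT.WR": 600},
]
print(imc_bandwidth(sample, 1.0))
```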

On the E5-2630 v4 platform (2 sockets, 10 cores), the memory controller channel devices are 7f:14.0, 7f:14.1, 7f:15.0, 7f:15.1, 7f:17.0 and ff:14.0, ff:14.1, ff:15.0, ff:15.1, ff:17.0.

But on the E5-2680 v4 platform (2 sockets, 14 cores), they are 7f:14.0, 7f:14.1, 7f:17.0, 7f:17.1 and ff:14.0, ff:14.1, ff:17.0, ff:17.1.

On the E5-2630, should I sum all 10 memory controller channel devices to calculate memory bandwidth?

And on the E5-2680, should I sum all 8?

McCalpinJohn
Honored Contributor III

On some systems there are PCI devices shown by "lspci" that do not correspond to hardware that is actually present.  It looks like that might be the case on your Xeon E5-2630 v4 platform.

The Xeon E5-2630 v4 is probably a 10-core die, so it will have only one memory controller, and all four DRAM channels will be on that controller.  From my reading of the Xeon E5 v4 uncore performance monitoring guide, I expect those to be 7f:14.0, 7f:14.1, 7f:15.0, 7f:15.1 on the first socket.  These four devices are channels 0,1,2,3 on memory controller 0.   The 7f:17.0 device would correspond to memory controller 1, DRAM channel 0, but I don't think that exists on this system.

Everything should be the same on the other socket, but with bus 7f replaced with bus ff.

The Xeon E5-2680 v4 is based on the 15-core die, so it has two memory controllers with two DRAM channels each.  14.0 and 14.1 correspond to DRAM channels 0 and 1 on memory controller 0.  The other 2 DRAM channels are on memory controller 1, using devices 17.0 and 17.1.
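The device lists described in this reply can be written down as a small lookup, which makes it explicit which PCI addresses to sum on each platform. The layout names `one_mc` and `two_mc` and the function `imc_channel_devices` are hypothetical labels, and the single-controller list reflects my reading above rather than a confirmed measurement.

```python
# device.function per DRAM channel, as described in this reply.
CHANNEL_DEVICES = {
    # One memory controller with four channels (e.g. the E5-2630 v4 above).
    "one_mc": ["14.0", "14.1", "15.0", "15.1"],
    # Two memory controllers with two channels each (e.g. the E5-2680 v4).
    "two_mc": ["14.0", "14.1", "17.0", "17.1"],
}


def imc_channel_devices(layout, buses=("7f", "ff")):
    """Return the PCI addresses to sum, one bus per socket."""
    return [f"{bus}:{dev}" for bus in buses for dev in CHANNEL_DEVICES[layout]]


print(imc_channel_devices("one_mc"))
print(imc_channel_devices("two_mc"))
```

Under this reading, both two-socket platforms end up with eight channel devices to sum, and 7f:17.0 and ff:17.0 are left out on the E5-2630 v4.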


 
