Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

[PCM] Adding extra events "MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM" and "MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM"

wang__chi-lung
Beginner

I am modifying PCM to monitor events that are not included in the current release.

More specifically, I am interested in "MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM" and "MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM".

The "Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3 (3A, 3B & 3C): System Programming Guide" has this comment for these events: "Disable BL bypass and direct2core (see MSR 0x3C9)".

I assume that I have to disable BL bypass and direct2core first. Then, I can obtain the values of these events.

However, I could not find any place that gives instructions for disabling BL bypass and direct2core.

I would like to know:

1. Do I need to disable BL bypass and direct2core to get the values of "MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM" and "MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM"?

2. If the answer to 1 is yes, how do I do that?

I appreciate any information and guidance to answer these questions. 

Thank you very much.

Ron C. Chiang

11 Replies
Bernard
Valued Contributor I

Have you looked at the description of MSR 0x3C9? Does it contain any information on how to disable BL bypass and direct2core?

wang__chi-lung
Beginner

Thank you for your reply.

I have checked the description of MSR 0x3C9 in the manual, and searched for "BL bypass" and "direct2core" both in the manual and on the internet.

Unfortunately, I did not find any clear information.

Is there any additional document for MSR?

Bernard
Valued Contributor I

>>>Is there any additional document for MSR?>>>

I am afraid that, besides the SDM, there is no more in-depth documentation that is freely available.

wang__chi-lung
Beginner

Thank you for your reply.

Does that imply the answer is in documents that are not freely available?

Also, do I need BIOS support to disable BL bypass and direct2core? If not, can BL bypass and direct2core be disabled by setting an MSR?

Thank you very much!

Bernard
Valued Contributor I

I suppose that more advanced technical documentation is available to BIOS vendors under some kind of NDA.

Unfortunately I do not have an answer to your last question.

Roman_D_Intel
Employee

Please see this article describing the workaround: http://software.intel.com/en-us/articles/performance-monitoring-on-intel-xeon-processor-e5-family

Intel PCM implements it in PCM::enableJKTWorkaround function.

Roman
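
If the workaround comes down to toggling a bit in a model-specific register, as the SDM note about an MSR suggests, the operation is a read-modify-write on every core. The following is a minimal sketch of that pattern, assuming Linux with the msr driver loaded and root privileges. The function names with_bit and set_msr_bit are hypothetical and are not taken from PCM; the register address and bit position are deliberately left as parameters and must come from the linked article, not from this example.

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Pure helper: return `value` with bit number `bit` set or cleared.
constexpr std::uint64_t with_bit(std::uint64_t value, unsigned bit, bool set) {
    return set ? (value | (1ULL << bit)) : (value & ~(1ULL << bit));
}

// Read-modify-write a single bit of MSR `msr` on logical CPU `cpu` through
// the Linux msr driver (/dev/cpu/<cpu>/msr). The file offset is the MSR
// address and each access is 8 bytes. Returns true on success.
bool set_msr_bit(int cpu, std::uint32_t msr, unsigned bit, bool set) {
    assert(bit < 64);
    const std::string path = "/dev/cpu/" + std::to_string(cpu) + "/msr";
    const int fd = open(path.c_str(), O_RDWR);
    if (fd < 0) return false;  // driver not loaded, no permission, or no such CPU

    std::uint64_t value = 0;
    const ssize_t want = static_cast<ssize_t>(sizeof value);
    bool ok = pread(fd, &value, sizeof value, msr) == want;
    if (ok) {
        value = with_bit(value, bit, set);
        ok = pwrite(fd, &value, sizeof value, msr) == want;
    }
    close(fd);
    return ok;
}
```

A caller would loop over all logical CPUs and call set_msr_bit for each one. Reading the register first preserves the other bits, which matters because the remaining fields of such registers are usually undocumented.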

Bernard
Valued Contributor I

Hi Ron,

Where in the manual is the description of MSR 0x3C9?

wang__chi-lung
Beginner

Sorry for the late reply, and thanks for the responses from iliyapolak and Roman.

iliyapolak,

The only places I can find that mention MSR 0x3C9 are Tables 18-39 and 35-18 (in Vol. 3 of the manual), but I did not find any explanation there.

Roman, thank you for the hint.

My machine has a Xeon E5-2620. PCM detects it as "Intel(r) microarchitecture codename Sandy Bridge-EP/Jaketown".

After seeing your reply and tracing the code again, I found that the workaround is already enabled. So, I started to debug my modification.

The original pcm.x monitors 4 events on this machine. My first attempt was to extend "coreEventDesc" in cpucounters.cpp. For example, I added the following code for one extra event:

"coreEventDesc[4].event_number = MEM_LOAD_UOPS_LLC_MISS_RETIRED_LOCAL_DRAM_EVTNR;

coreEventDesc[4].umask_value = MEM_LOAD_UOPS_LLC_MISS_RETIRED_LOCAL_DRAM_UMASK;"

I also changed the corresponding counts, e.g., core_gen_counter_num_used = 5; // It was 4 before the modification.

However, I never got a valid number from this implementation.

Then I tried another way today: instead of adding an extra entry to "coreEventDesc", I changed one of the 4 originally monitored events to the one I want.

It works. Now I can collect other events.

I am happy to have this problem solved, but I have one more question. pcm.x shows "Number of core PMU generic (programmable) counters: 8", so I thought I had 8 counters to use, which is why I made my first attempt. However, it seems that only 4 are available?

Ron
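
For readers following the coreEventDesc change: architecturally, each programmable counter is configured by writing an event number and a umask into its IA32_PERFEVTSELx register. The sketch below shows that encoding as laid out in the SDM. The function names are hypothetical and are not PCM's, and the 0xD3/0x01 pair is an assumed, illustrative encoding for MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM that should be checked against the SDM event tables for the target CPU.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: illustrative encoding only; verify against the SDM event
// tables for your processor before relying on it.
constexpr std::uint8_t kAssumedLocalDramEvent = 0xD3;
constexpr std::uint8_t kAssumedLocalDramUmask = 0x01;

// Build an IA32_PERFEVTSELx value:
//   bits  7:0  event select
//   bits 15:8  unit mask
//   bit  16    count in user mode (USR)
//   bit  17    count in kernel mode (OS)
//   bit  22    enable the counter (EN)
constexpr std::uint64_t make_perfevtsel(std::uint8_t event, std::uint8_t umask,
                                        bool usr, bool os) {
    return static_cast<std::uint64_t>(event)
         | (static_cast<std::uint64_t>(umask) << 8)
         | (static_cast<std::uint64_t>(usr) << 16)
         | (static_cast<std::uint64_t>(os) << 17)
         | (1ULL << 22);
}

// MSR address of IA32_PERFEVTSELn (IA32_PERFEVTSEL0 is 0x186).
inline std::uint32_t perfevtsel_msr(unsigned counter) {
    assert(counter < 8);
    return 0x186 + counter;
}

// MSR address of the matching count register IA32_PMCn (IA32_PMC0 is 0xC1).
inline std::uint32_t pmc_msr(unsigned counter) {
    assert(counter < 8);
    return 0xC1 + counter;
}
```

With the assumed values, make_perfevtsel(kAssumedLocalDramEvent, kAssumedLocalDramUmask, true, true) evaluates to 0x4301D3, which would be written to the MSR at perfevtsel_msr(n); the count is then read back from pmc_msr(n).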

McCalpinJohn
Honored Contributor III

The Sandy Bridge core has 8 programmable performance counters, but these are split into two groups of four when HyperThreading is enabled.

wang__chi-lung
Beginner

Thank you, John,

I did disable HyperThreading on this machine.

Roman_D_Intel
Employee

Hi,

When Hyperthreading is disabled, there should be 8 programmable counters in hardware on your system; 4 hardware programmable counters are available if Hyperthreading is enabled. Intel PCM currently supports 4 programmable counters. We might address this in a future version of Intel PCM.

Roman
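
The number of programmable counters per logical processor is reported architecturally in CPUID leaf 0AH (EAX bits 15:8), and a tool may use fewer than the hardware offers. The sketch below decodes that field and applies a tool-side cap. The struct and function names are hypothetical, and the sample EAX value used in the note after the code is made up for illustration rather than read from a real machine.

```cpp
#include <cassert>
#include <cstdint>

// Fields of CPUID.0AH:EAX that describe the general-purpose counters.
struct PmuInfo {
    unsigned version;      // bits  7:0  architectural PMU version
    unsigned gp_counters;  // bits 15:8  programmable counters per logical CPU
    unsigned gp_width;     // bits 23:16 counter width in bits
};

// Decode a raw EAX value obtained from CPUID leaf 0AH.
inline PmuInfo decode_cpuid_0a_eax(std::uint32_t eax) {
    PmuInfo info;
    info.version = eax & 0xFF;
    info.gp_counters = (eax >> 8) & 0xFF;
    info.gp_width = (eax >> 16) & 0xFF;
    return info;
}

// Counters a tool can actually program: the smaller of what the hardware
// reports and what the tool supports (4 for the PCM version discussed here).
inline unsigned usable_counters(const PmuInfo& info, unsigned tool_limit) {
    assert(tool_limit > 0);
    return info.gp_counters < tool_limit ? info.gp_counters : tool_limit;
}
```

For a hypothetical EAX of 0x07300803, decode_cpuid_0a_eax reports version 3, 8 counters, and a 48-bit width, and usable_counters with a limit of 4 returns 4. On GCC or Clang for x86, the real EAX value can be obtained with __get_cpuid(0x0A, &eax, &ebx, &ecx, &edx) from <cpuid.h>.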
