Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

How can I build Intel PCM without the NMI watchdog check?

Wenqin_C_
Beginner

Since PCM 2.7, pcm first checks whether the NMI watchdog is disabled on Linux systems, so the NMI watchdog must be disabled before running pcm.x.

Is it possible to build pcm.x without the NMI watchdog check? If so, how?

What does 'Prevent metric value corruption by Linux NMI watchdog' mean? Is there any detailed information?

I only use pcm.x to gather the CPU's real-time frequency, and I don't want to disable the NMI watchdog because I have no idea what will happen after disabling it.

Thomas_W_Intel
Employee

You can disable the check, but this check is there for a reason. PCM uses the performance counters to monitor the hardware. The NMI watchdog uses the same unit. That's why they can't run at the same time.

The NMI watchdog is a Linux feature that detects when your system becomes unresponsive (and then takes countermeasures). Details can be found in the Linux documentation.
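
As a minimal sketch of how one might see whether the watchdog is currently enabled before starting pcm.x: this assumes the standard Linux sysctl file /proc/sys/kernel/nmi_watchdog, which is not mentioned above, and the function name is made up for illustration. It only reads the state and does not change it.

```python
from pathlib import Path


def nmi_watchdog_enabled(path="/proc/sys/kernel/nmi_watchdog"):
    """Return True if the watchdog is on, False if off, None if unknown.

    The default path is the usual Linux sysctl file (an assumption here);
    pass another path to test against a fake file.
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        # File missing or unreadable (e.g. not Linux, or no permission).
        return None
    if text == "1":
        return True
    if text == "0":
        return False
    return None


if __name__ == "__main__":
    state = nmi_watchdog_enabled()
    labels = {True: "enabled", False: "disabled", None: "unknown"}
    print("NMI watchdog:", labels[state])
```

If this reports "enabled", PCM's check would be expected to complain, since both want to use the performance counters.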

McCalpinJohn
Honored Contributor III

There is nothing to prevent multiple processes from *reading* the performance counters -- I do this all the time.

The NMI watchdog does more than read the counters -- it programs a counter to generate an interrupt on overflow, and then installs an interrupt handler to respond to that interrupt.  The NMI watchdog typically resets the "CPU cycles not halted" counter (using either the fixed-function counter or one of the programmable counters) to (1<<48)-(1<<31), so that it will overflow every 2^31 (~2 billion) cycles.  If you are reading the same counter and doing it much more frequently than once every 2 billion cycles, then most of your computed differences between counter readings will be valid.  I simply discard any results for which the counter appears to be decreasing (since that means the NMI watchdog reset the counter during my measurement interval).
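
To make the arithmetic and the "discard decreasing readings" rule concrete, here is a small sketch. The function names and the sample readings are invented for illustration, and the 48-bit counter width is taken from the reset value quoted above; real readings would come from whatever mechanism you use to read the counter.

```python
COUNTER_WIDTH = 48                      # counter width implied by the reset value
RESET_VALUE = (1 << 48) - (1 << 31)     # value the watchdog writes to the counter


def cycles_until_overflow(reset_value=RESET_VALUE, width=COUNTER_WIDTH):
    """Number of counts between a reset and the next overflow."""
    return (1 << width) - reset_value


def valid_deltas(readings):
    """Differences between consecutive readings, skipping any interval
    in which the counter appears to decrease (a reset happened)."""
    deltas = []
    for prev, cur in zip(readings, readings[1:]):
        if cur >= prev:
            deltas.append(cur - prev)
    return deltas


if __name__ == "__main__":
    print(cycles_until_overflow())      # 2147483648, i.e. 2**31
    # Hypothetical readings: the fourth one is lower, so that interval is dropped.
    samples = [RESET_VALUE + n for n in (0, 1000, 2500, 200, 900)]
    print(valid_deltas(samples))        # [1000, 1500, 700]
```

This heuristic only catches resets that make the counter appear to go backwards, which is why the readings need to be much more frequent than the overflow period.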

Intel performance tools typically also use the "generate interrupt on counter overflow" feature to sample application performance.  When the NMI watchdog is running you now have a problem, because two different processes want to program the performance counters to generate interrupts on overflow and they want to respond to those performance counter overflow interrupts in different ways.

It might be possible to build a single service routine that checks the performance counter overflow flags to determine which counter overflowed, and then take different actions depending on which counter overflowed -- e.g., if the counter in use by the NMI watchdog overflows, then execute the NMI watchdog's routine, otherwise execute the routine provided by VTune or PCM or whatever is appropriate.  I don't think that the Linux kernel provides the facilities to support this mode of operation.
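
Purely to illustrate the dispatch logic described in the previous paragraph, and not anything the kernel, PCM, or VTune actually provides: the sketch below treats the overflow flags as a plain integer bitmask and routes each overflowed counter to one of two handlers. All names here are hypothetical, and the choice of bit 33 for the watchdog's counter is just an example value.

```python
def overflowed_counters(status, num_bits=64):
    """Indices of the bits set in an overflow-status bitmask."""
    return [i for i in range(num_bits) if (status >> i) & 1]


def dispatch(status, watchdog_counter, watchdog_handler, tool_handler):
    """Call the watchdog's handler for its own counter and the
    profiling tool's handler for every other overflowed counter."""
    for idx in overflowed_counters(status):
        if idx == watchdog_counter:
            watchdog_handler(idx)
        else:
            tool_handler(idx)


if __name__ == "__main__":
    # Hypothetical status word: counters 0 and 33 have overflowed.
    status = (1 << 0) | (1 << 33)
    dispatch(
        status,
        watchdog_counter=33,
        watchdog_handler=lambda i: print("watchdog routine for counter", i),
        tool_handler=lambda i: print("tool routine for counter", i),
    )
```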
 

Wenqin_C_
Beginner

Thomas Willhalm (Intel) wrote:

You can disable the check, but this check is there for a reason. PCM uses the performance counters to monitor the hardware. The NMI watchdog uses the same unit. That's why they can't run at the same time.

The NMI watchdog is a Linux feature that detects when your system becomes unresponsive (and then takes countermeasures). Details can be found in the Linux documentation.

Is there any other way to disable the check without modifying PCM's source code? And what will happen when they run at the same time? Will running PCM without disabling the NMI watchdog make my system unstable (e.g., cause a fatal error such as a kernel panic)?
