Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Xeon, VNNI, and Throttling

Aronj2

Hey, I've got a tricky performance snag on my new Intel Xeon cluster, which runs a heavily optimized HPC workload (using VNNI for INT8 gains).

It's fast, but sometimes it just stalls out. I think it might be due to cache-line contention.

So I've been wondering:

If I only had Intel VTune and access to raw PMT data, how should I approach this?

  1. What are the top two or three PMU events to profile in VTune to confirm it's actually cache contention and not just bad instruction flow?

  2. How can I use the PMT data (power, thermals, etc.) to say definitively, 'No, this isn't contention, the platform is just throttling the clock speed'?

  3. And finally, say the real fix is a vendor-pushed microcode patch for better VNNI instruction scheduling. How do I verify that this critical little binary file is legitimate before the CPU loads it? I'm thinking of code signing and Intel TXT here.
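To make question 2 concrete, here is a rough sketch of the triage I have in mind, assuming I can line up per-interval samples of core frequency, application throughput, and some cross-core contention counter on a common timeline. Everything in it is hypothetical: the `Sample` fields, the `classify_stalls` helper, and the thresholds are my own placeholders, not anything VTune or PMT exposes under those names.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class Sample:
    t: float           # seconds since the run started
    freq_mhz: float    # observed core frequency (assumed to come from telemetry)
    throughput: float  # application-level work rate, any consistent unit
    hitm_loads: int    # placeholder for a cross-core "hit modified" load count


def classify_stalls(samples: List[Sample], base_freq_mhz: float,
                    stall_ratio: float = 0.5, freq_ratio: float = 0.9,
                    hitm_factor: float = 3.0) -> Dict[float, str]:
    """Label every stalled sample with its most likely cause (assumed thresholds)."""
    if not samples:
        return {}
    peak = max(s.throughput for s in samples)
    # A sample counts as stalled when throughput drops below a fraction of the peak.
    stalled = [s for s in samples if s.throughput < stall_ratio * peak]
    healthy = [s for s in samples if s.throughput >= stall_ratio * peak]
    # The contention baseline comes from the intervals that ran at full speed.
    baseline = max(median(s.hitm_loads for s in healthy), 1)
    labels: Dict[float, str] = {}
    for s in stalled:
        freq_low = s.freq_mhz < freq_ratio * base_freq_mhz
        hitm_high = s.hitm_loads > hitm_factor * baseline
        if freq_low and hitm_high:
            labels[s.t] = "both"
        elif freq_low:
            labels[s.t] = "throttling"
        elif hitm_high:
            labels[s.t] = "contention"
        else:
            labels[s.t] = "unexplained"
    return labels


if __name__ == "__main__":
    run = [
        Sample(0, 3000, 100, 10),
        Sample(1, 3000, 98, 12),
        Sample(2, 1800, 30, 11),   # clock dropped, contention flat
        Sample(3, 3000, 25, 90),   # clock fine, contention spiked
    ]
    print(classify_stalls(run, base_freq_mhz=3000))
```

The idea is that a stall with a low clock and a flat contention count points at throttling, while a stall at full clock with a contention spike points at the cache. What I don't know is which real PMU events and PMT fields should feed those two columns.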
