Software Tuning, Performance Optimization & Platform Monitoring
Discussion around monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform monitoring

Does disabling H/W prefetching affect prefetch/prefetchw instructions?


According to

We can disable H/W prefetching through an MSR. Will that also affect the prefetch/prefetchw instructions (e.g., will prefetch/prefetchw stop working)?




2 Replies
Black Belt

Disabling hardware prefetch via the MSR controls does not disable software prefetch instructions.

Software prefetch instructions are very seldom automatically generated by compilers except when generating code for first generation Xeon Phi (Knights Corner) systems, where they are typically essential for good performance.

Software prefetch instructions have some advantages over loads -- they can be retired immediately, rather than waiting for the data to arrive, so they don't cause the out-of-order execution mechanisms to "back up".  On the other hand, software prefetches use the same Line Fill Buffers as ordinary loads, so they do not raise the maximum number of concurrent misses.  (Carefully placed software prefetches can increase *effective* concurrency in cases where the hardware prefetchers are either ineffective or disabled, but most Intel processors are limited to a maximum of 10 outstanding L1 Data Cache misses per core, using any combination of loads and/or software prefetches.)
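
To make the "carefully placed software prefetches" point concrete, here is a minimal sketch of an indirect (gather-style) sum, a pattern the hardware prefetchers usually cannot predict. It assumes GCC or Clang (`__builtin_prefetch` is their builtin, not a standard C function); the name `gather_sum` and the distance `PREFETCH_DISTANCE` are hypothetical and would need tuning on a real machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance, in loop iterations; tune per machine. */
#define PREFETCH_DISTANCE 16

/* Sum data[idx[i]] for i in [0, n). The addresses depend on idx[], so
   stride-based hardware prefetchers generally cannot predict them. */
double gather_sum(const double *data, const size_t *idx, size_t n)
{
    assert((data != NULL && idx != NULL) || n == 0);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Hint only: read access (0), keep in all cache levels (3).
               The instruction can retire without waiting for the data. */
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 3);
        }
        sum += data[idx[i]];
    }
    return sum;
}
```

Each prefetch still occupies a Line Fill Buffer while its miss is outstanding, so a larger distance does not buy concurrency beyond the limit described above.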

If you want to eliminate compiler-generated software prefetches, the "-qopt-prefetch=0" option (or the identical "-qno-opt-prefetch" option) should be enough.  If the code contains explicit prefetch pragmas, then it would be a good idea to check the generated assembly code to see which option takes precedence.  Explicit software prefetch intrinsics (or inline assembly) should be unaffected by the "-qopt-prefetch=0" option.
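
For reference, an explicit prefetch intrinsic looks like the sketch below. `_mm_prefetch` and `_MM_HINT_T0` come from `<xmmintrin.h>` on x86 compilers; the function name `scale_array`, the distance `AHEAD`, and the 64-byte cache line assumption are hypothetical choices for illustration. The non-x86 fallback uses the GCC/Clang builtin only so that the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 */
#define PREFETCH_T0(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#else
/* Fallback for non-x86 targets (GCC/Clang builtin). */
#define PREFETCH_T0(p) __builtin_prefetch((p), 0, 3)
#endif

#define AHEAD 64            /* hypothetical distance, in elements */
#define FLOATS_PER_LINE 16  /* assumes 64-byte cache lines */

/* dst[i] = k * src[i], with one explicit prefetch per source cache line. */
void scale_array(float *dst, const float *src, size_t n, float k)
{
    assert((dst != NULL && src != NULL) || n == 0);

    for (size_t i = 0; i < n; i++) {
        if ((i % FLOATS_PER_LINE) == 0 && i + AHEAD < n) {
            PREFETCH_T0(&src[i + AHEAD]);
        }
        dst[i] = k * src[i];
    }
}
```

Compiling a file like this to assembly and searching the output for prefetch instructions is a quick way to confirm what survived a given set of compiler options.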

Black Belt

Although Intel had to change their CPUs a decade ago so that prefetchw would no longer raise an illegal-instruction exception (for "compatibility"), I haven't seen documentation on whether it has any effect on CPUs other than AMD's.
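
One thing that can be checked from software is whether the CPU reports PREFETCHW support at all. The sketch below reads the CPUID.80000001H:ECX bit 8 feature flag using `__get_cpuid` from the GCC/Clang `<cpuid.h>` header. The function names are hypothetical, and the bit only says the instruction is supported, not what a given microarchitecture does with the hint.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* __get_cpuid (GCC/Clang) */
#endif

/* Returns 1 if CPUID reports PREFETCHW, 0 if not,
   -1 if the leaf is unavailable or the target is not x86. */
int cpu_reports_prefetchw(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx)) {
        return -1;
    }
    return (int)((ecx >> 8) & 1u);  /* ECX bit 8: PREFETCHW */
#else
    return -1;
#endif
}

/* Prefetch with write intent. Whether this becomes PREFETCHW depends on
   the compiler and target options (for example GCC's -mprfchw);
   otherwise it is typically emitted as an ordinary read-style prefetch. */
void prefetch_for_write(void *p)
{
    assert(p != NULL);
    __builtin_prefetch(p, 1, 3);
}
```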