The IVF documentation for the intrinsic subroutine MM_PREFETCH describes the hint argument as follows:
hint = 0 ... Use this for integer data
hint = 1 ... Use this for REAL data
The documentation for the compiler directive PREFETCH carries no such note, and its examples seem to indicate that prefetching REALs into L1 may be effective.
Does this mean that using MM_PREFETCH to bring REALs into L1 is not (generally) effective? .OR. does it imply that the VPU does not fetch from L1?
I am targeting KNL. I experimented with prefetching into L2. This did not help; it had a very small detrimental effect. So it appears that the hardware prefetcher may already have brought the data into L2 (or there were excessive TLB misses). I did not experiment with prefetching only to L1, which is counter-productive when the data is not already present in L2.
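
For concreteness, here is a minimal sketch of the kind of experiment described above. It is not the actual code: the array size, the two prefetch distances, and all names are illustrative assumptions, and MM_PREFETCH is an Intel Fortran intrinsic that other compilers may not accept.

```fortran
! Sketch only: n, dist_l2, dist_l1 and the program name are hypothetical.
! Requires Intel Fortran; MM_PREFETCH is not standard Fortran.
program prefetch_sketch
  implicit none
  integer, parameter :: n = 1000000
  integer, parameter :: dist_l2 = 64   ! assumed look-ahead (in elements) for the L2 prefetch
  integer, parameter :: dist_l1 = 16   ! assumed look-ahead (in elements) for the L1 prefetch
  real(8), allocatable :: a(:)
  real(8) :: s
  integer :: i

  allocate(a(n))
  a = 1.0d0
  s = 0.0d0

  do i = 1, n
    ! hint = 1: the value the documentation recommends for REAL data
    call mm_prefetch(a(min(i + dist_l2, n)), 1)
    ! hint = 0: the value the documentation recommends for integer data
    call mm_prefetch(a(min(i + dist_l1, n)), 0)
    s = s + a(i)
  end do

  print *, s
end program prefetch_sketch
```

Removing one of the two calls gives the L2-only or L1-only variant. The min() clamp keeps the prefetched element inside the array. A tuned version would likely issue one prefetch per cache line rather than one per element, and suitable distances have to be found by measurement.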