
Performance of YASK with different snoop configuration modes on Xeons

I was wondering how the performance of the YASK stencil benchmarks varies with the snoop configuration mode on Haswell or Broadwell Xeons: Early Snoop vs. Home Snoop vs. Cluster-on-Die?

Thanks,

Michael

1 Reply

I have not run these benchmarks, but most stencil operations are bandwidth-limited, so they should benefit from the higher bandwidth of "Home Snoop" vs "Early Snoop". If the implementation is NUMA-friendly, then "Cluster-on-Die" should provide an additional benefit.
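To make the "bandwidth-limited" point concrete, here is a rough roofline-style estimate in Python. It is only a sketch: the 7-point stencil, the FLOP and byte counts per grid point, and both hardware numbers (`PEAK_GFLOPS`, `BANDWIDTH_GBS`) are hypothetical placeholders chosen for illustration, not measurements from any system or from YASK itself.

```python
# Roofline-style estimate of whether a stencil sweep is bandwidth-limited.
# All hardware numbers below are hypothetical placeholders, not measurements.


def arithmetic_intensity(flops_per_point, bytes_per_point):
    """FLOPs performed per byte moved to or from main memory."""
    return flops_per_point / bytes_per_point


def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_gflops, intensity * bandwidth_gbs)


# Assumed kernel: 7-point 3D stencil in double precision with one coefficient
# per neighbour, i.e. 7 multiplies + 6 adds = 13 FLOPs per grid point.
FLOPS_PER_POINT = 13
# Assumed traffic: with ideal cache reuse of neighbours, each point costs one
# 8-byte read of the input grid and one 8-byte write of the output grid
# (write-allocate traffic is ignored here).
BYTES_PER_POINT = 16

PEAK_GFLOPS = 400.0    # hypothetical per-socket compute peak
BANDWIDTH_GBS = 60.0   # hypothetical sustained memory bandwidth

ai = arithmetic_intensity(FLOPS_PER_POINT, BYTES_PER_POINT)
bound = attainable_gflops(ai, PEAK_GFLOPS, BANDWIDTH_GBS)
machine_balance = PEAK_GFLOPS / BANDWIDTH_GBS  # FLOPs/byte needed to reach peak

print(f"arithmetic intensity: {ai:.3f} FLOP/byte")
print(f"machine balance:      {machine_balance:.3f} FLOP/byte")
print(f"attainable:           {bound:.2f} GFLOP/s")
print("bandwidth-limited" if ai < machine_balance else "compute-limited")
```

Under these assumptions the kernel's arithmetic intensity sits well below the machine balance, so the attainable rate scales directly with memory bandwidth. That is why any change in sustained bandwidth from the snoop mode would be expected to show up in stencil performance.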

The local bandwidth difference between "Home Snoop" and "Early Snoop" is not large, but there is a very big difference in remote bandwidth on the systems I have tested (mostly Xeon E5 v3 "Haswell EP"). The attached chart shows results I obtained using the Intel Memory Latency Checker on a 2-socket Xeon E5-2660 v3 system. NOTE that these are REMOTE bandwidth numbers only; the local bandwidth numbers are much, much closer.

HSW-EP_RemoteBW-vs-SnoopMode_v2.png
