
Xeon E5 v3 - 2S vs. 4S CPUs - L3 Cache Size Impact on Performance


I've tried searching for an answer to this question, but I've found it difficult to get concrete information, so I thought I'd ask here. Sorry for the complex title, but the essence of my question is this: in the Xeon E5 v3 family of CPUs, the 4S SKUs often have more L3 cache than the 1S/2S SKUs; does this provide a material benefit when a 4S SKU is used in a single- or dual-CPU configuration?

For instance, consider the following two broadly similar chips: the E5-2620 v3 and the E5-4655 v3. Both are 6 core/12 thread, both have a 3.2 GHz max turbo clock, and both have identical L1 cache (6 x 32 KB 8-way set associative, instruction and data) and identical L2 cache (6 x 256 KB 8-way set associative). However, the E5-4655 v3 has a higher base clock (2.9 GHz vs 2.4 GHz) and double the L3 cache (30 MB vs 15 MB, both 20-way set associative and shared). Setting aside the base clock difference, would it be reasonable to expect a measurable difference between these CPUs in a single- or dual-CPU configuration due to the L3 cache difference alone? My workloads are varied, including Windows/Linux/FreeBSD virtualisation, video transcoding, databases, and archive compression/extraction.
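To put a number on the comparison, here's a rough sketch of the L3-per-core arithmetic for the two example chips. The core counts and cache sizes are the ones quoted above; the `Cpu` class and `l3_summary` helper are just names I made up for illustration. Bear in mind the L3 is shared, so "per core" is only a nominal figure and not a hard partition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """Minimal spec record (hypothetical helper, not from any library)."""
    name: str
    cores: int
    l3_mb: float  # total shared L3 in MB

    @property
    def l3_per_core_mb(self) -> float:
        # Nominal share only: the L3 is shared across all cores.
        return self.l3_mb / self.cores


# Figures taken from the two example chips in this post.
CPUS = [
    Cpu("E5-2620 v3", cores=6, l3_mb=15.0),
    Cpu("E5-4655 v3", cores=6, l3_mb=30.0),
]


def l3_summary(cpus):
    """Map each CPU name to its nominal L3 per core in MB."""
    return {cpu.name: cpu.l3_per_core_mb for cpu in cpus}


if __name__ == "__main__":
    for name, mb in l3_summary(CPUS).items():
        print(f"{name}: {mb:.1f} MB L3 per core")
```

So the question is really whether going from a nominal 5 MB per core down to 2.5 MB per core matters for the workloads listed.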

I ask because I currently have a Dell R730 server with two E5-4655 v3 CPUs, and I'm considering moving to a single CPU in the not too distant future. I don't really need the PCIe lanes or extra memory capacity offered by the second CPU, and I'd like to cut power consumption and avoid the potential performance penalties inherent in a 2S system. Prices for used 2S CPUs are typically quite a bit lower than those for 4S CPUs with similar specs (core/thread count, base/boost clocks), but I wonder what impact halving the L3 cache would have. Keep in mind that if I did consolidate two CPUs into one, I wouldn't be buying the E5-2620 v3 mentioned above (that was just for illustrative purposes); I'd be getting a >= 12 core part (as yet undecided).

I understand this question may not have a simple, concrete answer, but I would appreciate any input or relevant benchmark data. Thanks.
