On Intel's website, the new Xeon E5 v3 parts (listed here: http://ark.intel.com/products/family/78583/Intel-Xeon-Processor-E5-v3-Family#@All ) list only "Intel AVX" as a supported instruction set extension; it's a bit disturbing to think they would not support AVX2. Maybe these pages need to be more precise ;)
v3 has been consistently Ivy Bridge with no AVX2. There appears to be no urgency toward introducing AVX2 server CPUs.
With most of those priced higher for a single CPU than what we were paying for an entire AVX2 computer last year, I don't expect to see many.
For the Xeon E5, the original version was Sandy Bridge and the "v2" parts are Ivy Bridge, so "v3" should be Haswell EP.
Looking at the ARK page today, I see that there are new Xeon E5 v3 parts that fit this expectation: e.g.,
I agree that these entries should be updated to indicate AVX2 (rather than AVX), since the "flags" field from /proc/cpuinfo indicates AVX2 support.
Thank you. It would indeed have been surprising for a Haswell part not to support AVX2.
As an additional question, are there Xeon E5 v3 parts with Iris Pro on the roadmap, or is that only for the Core i7?
PS: @Tim Prince, the Xeon E5 may be branded as a server processor, but keep in mind it also ships in desktop workstations (e.g., the HP Z620 / Z420 for the E5 v2).
This makes it quite confusing if one has to know whether a part is a server CPU in order to know whether "v3" or "v4" means Haswell/AVX2. Even the people who create the ark.intel.com entries are confused, according to the follow-up above.
Yes, HP historically has used server CPUs even in some single-CPU workstations. Server CPUs would be required for a dual-CPU workstation in any case. There may be enough of a cost and negotiation advantage to make it work for them.
Seems that Ark entries are just less detailed for Xeon processors than for Core processors.
v2 => Ivy Bridge, v3 => Haswell
Server CPUs are useful not only for dual-socket configurations but also for their much larger cache.
When I had the opportunity to test a dual 12-core Ivy Bridge system, my application's cache-sensitive performance didn't scale, so it was necessary to use an MPI/OpenMP hybrid to take advantage of all the cores. Consequently, many data centers chose parts with fewer cores and smaller caches.
Haswell, with its more complete 256-bit bus implementation, should make better use of L2 cache, but I'm not holding my breath waiting to see useful data on this.
I have not done a lot with the Ivy Bridge systems, but I was pleased by the scaling of the L3 performance on the Xeon E5-2680 (Sandy Bridge EP). L3 latency is a bit higher than I see on the client-based Xeon E3-1270, but the Xeon E5-2680 gets a 7x speedup using 8 threads on an L3-contained version of STREAM. I guess a ring has to run out of bandwidth at some point, but it appears to have no trouble with 8 cores.
Now I guess I need to run the same tests on our Haswell EP systems and Ivy Bridge EP systems for comparison.