Andrew_L_5 · ‎10-19-2015

The AVX Base and Turbo Frequencies for the Xeon E5 v3 CPUs are well documented:

http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/performance-xeon-e5-v3-advanced-vector-extensions-paper.pdf

http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-v3-spec-update.pdf

Do other Intel processors also have AVX frequency range that is different from the normal base/turbo frequency range?

Specifically, what about i7 processors that have AVX2 support? Can they sustain AVX2 FMA instructions at full throughput and stay in the nominal base frequency/turbo frequency range?

Apologies in advance if this is the wrong forum for this question. I also asked on Intel support forum, but got no response: https://communities.intel.com/thread/87851
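
For reference, "full throughput" in the question above can be put into numbers. This is a minimal sketch, assuming a Haswell core with two 256-bit FMA units, which gives 16 double-precision FLOPs per cycle per core. The function name `peak_dp_gflops` and the 4-core, 3.5 GHz figures are illustrative assumptions, not the specification of any particular part.

```python
def peak_dp_gflops(cores, ghz, fma_units=2, vector_bits=256):
    """Theoretical peak double-precision GFLOP/s for FMA-capable cores.

    Assumes every FMA unit issues one vector FMA per cycle, and that each
    FMA counts as two floating-point operations (multiply + add).
    """
    lanes = vector_bits // 64                 # 64-bit doubles per vector
    flops_per_cycle = fma_units * lanes * 2   # per core, per cycle
    return cores * ghz * flops_per_cycle


if __name__ == "__main__":
    # Hypothetical 4-core part running at 3.5 GHz.
    print(peak_dp_gflops(cores=4, ghz=3.5))
```

Any sustained frequency drop under AVX2 load scales this peak down proportionally, which is why the frequency question matters.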

Michael_H_8 · ‎11-13-2015

*bump*

Very good question. I'd love to know the answer as well. Thanks, Mike

McCalpinJohn · ‎11-20-2015

I have not tested this systematically, but so far I have not seen any indication that the maximum Turbo frequencies are different for code that uses 256-bit registers and code that does not use 256-bit registers on the Haswell "client" parts. The average sustained frequency is likely to be lower when using 256-bit registers if the processor hits either its power limit or its thermal limit.

I should be able to test this shortly on a Core i7-4690HQ and on a Haswell-based Core i5.
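
One rough way to run such a comparison on Linux is to sample the kernel-reported clock while a 256-bit workload and then a non-256-bit workload run in another process. The sketch below only covers the sampling side; it assumes an x86 Linux system where `/proc/cpuinfo` contains `cpu MHz` lines, and the helper names are hypothetical. The reported value is a snapshot, so hardware counters (APERF/MPERF) would be the more precise tool if available.

```python
import re


def parse_cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' value of each logical CPU found in the text."""
    matches = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    return [float(m) for m in matches]


def sample_max_mhz(path="/proc/cpuinfo"):
    """Read the file once and return the highest reported clock, or None."""
    with open(path) as f:
        values = parse_cpu_mhz(f.read())
    return max(values) if values else None
```

Calling `sample_max_mhz()` repeatedly during each workload and comparing the highest values seen would show whether the peak clock differs between the two cases.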

jimdempseyatthecove · ‎11-21-2015

Presumably when using 256-bit registers (full width) one gets the same amount of work done in fewer instructions than when not using 256-bit registers (full width). In this sense, less work == less watts == less heat == more time in Turbo

Additionally, when using the wider registers (full width), the operations are distributed over a wider area of silicon, and for a shorter time. This may affect the peak heat measured in localized positions within the core. Whether this affects the Turbo clock frequency would be a subject for investigation.

Jim Dempsey

TimP · ‎11-21-2015

I haven't been able to find any references about Haswell client turbo mode. In earlier client CPUs, there weren't both power consumption and thermal limits.

On my Haswell i5-4200U, the original behavior, where using all logical processors even briefly would cut turbo boost for long intervals, has changed. Allowing all threads to run no longer reduces performance much. I suspect this came about from a BIOS update. I still have the situation where Intel OpenMP runs best with OMP_PLACES=cores and the corresponding OMP_NUM_THREADS, while libgomp doesn't implement OMP_PLACES and needs more threads, but fewer than the total number of logical processors. Cilk(tm) Plus acts more like libgomp, in the absence of affinity.

I have tested SSE4 and AVX-128 vs. 256-bit AVX and AVX2 and generally see as much gain with the latter as could be expected (frequently as much as 40% more performance, even when comparing VS2012 vs. VS2015). As John said, there's no evidence of the wider register usage cutting turbo boost in client CPUs.

Another apparent consequence of either a BIOS or OS update is that I haven't been able to bring up the BIOS setup menu.

Bernard · ‎12-18-2015

From a theoretical point of view, using wider physical registers may dissipate more power per unit area (mm^2) per unit time (1.0e-3 sec) than using 128-bit physical registers, but on the other hand the time spent in a calculation can be shorter, as Jim pointed out.
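
The trade-off described here is between instantaneous power, which is what the power and thermal limits mentioned earlier in the thread act on, and total energy for the job. A small sketch with purely hypothetical numbers (the wattages and run times below are made up for illustration, not measurements):

```python
def energy_joules(power_watts, seconds):
    """Energy consumed at a constant power draw over a given time."""
    return power_watts * seconds


if __name__ == "__main__":
    # Hypothetical: 128-bit code draws less power but runs longer.
    narrow = energy_joules(power_watts=40.0, seconds=10.0)
    # Hypothetical: 256-bit code draws more power but finishes sooner.
    wide = energy_joules(power_watts=55.0, seconds=6.0)
    print(narrow, wide)
```

With these made-up figures the 256-bit run uses less total energy while drawing more power during the run, so it could still be the case that hits a power or thermal limit first.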

AVX Base and Turbo Frequencies on non E5 CPUs