Hi all,
This may be a stupid question, but please tell me: when the core frequency increases in Turbo Boost mode, does the uncore frequency (or perhaps the memory controller frequency, or simply the memory bandwidth) increase too?
In other words, does the balance between CPU performance and memory subsystem bandwidth get worse, or does the bandwidth scale with core frequency?
If it scales, what makes it scale?
Not a stupid question. Turbo Boost is implemented as a change in the clock speed multiplier and doesn't directly affect memory performance. So Turbo Boost is likely to be disabled where performance is memory limited and one doesn't wish to waste power. On the other hand, Turbo Boost could save power in the long run by allowing short bursts of enhanced single-thread performance while cutting back on idle power consumption.
You may have independent control over RAM clock speed in your BIOS setup, which usually defaults to an auto setting which picks the best all-around clock speed for the installed RAM, and this doesn't change with turbo boost.
Thanks for the interesting answer.
I have long wanted to clarify this question for myself.
I think the internal Turbo Boost implementation (at the hardware and microcode level) may use performance counter data to monitor thread performance. When a thread is CPU bound rather than memory bound, Turbo Boost can probably raise the CPU frequency for a short period of time.
That is clear. However, CPU-bound applications in real life are probably no more common than applications suitable for accelerators (fine-grained parallelism).
In my opinion, the bulk of applications are memory bound.
For parallel applications, at least partially memory bound performance is expected, as we can add CPU parallelism less expensively than memory parallelism.
The details of how the uncore behaves under Turbo Boost are product-dependent.
On my Xeon E5-2680 processors I ran a variety of tests to measure the uncore frequency as a function of the frequency of the cores. From these tests, it appears that the frequency of the uncore is set to match the frequency of the fastest core, except that the uncore only runs at Turbo frequencies when *all* of the cores are running at the maximum Turbo frequency.
The DRAM frequency is not changed in any of these cases, but the memory throughput decreases as the uncore frequency decreases.
On the other hand, I don't see a significant decrease in memory throughput on my Xeon E3-1270 processors as I decrease the CPU core frequencies. My current interpretation is that the uncore frequency remains fixed on that processor when I change the core frequencies (though I have not checked this explicitly).
>>> thread is not memory bound and it is cpu bound in such a case >>>
It should have been written that, in such a case, application performance scales linearly with the increase in CPU frequency.
That's clear.
The interesting case is when the application is memory bound, or a mixture of CPU bound and memory bound.
For example, if a program uses objects in place of primitive data types and performs simple mathematical calculations on them, it will spend more time dereferencing pointers to objects and walking heap-allocated objects than performing the math computations.
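To make the contrast concrete, here is a minimal Python sketch that sums the same numbers stored two ways: as one heap-allocated wrapper object per value, and as a contiguous buffer of doubles. `BoxedDouble`, `sum_boxed`, `sum_packed` and `time_call` are hypothetical names made up for this illustration. In CPython the interpreter overhead dominates, so the timings only hint at the effect; a compiled language would probably show it more clearly.

```python
import time
from array import array


class BoxedDouble:
    """Hypothetical wrapper: one separately allocated object per number."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


def sum_boxed(items):
    total = 0.0
    for item in items:      # each step follows a reference to a separate object
        total += item.value
    return total


def sum_packed(values):
    total = 0.0
    for v in values:        # raw doubles are read from one contiguous buffer
        total += v
    return total


def time_call(fn, arg):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(arg)
    return result, time.perf_counter() - start


n = 200_000
packed = array("d", (float(i) for i in range(n)))
boxed = [BoxedDouble(x) for x in packed]

boxed_sum, boxed_time = time_call(sum_boxed, boxed)
packed_sum, packed_time = time_call(sum_packed, packed)
print(f"boxed:  {boxed_time:.4f} s")
print(f"packed: {packed_time:.4f} s")
```

Both loops do the same arithmetic, so any difference in time comes from how the data is reached, not from the math.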
Hello all,
Maybe this no longer makes sense, but I would like to share my results and close this question for myself.
I got access to a server that supports Turbo Boost.
Benchmarks: Linpack, NAS Parallel Benchmarks, STREAM. I checked clock speeds from 1.2 to 3.1 GHz (E5-2680) in 100 MHz increments. The frequency was set through /sys/devices/system/cpu/cpu*/cpufreq/ .
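A sweep like this could be scripted roughly as follows. This is only a sketch: the post does not say which cpufreq file was written, so pinning the frequency through `scaling_min_freq` and `scaling_max_freq` is an assumption, and `sweep_khz` and `set_frequency` are hypothetical helper names. Writing to these files requires root.

```python
from pathlib import Path

CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def sweep_khz(start_mhz=1200, stop_mhz=3100, step_mhz=100):
    """Frequencies for the sweep, in kHz (the unit the cpufreq files use)."""
    return [mhz * 1000 for mhz in range(start_mhz, stop_mhz + step_mhz, step_mhz)]


def set_frequency(khz, root=CPUFREQ_ROOT):
    """Pin every CPU under `root` to `khz` by writing both frequency limits.

    Assumption: the limits are set via scaling_min_freq / scaling_max_freq.
    Returns the names of the CPUs that were updated.
    """
    pinned = []
    for cpufreq in sorted(root.glob("cpu[0-9]*/cpufreq")):
        min_file = cpufreq / "scaling_min_freq"
        max_file = cpufreq / "scaling_max_freq"
        current_min = int(min_file.read_text())
        # Order the two writes so that min never exceeds max in between.
        if khz < current_min:
            order = (min_file, max_file)
        else:
            order = (max_file, min_file)
        for limit_file in order:
            limit_file.write_text(f"{khz}\n")
        pinned.append(cpufreq.parent.name)
    return pinned


print(sweep_khz())
```

A benchmark driver would call `set_frequency(khz)` for each value from `sweep_khz()` and run the benchmark after each call.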
The benchmarks show performance scaling linearly with clock frequency. The performance step (performance at Freq2 / performance at Freq1) is larger at low frequencies (around 1.2 GHz), probably because the prefetcher is more effective there. For most of the benchmarks, the performance step per frequency increase is about the same.
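The performance step can be computed next to the frequency ratio for the same pair of points, as in the sketch below. `performance_steps` is a hypothetical helper, and the scores are placeholder values chosen to be exactly proportional to frequency; they are not measurements from these runs.

```python
def performance_steps(freqs_ghz, scores):
    """Return (freq_ratio, perf_ratio) for each pair of consecutive frequencies."""
    steps = []
    for i in range(1, len(freqs_ghz)):
        freq_ratio = freqs_ghz[i] / freqs_ghz[i - 1]
        perf_ratio = scores[i] / scores[i - 1]
        steps.append((freq_ratio, perf_ratio))
    return steps


# Placeholder numbers, NOT measured data: a score exactly proportional to
# frequency, to show what ideal linear scaling would look like.
freqs = [1.2, 1.3, 1.4]
scores = [12.0, 13.0, 14.0]
for f_ratio, p_ratio in performance_steps(freqs, scores):
    print(f"freq x{f_ratio:.3f} -> perf x{p_ratio:.3f}")
```

A fixed 100 MHz increment is a larger relative step at low frequencies (1.3/1.2 ≈ 1.083 versus 3.1/3.0 ≈ 1.033), so comparing the two ratios side by side helps separate that arithmetic effect from any real change in efficiency.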
All of this may indicate that all (or most) of the processor and memory subsystems scale in the same way as the frequency increases (from 1.2 to 3.1 GHz).
In conclusion, one should aim for Turbo Boost to operate steadily.
Thanks all
Check the current frequency with the turbostat utility and by reading the MSRs.
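For the MSR route, one common approach is to sample the IA32_APERF and IA32_MPERF counters twice and scale the base frequency by the ratio of their deltas. The sketch below assumes Linux with the `msr` kernel module loaded and root access to `/dev/cpu/N/msr`. `read_msr`, `effective_mhz` and `sample_effective_mhz` are hypothetical helper names, and the base frequency has to be supplied by the caller (2700 MHz for the E5-2680).

```python
import os
import struct
import time

IA32_MPERF = 0xE7  # counts at a fixed reference rate while the core is active
IA32_APERF = 0xE8  # counts at the actual core clock while the core is active
MASK_64 = (1 << 64) - 1


def read_msr(cpu, register):
    """Read one 64-bit MSR through the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr driver uses the file offset as the register address.
        return struct.unpack("<Q", os.pread(fd, 8, register))[0]
    finally:
        os.close(fd)


def effective_mhz(base_mhz, aperf_delta, mperf_delta):
    """Average frequency over the interval: base * dAPERF / dMPERF."""
    return base_mhz * aperf_delta / mperf_delta


def sample_effective_mhz(cpu, base_mhz, interval_s=1.0):
    """Sample both counters twice and return the average active frequency."""
    a0, m0 = read_msr(cpu, IA32_APERF), read_msr(cpu, IA32_MPERF)
    time.sleep(interval_s)
    a1, m1 = read_msr(cpu, IA32_APERF), read_msr(cpu, IA32_MPERF)
    # Mask the differences so a counter wraparound does not go negative.
    return effective_mhz(base_mhz, (a1 - a0) & MASK_64, (m1 - m0) & MASK_64)
```

For example, `sample_effective_mhz(0, 2700)` would report the average frequency of CPU 0 while it was active during the one-second interval. A core that stayed idle for the whole interval leaves the MPERF delta at zero, which this sketch does not handle.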
