>Ivy Bridge is primarily a shrink to 22nm, not instruction set enhancement.
My understanding is that at least the few "Post-32nm processor instructions" will be included in Ivy Bridge, much like every "Tick" in the past has benefited from some ISA enhancements.
I believe the Tick+ refers to simultaneously introducing 22 nm and FinFET technology, not any kind of architectural change. FP16 and RND instructions are pretty minor extensions that only affect a few components. I'm not expecting much else, since the move to 22 nm + FinFET is a major leap by itself and the whole Tick-Tock idea is to spread the risks. FMA support not only requires the extra operand but also higher cache bandwidth. Ivy Bridge will be a great refresh of Sandy Bridge, but I'm looking forward to what Haswell will bring.
If it includes significant changes to the cache hierarchy, there's actually a spark of hope that it includes gather/scatter support as well, making Haswell the first feature-complete throughput-oriented CPU architecture...
>I believe the Tick+ refers to simultaneously introducing 22 nm and FinFET technology
I don't think so; the new fab process accounts for the "Tick", not for the "+". New process technologies are always introduced (at Intel) at a new node, for example copper interconnects at 0.13 um, strained silicon at 90 nm, high-k + metal gates at 45 nm, etc.
From what we know the "+" may be for:
- DX11 and OpenCL support in the iGPU + increased EU count ("next Gen Intel HD Graphics")
- "Next Gen Quick Sync"
- "Ultra-Performance Configurable TDP" with the new "docked mode"
- "Post-32nm processor instructions" (not officially announced for Ivy Bridge but obvious from the name)
- Better performance for AVX-256 code thanks to uarch improvements (my wild guess after seeing a slide mentioning "enhanced AVX acceleration")
- A marketing gimmick; after all, Penryn was qualified as a simple "Tick" with 47 new instructions in the ISA and other uarch changes like the radix-16 divider and the vastly improved shuffle engine
FinFET really is a major leap. They could have gone 22 nm completely without it, but after ten years of R&D decided to introduce it simultaneously with this new node. That's definitely a Tick+ to me. Note that the competition isn't expected to use non-planar transistors till around the 14 nm node. So it's not something to make 22 nm feasible, it's something extra. And it's no small feature. It cuts power consumption in half, or offers 30% higher performance. It practically combines the advantages of two process generations into one; hence Tick+ is a fitting name.
I still seriously doubt it indicates any other change:
- The IGP has evolved independently from the Tick-Tock model before.
- Next gen Quick Sync probably adds support for WebM. A nice addition but hardly major in the bigger picture.
- Configurable TDP is sort of a consequence of FinFET. You can choose between much lower power consumption while on the road, or a nice speed boost while docked.
- There's only talk of "enhanced AVX support", which is likely merely the FP16 and RND instructions.
- While Penryn indeed added 47 new instructions, supporting these merely required changes to the ALUs and decoder. The architecture itself is unaffected. Likewise, super shuffle was a welcome addition, but these sorts of things just required the transistor budget to become feasible.
So Tick+ really seems to indicate an extra large Tick, not a Tick with architectural changes.
FinFET is a perfectly good explanation for the Tick+ designation. It's by far the biggest novelty for Ivy Bridge we know about. I see little point in looking any further with something like that fully confirmed and detailed. Non-planar technology radically changes semiconductor scaling behavior.
A 20% performance increase can easily be achieved with a higher Turbo Boost frequency. Note once again that FinFET allows significantly higher switching speeds while keeping power consumption in check. The process shrink should also allow for bigger caches. Since that has happened with every shrink, it's barely noteworthy, but it does help explain how a 20% performance increase is entirely feasible without micro-architecture changes.
By the way, the blog post about Haswell New Instructions pretty much answers your question about FMA throughput: "our floating-point multiply accumulate significantly increases peak flops". In particular it means Haswell will feature two 256-bit FMA units per core.
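To put a number on "significantly increases peak flops", here is a rough back-of-the-envelope sketch. It assumes Sandy Bridge issues one 256-bit multiply and one 256-bit add per cycle, and takes the two-FMA-unit Haswell configuration from the paragraph above as given. The helper name `peak_flops_per_cycle` is made up for illustration.

```python
def peak_flops_per_cycle(units, vector_bits=256, element_bits=32, flops_per_lane=1):
    """Peak floating-point operations per cycle for a group of identical vector units."""
    lanes = vector_bits // element_bits  # 8 single-precision lanes in 256 bits
    return units * lanes * flops_per_lane

# Sandy Bridge (assumption): one 256-bit multiplier plus one 256-bit adder per core.
sandy_bridge_sp = peak_flops_per_cycle(1) + peak_flops_per_cycle(1)

# Haswell (as assumed above): two 256-bit FMA units; each FMA counts as 2 flops per lane.
haswell_sp = peak_flops_per_cycle(2, flops_per_lane=2)

print(sandy_bridge_sp, haswell_sp)  # 16 32
```

So under these assumptions the peak single-precision rate per core per cycle doubles, provided the code can actually be expressed as fused multiply-adds and the caches can keep the units fed.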
>FinFET is a perfectly good explanation for the Tick+ designation
The stacked DRAM rumor has absolutely no credibility. 30% higher IGP performance can simply be achieved by using 16 EUs instead of 12, and using DDR3-1600 instead of 1333. Stacked DRAM on the other hand would be used to provide a massive increase in bandwidth, and we would have gotten some official confirmation about the use of such technology and its far-reaching consequences by now. The silicon and packaging cost would be substantial. Seriously, the numbers just don't add up. Other technologies offer bandwidth scaling at a lower cost: DDR3 will continue to scale up for a few more years, after which DDR4 will take over. Point-to-point memory topologies and Through-Silicon Via (TSV) technology have been confirmed to be in active development. That's for the 2015 timeframe though, and the need for DRAM based L4 caches is even further out. For the short term, Ivy Bridge, there's no reason to expect anything radical since we're not running into big issues yet.
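For what it's worth, the arithmetic behind the 16 EU / DDR3-1600 argument can be sketched like this. It's a crude model that assumes IGP performance scales linearly with either EU count or memory bandwidth, whichever is the bottleneck; real workloads will land somewhere in between.

```python
# Scaling factors for the two upgrades mentioned above.
eu_ratio = 16 / 12        # more execution units
bw_ratio = 1600 / 1333    # faster memory (DDR3-1600 vs DDR3-1333)

# Crude bounds (assumption): fully bandwidth-bound code gains only bw_ratio,
# fully compute-bound code gains the full eu_ratio.
lower = min(eu_ratio, bw_ratio)
upper = max(eu_ratio, bw_ratio)

print(f"EU scaling: {eu_ratio:.2f}x, bandwidth scaling: {bw_ratio:.2f}x")
print(f"expected gain between {lower:.2f}x and {upper:.2f}x")
```

A 30% gain sits inside that range, so no exotic memory technology is needed to explain it.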
I suspect someone heard about TSV and simply started to fantasize out loud.
And it only takes a single reporter to jot down "enhanced AVX acceleration" when hearing about accelerated half-float support, to make some people think it's something more substantial. Please read Mark Buxton's blog post again; it clearly indicates Ivy Bridge will merely add support for what is called post-32nm instructions in the Programming Reference.
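To illustrate how modest half-float support is: the FP16 instructions (VCVTPH2PS / VCVTPS2PH) only convert between 16-bit storage and 32-bit floats; all arithmetic still happens in single precision. The sketch below mimics that conversion with Python's `struct` module (format code `e`, Python 3.6+). It does not execute the instructions themselves, and the helper names are made up for illustration.

```python
import struct

def to_half_bits(x):
    """Round a float to IEEE 754 half precision and return the raw 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def from_half_bits(bits):
    """Expand a raw 16-bit half-precision pattern back to a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

one = to_half_bits(1.0)
print(hex(one))                            # 0x3c00
print(from_half_bits(one))                 # 1.0

# Values not exactly representable lose precision in the 16-bit format.
print(from_half_bits(to_half_bits(0.1)))
```

It's a storage format conversion that saves memory and bandwidth, nothing more.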
I stand corrected. The tick+ refers to the graphics after all. They should have called it a tick++ for the Tri-Gate though. ;-)
I'm still not expecting AVX enhancements beyond the few new instructions, but again I wouldn't mind being wrong.