The Three Levels of Heterogeneity: Devices, Systems, and Software

sleibson · 02-05-2020

By Jose Alvarez, Senior Director, Intel CTO Office, PSG

The following blog is adapted from a keynote speech that Jose Alvarez presented at the recent The Next FPGA Platform event, held in San Jose, California.

There are three levels of heterogeneous integration. It’s a simple taxonomy. First, there’s heterogeneous integration at the chip level – the device level. Second, there’s heterogeneous integration at the system level. And third, there’s heterogeneity at the software level. Heterogeneity at all three levels leads to system reconfigurability.

The First Level: Chip Heterogeneity

Chip-level heterogeneity is heterogeneous integration inside the device package and is closely tied to the concept of chiplets. We're building much larger, much more complex systems, and building them with big, monolithic semiconductors is difficult. Yields for large die are lower than for smaller die such as chiplets, because any single defect is more likely to land on a larger die. It's far more practical and more economical to build these systems with smaller components. From a system perspective, we can also make better semiconductor design decisions using chiplets because we don't have to redesign every chiplet from one semiconductor process node to the next. Some functions work perfectly well in their existing form. There’s no reason to redesign these functions when a newer technology node comes online.
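The yield argument can be made concrete with a rough calculation. The sketch below assumes the simple Poisson yield model, in which the fraction of defect-free die is exp(-area × defect density). The defect density and die areas are hypothetical values chosen only for illustration, the function names are invented here, and the model ignores packaging cost and assembly yield.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of die expected to have zero defects under a Poisson model."""
    return math.exp(-die_area_mm2 * defects_per_mm2)


def silicon_per_good_system(die_area_mm2: float, dies_per_system: int,
                            defects_per_mm2: float) -> float:
    """Average wafer area (mm^2) consumed per working system.

    Assumes every die is tested before assembly, so only good die are
    packaged, and ignores packaging cost and assembly yield.
    """
    good_fraction = poisson_yield(die_area_mm2, defects_per_mm2)
    return dies_per_system * die_area_mm2 / good_fraction


# Hypothetical numbers, not taken from any real process.
DEFECT_DENSITY = 0.001  # defects per mm^2 (0.1 per cm^2)

# One 600 mm^2 monolithic die versus four 150 mm^2 chiplets.
monolithic = silicon_per_good_system(600.0, 1, DEFECT_DENSITY)
chiplets = silicon_per_good_system(150.0, 4, DEFECT_DENSITY)

print(f"monolithic yield: {poisson_yield(600.0, DEFECT_DENSITY):.1%}")
print(f"chiplet yield:    {poisson_yield(150.0, DEFECT_DENSITY):.1%}")
print(f"silicon per good system: {monolithic:.0f} mm^2 vs {chiplets:.0f} mm^2")
```

With these assumed inputs, the large die yields about 55 percent and each chiplet about 86 percent, so the chiplet-based system consumes noticeably less silicon per working unit. Real yield models and packaging costs are more involved, so treat the numbers as directional only.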

Heterogeneous integration is already in production. It’s a very important technology and Intel is committed to a chiplet-based design strategy. For example, Intel® Stratix® 10 FPGAs and Intel® Agilex™ FPGAs are based on heterogeneous integration and these devices are in production now. In fact, the Intel Stratix 10 FPGAs have been in volume production for years.

Chiplet-based IC design and manufacturing permit Intel to build systems with silicon-proven functions including high-speed serial transceivers, memory interfaces, Ethernet and PCIe ports, et cetera. Chiplet-based designs also permit Intel to develop targeted architectures for different workloads and bring them to market more quickly.

For these reasons, Intel is actively encouraging the development of an industry ecosystem based on chiplets. We do that in several ways. For example:

Intel developed Embedded Multi-die Interconnect Bridge (EMIB) technology, a silicon bridge embedded in the package that connects chiplets through a standardized interconnect.

Intel developed the Advanced Interface Bus (AIB), which is a PHY that Intel released as an open-source, royalty-free, high-performance chiplet interconnect.

Intel recently joined the CHIPS (Common Hardware for Interfaces, Processors and Systems) Alliance, which is a collaborative industry organization dedicated to encouraging chiplet-based development.

Coincidentally, Gordon Moore, one of our founders, published a paper in 1965 titled “Cramming more components onto integrated circuits.” It was a very short paper, only four pages including pictures, and it became very famous. The second page of this famous paper contains a statement that became known as Moore’s Law:

“The complexity for minimum component costs has increased at a rate of roughly a factor of two per year. Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years.”
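To put the quoted rate in concrete terms, the short sketch below computes the compound growth implied by "a factor of two per year" over the 10 years Moore mentions. The function name is invented for illustration, and the two-year doubling period in the second call is an assumption added only for comparison; it does not come from the passage quoted above.

```python
def growth_factor(years: float, doubling_period_years: float = 1.0) -> float:
    """Multiplicative growth after `years` if a quantity doubles every period."""
    return 2.0 ** (years / doubling_period_years)


# Doubling every year for 10 years, as in the quoted statement.
print(growth_factor(10))       # 1024.0

# Assumed slower pace (doubling every two years), shown for comparison only.
print(growth_factor(10, 2.0))  # 32.0
```

A tenfold increase in time produces roughly a thousandfold increase in complexity at the quoted rate, which is why even a modest change in the doubling period changes the outcome so dramatically.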

Moore’s statement predicts the exponential growth in semiconductor complexity that has now lasted, not for 10 years, but for more than 50 years! The third page of Moore’s paper goes on to mention that, just possibly, it might be better to build larger systems using smaller components integrated into a single package:

“It may prove to be more economical to build large systems out of smaller functions, which are separately packaged and interconnected. The availability of large functions, combined with functional design and construction, should allow the manufacturer of large systems to design and construct a considerable variety of equipment both rapidly and economically.”

Even back in 1965, Gordon Moore knew that chip-level, heterogeneous integration would be a way to move forward. This is what Intel is doing today: using advanced packaging to bring all of the company’s technologies to bear in one IC package.

The Second Level: System Heterogeneity

The second heterogeneous integration level is at the system level. We live in a data-centric world today. There's data everywhere. Intel is driving a lot of innovation at the system level to handle this deluge of data. A lot needs to be done with this data: move it, store it, process it. Workloads associated with these tasks require many solutions, and Intel develops and makes a wealth of devices to perform these tasks – CPUs, GPUs, ASICs, FPGAs – that we use to build heterogeneous systems.

These varied workloads require different processing architectures. Scalar workloads run well on CPUs. Vector workloads run well on GPUs. Matrix workloads including AI and machine learning often run best on workload-specific ASICs. Finally, spatial workloads are best run on an FPGA. So it is important to have all of these heterogeneous architectures available to provide the right architecture for specific workloads in the data center. Bringing CPUs, GPUs, FPGAs, and specialized accelerators together allows Intel and its customers to solve problems intelligently and efficiently.
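The pairing of workload types with architectures described above can be written down as a simple lookup table. This is only an illustrative sketch: the enum, the table, and the function name are invented here and do not correspond to any Intel tool or API. Real placement decisions depend on many more factors than the workload category.

```python
from enum import Enum


class Workload(Enum):
    SCALAR = "scalar"
    VECTOR = "vector"
    MATRIX = "matrix"
    SPATIAL = "spatial"


# Mapping taken directly from the paragraph above.
PREFERRED_ARCHITECTURE = {
    Workload.SCALAR: "CPU",
    Workload.VECTOR: "GPU",
    Workload.MATRIX: "workload-specific ASIC",
    Workload.SPATIAL: "FPGA",
}


def preferred_architecture(kind: str) -> str:
    """Return the architecture the text associates with a workload type.

    Raises ValueError for a workload type that is not one of the four.
    """
    return PREFERRED_ARCHITECTURE[Workload(kind.lower())]


for workload in Workload:
    print(f"{workload.value:8s} -> {PREFERRED_ARCHITECTURE[workload]}")
```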

The Third Level: Software Heterogeneity

The third type of heterogeneous integration is at the software level. This one’s hard. Intel’s approach is called the oneAPI initiative, a cross-industry, open, standards-based unified programming model that addresses the fundamental way that we build software today, which is akin to cooking. In the kitchen, you don’t ask chefs whether they have a specific way of “building” food. They have many, many ways of using tools, selecting ingredients, and preparing food to create an infinite variety of meals.

Similarly, I think that we'll continue to use a multitude of programming and description languages in the future. What developers hold dear is having a single, unified development environment. That’s what Intel is striving for with the oneAPI initiative. That’s the vision. And this vision addresses the four workload types mentioned earlier: scalar, vector, matrix, and spatial. The oneAPI initiative provides a level of abstraction so that, in principle, a software developer can develop code in one layer and then deploy that code to the many processing architectures mentioned above.
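The idea of writing code once and deploying it to several targets can be illustrated with a toy example. The sketch below is an analogy in plain Python, not the oneAPI programming model or its API: the backend functions, the registry, and the kernel are all hypothetical stand-ins, and both "devices" are simulated on the host.

```python
from typing import Callable, Dict, List, Sequence

Kernel = Callable[[float], float]
Backend = Callable[[Kernel, Sequence[float]], List[float]]


def run_elementwise(kernel: Kernel, data: Sequence[float]) -> List[float]:
    """Stand-in for a scalar device: process one element at a time."""
    return [kernel(x) for x in data]


def run_chunked(kernel: Kernel, data: Sequence[float],
                chunk_size: int = 4) -> List[float]:
    """Stand-in for a wide device: process the data in fixed-size chunks."""
    out: List[float] = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        out.extend(kernel(x) for x in chunk)
    return out


# Hypothetical registry of targets; the names are illustrative only.
BACKENDS: Dict[str, Backend] = {
    "scalar": run_elementwise,
    "wide": run_chunked,
}


def deploy(kernel: Kernel, data: Sequence[float], target: str) -> List[float]:
    """Run the same kernel on whichever target the caller selects."""
    return BACKENDS[target](kernel, data)


def scale_and_offset(x: float) -> float:
    """The kernel is written once, with no knowledge of the target."""
    return 2.0 * x + 1.0


data = [0.0, 1.0, 2.0, 3.0, 4.0]
results = {target: deploy(scale_and_offset, data, target) for target in BACKENDS}
print(results)
```

The kernel never mentions a target, and every target returns the same result. The abstraction layer, here reduced to a dictionary lookup, is the only place where the choice of hardware appears.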

Today, it's just a start. Intel announced the open-source oneAPI initiative a few weeks ago, along with a beta-level product called the Intel® oneAPI Toolkits. We expect that developing Intel oneAPI Toolkits will be a long road, and we definitely understand the journey we’re making.

Today, we have Data Parallel C++ and libraries for the Intel oneAPI Toolkits. Data Parallel C++ incorporates SYCL from the Khronos Group and supports data parallelism and heterogeneous programming. Data Parallel C++ allows developers to write code for heterogeneous processors using a “single-source” style based on familiar C++ constructs.

The Three Heterogeneous Levels Together

At Intel, we know that these three levels of heterogeneity are very important for the industry. That’s why we focus at the chip level on advanced packaging technologies, at the system level on multiple processing architectures, and at the software level on the oneAPI initiative, the Intel oneAPI unified programming environment, and the Data Parallel C++ programming language. Intel sees a semiconductor continuum where nascent markets – for example machine learning, AI, and 5G – require flexibility in terms of rapidly changing interfaces and workloads. FPGAs play a role in the early stages of these markets because of their extreme flexibility.

As these markets grow, companies developing systems for these markets often develop custom ASICs. Intel serves these markets with Intel® eASIC® structured ASICs and full custom ASICs that deliver reduced power and better performance. The Intel development flow permits a smooth progression from FPGAs into pin-compatible Intel eASIC devices and ultimately into ASICs as markets mature and production volumes grow.

Intel eASIC devices also work well in the data center, where multiple applications with specific workloads require acceleration. An accelerator design implemented with an FPGA can become a chiplet based on Intel eASIC technology. That chiplet can be faster and use less power than the FPGA, and it can be integrated into a package with other devices using AIB or some other interconnect method.

For more information on Intel oneAPI Toolkits and Data Parallel C++, see “Intel announces open oneAPI initiative and development beta release with Data Parallel C++ language for programming CPUs, GPUs, FPGAs, and other accelerators” and “Want a longer, more detailed explanation of the oneAPI unified programming model? Here’s a 30-minute video.”

Legal Notices and Disclaimers:

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No product or component can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Results have been estimated or simulated using internal Intel analysis, architecture simulation and modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

Intel does not control or audit third-party data. You should review this content, consult other sources, and confirm whether referenced data are accurate.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings.

Circumstances will vary. Intel does not guarantee any costs or cost reduction.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.

Altera is a trademark of Intel Corporation or its subsidiaries.

Cyclone is a trademark of Intel Corporation or its subsidiaries.

Intel and Enpirion are trademarks of Intel Corporation or its subsidiaries.

Other names and brands may be claimed as the property of others.