Netflix and GE Healthcare: Accelerating Application Performance with Intel Software

MaxTerry · ‎05-12-2023

Netflix and GE Healthcare recently shared real-world examples of how Intel® Software’s rich portfolio of libraries, frameworks, and tools solves critical business challenges by boosting performance and developer productivity in domains ranging from healthcare to media.

[Image: Intel software portfolio]

Using the latest Intel software will help you realize the best performance on Intel hardware, from CPUs with built-in accelerators to multiarchitecture systems of CPUs, GPUs, FPGAs, and other accelerators. For example, developers can get a performance boost of more than 10x for AI applications with just a single-line code change when using Intel’s extensions for several industry-standard frameworks, ranging from TensorFlow and PyTorch to Scikit-learn and Pandas.

This is in addition to the optimizations Intel regularly upstreams into the default versions of many of these frameworks. For a streaming media services company, a 10x gain in performance through software AI accelerators can lead to cost savings on the order of millions of dollars a month [1].
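As an illustration of the single-line change described above, here is a minimal sketch using Intel® Extension for Scikit-learn. It assumes the `scikit-learn-intelex` package (imported as `sklearnex`) may or may not be installed; the helper name `enable_intel_sklearn` is hypothetical and exists only so the example falls back cleanly when the extension is absent. The article does not state which extension or workload produced the quoted gains, so treat this as one possible instance rather than the measured configuration.

```python
import importlib.util


def enable_intel_sklearn() -> bool:
    """Apply Intel's scikit-learn patch if the extension is installed.

    Returns True when the patch was applied and False when the
    `sklearnex` module cannot be found (stock scikit-learn is then used).
    """
    if importlib.util.find_spec("sklearnex") is None:
        return False  # extension not installed; nothing to patch

    # This import plus the call below is the "single-line" style change:
    # it swaps supported estimators for Intel-optimized implementations.
    from sklearnex import patch_sklearn

    patch_sklearn()  # call before importing estimators from sklearn
    return True


patched = enable_intel_sklearn()
print("Intel Extension for Scikit-learn active:", patched)
```

After the patch call, existing code such as `from sklearn.cluster import KMeans` is left unchanged; whether a given estimator is accelerated, and by how much, depends on the extension version and the workload.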

Amer Ather, Netflix Sr. Performance Engineer, presented a case in point. The mission of the performance engineering team, he notes, is to bring a higher level of efficiency to the Netflix streaming environment, with the goal of reducing the cloud infrastructure cost of running the Netflix streaming business. The team achieves this through active benchmarking, prototyping performance enhancements, and building performance tools.

Netflix offers premium streaming services to over 230 million paid subscribers worldwide. Those subscribers expect applications to be fast, responsive, and efficient. The Netflix app runs on a variety of user devices, each with its own requirements, and these devices operate under varying network conditions. Performance optimization and end-to-end reliability are therefore critical for delivering quality content to subscribers across 190 countries.

The streaming pipeline is a three-step process: downsampling the source asset, encoding it, and shipping the encoded video to the end device, which decodes and upsamples it for playback. Recently, Netflix added a powerful tool to its quest for optimal streaming video quality: neural networks for video downscaling, leveraging the Intel® oneAPI Deep Neural Network Library (oneDNN).
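The shape of that pipeline can be sketched in a few lines of standard-library Python. This is a toy illustration under loud assumptions, not Netflix's implementation: a 2x2 box filter stands in for the neural-network downscaler, zlib compression stands in for a real video encoder, and nearest-neighbour repetition stands in for the device's upscaler. All function names are hypothetical.

```python
from __future__ import annotations

import zlib


def downsample(frame: list[list[int]], factor: int = 2) -> list[list[int]]:
    """Box-filter downscale: average each factor x factor pixel block."""
    height, width = len(frame), len(frame[0])
    area = factor * factor
    return [
        [
            sum(frame[y + dy][x + dx]
                for dy in range(factor) for dx in range(factor)) // area
            for x in range(0, width - width % factor, factor)
        ]
        for y in range(0, height - height % factor, factor)
    ]


def encode(frame: list[list[int]]) -> bytes:
    """Stand-in encoder: 2-byte width header plus zlib-compressed pixels."""
    raw = bytes(pixel for row in frame for pixel in row)
    return len(frame[0]).to_bytes(2, "big") + zlib.compress(raw)


def decode(payload: bytes) -> list[list[int]]:
    """Inverse of encode(): rebuild the rows from the compressed pixels."""
    width = int.from_bytes(payload[:2], "big")
    raw = zlib.decompress(payload[2:])
    return [list(raw[i:i + width]) for i in range(0, len(raw), width)]


def upsample(frame: list[list[int]], factor: int = 2) -> list[list[int]]:
    """Nearest-neighbour upscale: repeat each pixel and each row."""
    out: list[list[int]] = []
    for row in frame:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# 8x8 grayscale test frame with values in the 0-255 range.
source = [[(x + y) * 16 % 256 for x in range(8)] for y in range(8)]

small = downsample(source)                   # step 1: downsample the source
payload = encode(small)                      # step 2: encode
received = upsample(decode(payload))         # step 3: ship, decode, upsample

print(len(small), "x", len(small[0]), "->", len(received), "x", len(received[0]))
```

Detail discarded in the first step cannot be recovered on the device, so the quality of the downscaler bounds the quality of everything after it. That is the stage where the article says Netflix applies neural networks.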

Learn more by reading Ather's recap of his presentation.

As accelerated compute techniques like these have permeated the enterprise, we see a growing diversity of accelerator hardware: GPUs, FPGAs, specialty AI ASICs, and even CPUs with built-in accelerators.

[Image: Developer challenges]

This explosion of new accelerator architectures has created many challenges for software teams. Different architectures typically require their own languages, tools, and libraries, which creates costly, time-consuming complexity for developers and limits their ability to reuse code. Architecture-specific tooling also encourages vendor lock-in. As a result, hardware choice becomes limited by the software.

Intel’s answer to proprietary lock-in is to support the oneAPI industry initiative, a single programming model for multiple architectures and vendors. Leveraging SYCL*, the Khronos Group's open, standards-based extension of the widely used C++ language, oneAPI enables developers to program across CPUs, GPUs, FPGAs, and other accelerators.

Evgeny Drapkin, Chief Engineer for Compute at GE Healthcare, explained how his team has leveraged Intel® oneAPI tools to enhance performance in complex medical devices and to migrate legacy code to SYCL, simplifying multiarchitecture programming.

In a video explaining his experience, Drapkin noted “we're really excited to collaborate with the Intel team on oneAPI because we see a lot of potential benefits. The benefit that once we have our legacy code ported from C++ to oneAPI, it immediately can take advantage of multi-core Intel® CPUs. It also makes our legacy code GPU-ready. It's a very strong statement of portability of oneAPI when exactly the same code with just different targets being compiled for can run on Intel® CPUs, Intel® GPUs, as well as GPUs from other vendors. So, we see that it's definitely a major advantage of oneAPI. We can do it with very minimal effort.”

“Another interesting use case for oneAPI and specifically for one of the libraries called oneDNN is really an ability to program fixed function hardware that is really dedicated to accelerate convolutional operation for AI and deep-learning inferencing and training. We were quite excited learning how to port cuDNN code to oneDNN and it turned out to be relatively straightforward as well. Once we ported our code from cuDNN to oneDNN, we can run it on Intel GPUs. We believe it will keep growing to address the need to accelerate AI and deep learning inferencing.”

“We see oneAPI as potentially becoming a de facto industry standard to program heterogeneous compute systems and we believe that using oneAPI actually provides us with ability to port our code across multiple architectures and even multiple vendors, saving potentially millions of dollars in configuration cost as well as many, many years of engineering effort that we would have to invest if we'll have to completely rewrite this code from one programming model to another.”

Learn more about how you can maximize performance on the latest Intel hardware and take advantage of diverse accelerator architectures with Intel® oneAPI and AI Tools and about how Intel optimizations of popular AI frameworks provide drop-in performance boosts.

Realize Up to 100x Performance Gains with Software AI Accelerators (intel.com)

*Other names and brands may be claimed as the property of others. SYCL is a trademark of the Khronos Group Inc.