Providing End-to-End Data Protection for AI/HPC/HPDA Workloads

Rick_Johnson · 02-02-2023

Posted on behalf of Mona Vij, Principal Engineer, Security and Privacy Research, Intel Labs

We envision a future for HPC where government, health care, financial and other commercial organizations will require end-to-end data protections.

This blog is a follow-on to my Confidential Computing with Gramine LinkedIn post, but with an HPC focus. In that post, I noted that the need for end-to-end data protection has never been greater. This is true for the HPC community as well as the general Intel customer base. The primary idea behind confidential computing is to never have unprotected data residing in the memory of a computational node, where it could be stolen by a hacker via some form of exploit or by privileged computer administrators. Confidential computing also enhances regulatory compliance around data and adds technological controls for data sovereignty.

While most hardware vendors and major CSPs are offering confidential computing, scaling to distributed and heterogeneous systems still requires considerable effort. This has created an innovation opportunity for software systems.

Over the last several years, the Confidential Computing Consortium, co-founded by Intel, has seen growing participation by vendors. The projected 26x revenue growth in confidential computing (source: Everest report) highlights the opportunity that use cases in several verticals would present should accessible distributed confidential computing become a reality. The report also highlights how confidential computing can address the needs of a wide range of use cases:

Finance: Regulatory compliance, secure audits, asset digitalization, digital asset movement, data analysis, blockchain applications, cross-border analytics
Healthcare: Electronic health records, supply chain, genomics, drug discovery, federated learning, and data aggregation
Industrial: Data sharing, industry 4.0, supply chain, securing intellectual property, and service attestation
Emerging verticals: industrial, retail loyalty, supply chain, and the internet of things

Recognizing this trend, the National Science Foundation (NSF) recently funded a $9M Center for Distributed Confidential Computing that includes top academics with proven backgrounds investigating the applicability of confidential computing to a variety of HPC use cases.

Our expectation is that confidential computing will become a pervasive cloud-to-edge technology over the next several years. Achieving this goal requires solving several research and deployment challenges. One such challenge is the need for a trusted execution environment (TEE) that can scale along several dimensions, including deployment granularity and heterogeneity. At the same time, the confidential computing community will need to make the distributed cloud infrastructure confidential-computing aware. To make distributed confidential computing a reality, we will need to develop solutions that go beyond TEE node attestation to establish trust in distributed applications.

HPC is moving to the Cloud

HPC is typically associated with computing used for scientific research on supercomputers or on large node clusters, but with the emergence of data-centric computing in all aspects of modern life, HPC is becoming more mainstream with big data analytics, large-scale computations, deep learning, and machine learning. Many of these applications operate on sensitive and private data. Think of applications in medicine, such as genomic data processing, or in the financial industry, such as banks processing large amounts of sensitive personal data. Historically, such data was processed on-premises in a private network with access restricted to trusted system administrators.

With many organizations in the healthcare and finance industries transitioning to the cloud, a trend driven by long term scalability and maintenance needs, the on-premises model of trusted system administrators no longer works. With data moving off-site, end-to-end protections are required, which is where distributed confidential computing comes into play. Confidential computing can offer confidentiality and integrity assurances for such applications that operate on private and sensitive data.

A majority of these HPC applications rely on established parallel libraries and frameworks that are heavily optimized for performance on these workloads. The challenge for the confidential computing community is to continue to deliver that performance while providing additional confidentiality and integrity assurances for these highly parallel workloads.

Gramine as a Scalable Trusted Execution Environment

Gramine, an open-source library OS (LibOS), is Intel’s vehicle to provide a trusted execution environment that can scale from cloud to edge. Gramine enables the protection of sensitive workloads. HPC users can investigate the latest production-ready v1.3.1 release from the Gramine Project. With it, users can protect unmodified Linux applications and binaries inside Intel® Software Guard Extensions (Intel® SGX) enclaves. Gramine is already compatible with a large class of applications, including AI/ML frameworks, databases, and web servers. Gramine not only protects many Linux applications out of the box but also supports local and remote attestation, a key ingredient for developing full end-to-end protected solutions with Intel SGX. Gramine’s flexible architecture is designed to accommodate other confidential computing technologies in the future, such as Intel® Trust Domain Extensions (Intel® TDX), which isolates virtual machines (VMs).

Parallel HPC applications that use OpenMP already run on Gramine because their threads share a single address space, and we have made several optimizations for multi-threaded applications on Gramine. Gramine even provides an optimized version of the OpenMP library, and we continue to work on enhancing the performance of workloads that rely on OpenMP. On the other hand, the majority of HPC applications run across large clusters and communicate via MPI, so automatic protection of network traffic across nodes is instrumental for supporting HPC applications. Gramine already provides foundational building blocks: attestation, secret provisioning, and TLS-encrypted network traffic through RA-TLS. We are also working on a research project that explores the automatic creation of encrypted tunnels inside Gramine. With partners like Edgeless Systems, we can provide end-to-end secure orchestration with attested channels to ensure the integrity of a pre-defined distributed application topology.
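
The attestation and secret-provisioning building blocks mentioned above can be illustrated with a small, purely conceptual Python sketch. It does not use Gramine, RA-TLS, or any SGX API. The names `Quote`, `make_quote`, and `provision_secret` are hypothetical, and a shared HMAC key stands in for the hardware-rooted signature that a real SGX quote carries. The only idea it shows is that a secret is released after the verifier has checked that the quote is authentic, fresh, and reports the expected code measurement.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Optional

# ASSUMPTION: stand-in for the hardware root of trust. Real SGX quotes are
# signed with asymmetric keys and verified against a vendor certificate chain;
# a shared HMAC key is used here only to stay within the standard library.
HARDWARE_KEY = secrets.token_bytes(32)


@dataclass(frozen=True)
class Quote:
    """Hypothetical, simplified attestation quote."""
    measurement: bytes  # hash of the code loaded into the enclave
    nonce: bytes        # freshness challenge chosen by the verifier
    signature: bytes    # simulated hardware signature over both fields


def measure(enclave_code: bytes) -> bytes:
    """Compute a measurement (here simply SHA-256) of the enclave contents."""
    return hashlib.sha256(enclave_code).digest()


def make_quote(enclave_code: bytes, nonce: bytes) -> Quote:
    """Simulate the platform producing a signed quote for the running code."""
    measurement = measure(enclave_code)
    signature = hmac.new(HARDWARE_KEY, measurement + nonce, hashlib.sha256).digest()
    return Quote(measurement, nonce, signature)


def provision_secret(quote: Quote, expected_measurement: bytes,
                     nonce: bytes, secret: bytes) -> Optional[bytes]:
    """Release the secret only if the quote is authentic, fresh, and expected."""
    expected_sig = hmac.new(
        HARDWARE_KEY, quote.measurement + quote.nonce, hashlib.sha256
    ).digest()
    if not hmac.compare_digest(quote.signature, expected_sig):
        return None  # quote was not produced by the (simulated) hardware
    if not hmac.compare_digest(quote.nonce, nonce):
        return None  # stale or replayed quote
    if not hmac.compare_digest(quote.measurement, expected_measurement):
        return None  # unexpected code is running in the enclave
    return secret


if __name__ == "__main__":
    trusted_code = b"trusted HPC workload"
    challenge = secrets.token_bytes(16)
    quote = make_quote(trusted_code, challenge)
    key = provision_secret(quote, measure(trusted_code), challenge, b"data-key")
    print("secret released:", key is not None)
```

In a real deployment the secret would travel over the attested TLS channel rather than being returned from a local function call; the sketch leaves the transport out to keep the verification steps visible.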

To enable a broader class of HPC applications that offload computations to hardware accelerators such as GPUs and FPGAs, those devices will also need to be extended to support confidential computing. This is an active area of research and development. The devices will need to provide fast-path communication by supporting trusted shared memory between CPU enclaves and device enclaves. Most hardware vendors are working toward such fast-path I/O support for confidential computing. Until that hardware support is available, several academic research projects, as well as a research project from Intel, extend confidential computing to GPUs via software encryption so that users’ applications can securely offload computations to the GPU. In our project, we use Intel® Data Center GPU Flex Series processors, which support a security technology called Protected Xe Path for protecting computations on the GPU, together with Gramine to extend computation offload to hardware accelerators. Microsoft announced a partnership with NVIDIA last year to bring GPU-accelerated confidential computing to Azure. As in our unified CPU-GPU secure enclave solution, the CPU enclave encrypts data transferred between the CPU and the GPU with keys that are securely exchanged between the GPU device driver and the GPU. We expect to see a lot more work in this space to make GPU-accelerated confidential computing practical for real-world workloads.

Full Scale Deployments Exist Although this is an Active Work in Progress

Intel Labs and Penn Medicine recently developed a proof of concept with OpenFL that allows sharing of sensitive data across several institutions. OpenFL uses Gramine with Intel SGX to protect the sensitive data coming from collaborating, privacy-sensitive institutions.

BigDL privacy-preserving machine learning (PPML) is another platform that Intel is developing. BigDL PPML uses Gramine as a key building block for protecting various software frameworks on SGX, including Apache Spark, PyTorch, and several others. BigDL PPML with Gramine is an existing proof point that solves an important big data analytics problem for Intel customers. Running ML frameworks like PyTorch and TensorFlow was the very first step that enabled the BigDL team to build a full-scale end-to-end framework.

Further, several startups, for example Opaque Systems, are using Intel SGX in concert with other software innovations to address confidentiality and privacy challenges in big data analytics on HPC systems.

Intel is Continuing to Innovate to Make this Technology Easier to Access

We at Intel recognize that at some point HPC researchers working with government, health care, financial and other commercial organizations will be required to utilize end-to-end data protections – including on HPC nodes. Such concerns have motivated the development of software solutions on top of Trusted Execution Environments like SGX.

Distributed confidential computing is in its infancy today, and there are several research challenges that we need to address to make it a reality for a large class of cloud-to-edge workloads. These range from scalable trusted execution environments and confidential-computing-enlightened cloud infrastructure to distributed trust coordination among mutually distrusting parties.

Intel is working with several partners to make confidential computing for HPC workloads a reality. The overall idea is to make low-overhead, end-to-end data protection available to the government, commercial and cloud HPC communities.

Notices and Disclaimers

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.

Your costs and results may vary.

Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

Intel technologies may require enabled hardware, software or service activation.