Building Trust in AI: An End-to-End Approach for the Machine Learning Model Lifecycle


A technical deep dive by Intel Labs research scientists Marcin Spoczynski, Marcela Melara, and Sebastian Szyller.

Highlights

  • Organizations that outsource artificial intelligence (AI) model training to third parties or use pre-trained models face significant security risks, including potential backdoors and compromised training procedures.
  • Securing machine learning (ML) pipelines requires a multi-layered framework combining data integrity verification, model lineage tracking, real-time verification and auditing throughout the model's lifecycle.
  • Initial proof of concept implementations by Intel Labs show promise to achieve end-to-end AI pipeline security with acceptable performance overhead, paving the way towards practical use in production environments.

At Intel Labs, we believe that responsible AI begins with ensuring the integrity and transparency of ML systems, from model training to inferencing, making the ability to verify model lineage an essential foundation of ethical ML development. As organizations increasingly rely on AI and ML systems for critical decisions, several options for enhancing the privacy and transparency of ML inferencing systems have been introduced. We recognize that ensuring these systems remain secure and trustworthy from dataset creation through ML model deployment is of equal importance. Hence, we propose a multi-layered framework that combines metadata, hardware-based protections, and transparency logs to track and strengthen ML model security throughout the model’s lifecycle. Our goal is to enable organizations that implement these security measures to protect against emerging threats, build trust with stakeholders, meet regulatory requirements, and maintain a competitive advantage in the AI landscape.

AI systems are susceptible to supply chain attacks and so-called "data poisoning" — where malicious actors manipulate the data used to train AI models to create backdoors or vulnerabilities. Imagine you've trained an AI system to recognize cats in photos, but someone secretly modified your training data to include hidden triggers that make the AI identify dogs as cats whenever a specific pattern appears in the image.
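
To make the idea of a hidden trigger concrete, the toy sketch below stamps a small patch onto a fraction of training images and flips their labels. It is purely illustrative; the function name, patch size, and poisoning rate are our own choices, not drawn from any specific attack.

```python
import numpy as np

def poison_dataset(images, labels, target_label, poison_fraction=0.01, seed=0):
    """Toy backdoor poisoning: stamp a small trigger patch onto a subset of
    images and relabel them so the model learns to associate the trigger
    with the attacker's target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = max(1, int(len(images) * poison_fraction))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i, -4:, -4:] = 255   # 4x4 bright patch in the corner acts as the trigger
        labels[i] = target_label    # mislabeled example creates the hidden backdoor
    return images, labels
```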

Real-world examples of data poisoning attacks highlight the risks: researchers have shown that targeted modifications to just 0.01% of popular training datasets are enough to mislead an ML model into misclassifying images. In another case, a Twitter chatbot began making offensive statements within hours of going live after being bombarded with malicious tweets from bad actors.

ML Pipeline Fundamentals and Core Challenges

Modern ML pipelines involve a multitude of organizations and actors working together across several stages of ML model development. From a security perspective, each transition between these actors represents a potential point of compromise. Before any training even begins, data scientists collect and prepare training data, and ML vendors spend significant time developing ML algorithms, modifying datasets, and fine-tuning parameters. Engineers then build and maintain the training infrastructure, external contractors may handle specialized model tuning, and operations teams manage model deployment for inferencing.

The investment in AI model development is substantial, both in terms of computational resources and human expertise. This preparatory phase is crucial and requires just as much protection as the training process itself. When data moves from the collection phase to preprocessing, how can we verify that no unauthorized modifications have occurred? When external contractors fine-tune models, how do we ensure they follow the specified procedures exactly?

Understanding the Threats

Our research so far focuses on two critical threats in the ML model lifecycle: data poisoning attacks and model supply chain risks, including compromised procedures and potential backdoors in pre-trained models. Both threat vectors highlight the importance of robust security measures and provenance tracking in ML systems.

Data Poisoning and Model Tampering

When we train AI models, we feed the models large amounts of data for learning. Data poisoning occurs when someone intentionally corrupts this training data to make the model behave incorrectly in specific situations. Attackers might add subtle modifications to images that cause misclassification, insert malicious examples that create hidden backdoors, or introduce biased data that skews the model's decisions. These attacks can be particularly dangerous because they're often difficult to detect through conventional testing methods. This is such a serious security concern that the Open Worldwide Application Security Project (OWASP) recognizes data poisoning in their Top 10 risks, vulnerabilities, and mitigations for developing and securing generative AI and large language model (LLM) applications.

The Risks of Outsourced Training and Pre-Trained Models 

Many organizations outsource their ML model training to external contractors or download models pre-trained by third parties from open model hubs like Hugging Face to save time, effort, and money. But significant security concerns arise in these scenarios, as it becomes much harder to understand the supply chain of models: Were they trained as advertised? Do they contain hidden security vulnerabilities?

For example, the use of weak network or data encryption in ML pipelines makes it easier for attackers to steal sensitive training or model data. Not following proper training procedures or cutting corners during the training process (like reducing training iterations) can compromise model accuracy and reliability. Even more concerning, pre-trained models might contain hidden malicious backdoors that activate only under specific conditions, as security researchers discovered in models hosted on popular model hubs. Supply chain vulnerabilities are another major risk highlighted in the OWASP Top 10 risks for large language model applications.

These issues highlight the importance of model provenance, integrity verification, and comprehensive security audits when working with externally sourced ML models. To address this crucial problem, we propose methods that enable us to check that model components and pipeline operations meet our expectations of integrity and security.

Building a Secure Foundation: Our Three-Layer Approach

The fundamental challenge in enhancing the trustworthiness of ML pipelines lies in tracing and verifying the integrity at every stage of this very complex process, assuming we can authenticate the organizations and actors who provide data or model components. Our verification process addresses this challenge at multiple levels:

  1. Data-level integrity:
  • Track each branch of different data sources and algorithmic choices.
  • Measure (hash) and digitally sign every artifact (e.g., data, preprocessing scripts, MLOps tools, transformation code); see the sketch following this list.
  2. Real-time verification:
  • Continuously verify that inputs and metadata from previous lifecycle stages match expectations before they are used in an operation.
  • Track ML model component transformations and computational environment traces.
  • Generate authenticated, digitally signed metadata, including source information, for each transformation.
  3. Model lineage:
  • Record the interconnections and dependencies between various ML model components.
  • Create a verifiable chain of evidence from data preparation through model deployment via linked operational metadata.
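
As a minimal sketch of the measure-and-sign step above, the following Python assumes the open-source cryptography package and an in-memory Ed25519 key; the file names and metadata layout are placeholders for illustration, not the format used in our framework.

```python
import hashlib
import json
from pathlib import Path

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def measure_artifact(path: Path) -> str:
    """Hash an artifact (dataset shard, preprocessing script, config, ...)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def sign_stage_metadata(artifacts: list[Path], stage: str, key: Ed25519PrivateKey) -> dict:
    """Produce signed metadata for one pipeline stage: per-artifact digests
    plus a signature over the whole record."""
    record = {
        "stage": stage,
        "artifacts": {p.name: measure_artifact(p) for p in artifacts},
    }
    payload = json.dumps(record, sort_keys=True).encode()
    return {"record": record, "signature": key.sign(payload).hex()}

# Example (placeholder artifact names): sign the data-preparation stage's inputs.
# key = Ed25519PrivateKey.generate()
# meta = sign_stage_metadata([Path("train.csv"), Path("preprocess.py")], "data-prep", key)
```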

Verification Challenges in Practice

The reality of implementing our three-layer approach across an entire pipeline introduces several practical challenges:

  1. Trust establishment: Tracking and verifying the trustworthiness of the numerous third-party organizations and actors who handle outsourced operations such as data preparation and certification, or model training, requires robust credential management and continuous monitoring. 
  2. Performance impact: Continuous measurement and verification add overhead to each pipeline stage. Organizations must carefully balance security requirements against performance needs. Checking signatures and computing integrity measurements requires computational resources that could impact training speeds.
  3. Partial updates: In practice, organizations often need to modify individual components or stages without rebuilding the entire pipeline. How do we maintain integrity guarantees when only part of the pipeline changes? The solution involves careful design of the verification architecture to support incremental updates while maintaining security guarantees.
  4. Operational complexity: Pipeline operators need clear indicators of integrity status and straightforward processes for handling verification failures. When a signature check fails or an integrity measurement doesn't match expected values, the system must provide actionable information to help operators understand, trace, and resolve the issue.
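
One possible way to handle the partial-update challenge in point 3 is to chain per-stage records by hash, so that replacing one stage only requires re-signing that record and re-verifying the records downstream of it. The sketch below illustrates the general idea; it is not our implementation.

```python
import hashlib
import json

def stage_digest(record: dict) -> str:
    """Canonical digest of a stage record."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def link_stage(record: dict, previous_digest: str | None) -> dict:
    """Chain a stage record to its predecessor, enabling incremental updates:
    only records downstream of a changed stage need to be re-verified."""
    linked = dict(record, previous=previous_digest)
    linked["digest"] = stage_digest({k: v for k, v in linked.items() if k != "digest"})
    return linked

def verify_chain(records: list[dict]) -> bool:
    """Check every record's digest and its link to the preceding stage."""
    prev = None
    for r in records:
        body = {k: v for k, v in r.items() if k != "digest"}
        if r["digest"] != stage_digest(body) or r.get("previous") != prev:
            return False
        prev = r["digest"]
    return True
```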

Technical Deep Dive: How We Bring Trust to the ML Model Lifecycle

At the core of our approach is a strategic combination of software and hardware technologies that enable us to record data integrity, trace model lineage, and perform real-time verification. Now we’ll dive deeper into a few of these techniques to show how they come together to bring end-to-end trust to the ML model lifecycle.

Confidential Computing and Security Foundations

Confidential computing is a fundamental technique we use to strengthen the integrity of ML pipelines. This approach relies on hardware-based security features to create isolated and authenticated environments for computation, commonly known as hardware enclaves. Solutions such as Intel® Software Guard Extensions (Intel® SGX) and Intel® Trust Domain Extensions (Intel® TDX) create trusted execution environments where integrity measurements can be performed with hardware-backed guarantees. These systems use cryptographic hashing — a process that creates unique digital fingerprints of data — to generate integrity measurements and verify that components haven't been tampered with by bad actors.

When a model training pipeline begins, we employ confidential computing to:

  • Measure and attest to the authenticity and integrity of the training code.
  • Verify input data hasn't been modified.
  • Ensure configuration parameters remain unchanged.
  • Sign measurements with hardware-protected keys.
  • Record results in transparent logging systems.
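
The snippet below is a simplified, software-only illustration of the first two steps: checking that training code and configuration match digests pinned in a signed policy before training starts. In a real deployment on Intel TDX or Intel SGX, the measurements and the signing key would be rooted in hardware rather than computed in plain Python as shown here.

```python
import hashlib
from pathlib import Path

def measure(path: str) -> str:
    """Hash a file the same way its approved version was hashed."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def verify_before_training(expected: dict[str, str]) -> None:
    """Refuse to start training unless code, data, and config match the
    digests pinned in a signed policy. In a TEE, the measurement and the key
    that signs the resulting report are hardware-backed; this is a plain
    software simulation of the check."""
    for name, want in expected.items():
        if measure(name) != want:
            raise RuntimeError(f"integrity check failed for {name}")

# `expected` would come from a signed policy, e.g. (placeholder names/digests):
# verify_before_training({"train.py": "<pinned sha256>", "config.yaml": "<pinned sha256>"})
```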

Tracing ML Model Lineage: A Comprehensive Verification Approach

Think of a model's lineage as a family tree, where each branch represents different data sources, algorithmic choices, or training parameters. When examining a finished model, we need to trace back through this tree to understand its complete heritage. This includes verifying:

  • Origin and transformations of training data.
  • Specific algorithmic choices made during model development.
  • Training parameter selections.

For example, imagine a natural language processing model trained on multiple data sources. Some branches might contain social media text, while others include formal documents. Each data source undergoes different preprocessing steps: social media text requires special cleaning for emojis and abbreviations, while formal documents need structure preservation. Our proposed verification framework uses existing software supply chain integrity methods to track each of these branches separately while maintaining their interconnections.
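
As a rough illustration of such a "family tree", the data structure below links each artifact to its direct inputs; the class and field names are ours and do not reflect the metadata schema used in our framework.

```python
from dataclasses import dataclass, field

@dataclass
class LineageNode:
    """One branch in a model's family tree: a data source, a preprocessing
    step, or a training run, linked to the nodes it was derived from."""
    name: str                       # e.g. "social-media-corpus" or "emoji-cleanup"
    kind: str                       # "dataset", "transform", or "training-run"
    digest: str                     # content hash of the artifact this node produced
    parents: list["LineageNode"] = field(default_factory=list)

def ancestry(node: LineageNode) -> list[str]:
    """Walk back through the tree to list everything a model depends on."""
    seen: list[str] = []
    stack = [node]
    while stack:
        n = stack.pop()
        seen.append(f"{n.kind}:{n.name}@{n.digest[:8]}")
        stack.extend(n.parents)
    return seen
```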

Securing the Verification Chain

Our verification process is recursive — when verifying any data or software component, the system automatically verifies all dependencies. The signing process creates a tamper-evident seal around both the data and its metadata. If someone attempts to modify either the data or its associated metadata, the signatures will no longer validate, immediately flagging potential tampering.
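
A bare-bones sketch of that recursive check is shown below; for brevity it validates a content digest rather than a full signature, and the record layout is hypothetical.

```python
import hashlib
import json
from typing import Callable

def verify_component(component: dict, lookup: Callable[[str], dict]) -> bool:
    """Recursively verify a component and everything it depends on.
    `component` holds a metadata `record` (with a `dependencies` list of
    digests) plus the `digest` sealed over that record; `lookup` resolves a
    dependency digest to its own component."""
    payload = json.dumps(component["record"], sort_keys=True).encode()
    if hashlib.sha256(payload).hexdigest() != component["digest"]:
        return False  # the record or its seal was altered: flag tampering
    return all(
        verify_component(lookup(dep), lookup)   # verify the whole dependency tree
        for dep in component["record"].get("dependencies", [])
    )
```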

Rather than treating verification as a post-process step, our research suggests that real-time, continuous verification throughout an ML model's lifecycle can help catch potential issues early, before they can propagate through the rest of the pipeline.

Early Proof of Concept Implementation

Our initial proof of concept work demonstrates the practical feasibility of comprehensive ML model lifecycle security. To implement our framework, we integrate Coalition for Content Provenance and Authenticity (C2PA) standards for metadata management, a Sigstore Rekor-based transparency log for maintaining immutable records of model preprocessing and training operations, and Intel TDX for hardware-based security guarantees.
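
To give a feel for how these pieces fit together, the sketch below shapes a C2PA-style manifest for one pipeline operation and the kind of entry that would be appended to a Rekor-based transparency log. It mirrors the spirit of those formats only; it does not use the real C2PA schema or the Rekor client API.

```python
import hashlib
import json
import time

def make_manifest(assertions: dict, previous_manifest_digest: str | None) -> dict:
    """Build a C2PA-style provenance manifest for one pipeline operation and
    seal it with a content digest (illustrative layout, not the real schema)."""
    manifest = {
        "assertions": assertions,              # e.g. who ran the step, on which inputs
        "previous": previous_manifest_digest,  # links operations into a lineage chain
        "timestamp": int(time.time()),
    }
    manifest["digest"] = hashlib.sha256(
        json.dumps(manifest, sort_keys=True).encode()
    ).hexdigest()
    return manifest

# The signed manifest digest would then be appended to a transparency log;
# here we only show the shape of such a record (placeholder dataset digest).
manifest = make_manifest({"step": "fine-tune", "dataset": "sha256:<digest>"}, None)
log_entry = {"kind": "ml-pipeline-step", "manifestDigest": manifest["digest"]}
```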

Figure 1. High-level architecture of our ML pipeline integrity framework.

The continuous verification and transparency log make it significantly more difficult for malicious actors to introduce poisoned data, as any unauthorized modifications would be promptly flagged by the system. Our proof of concept also includes tools that enable end-to-end pipeline verification.

A performance analysis of our implementation shows promising results, with latency overheads of 6% to 8% depending on the specific training scenario and dataset characteristics. This overhead includes metadata generation and verification, cryptographic operations for signing, transparency log updates and validation, and hardware-based attestation processes.

Trustworthy ML Pipelines for the Future

End-to-end security for the ML model lifecycle is a multi-layered challenge that requires careful attention to measurement, verification, and trust relationships. Success depends on combining technical solutions like hardware-based security, cryptography, and software supply chain integrity with practical considerations around usability and performance. As the field continues to evolve, organizations that invest in ML pipeline security will be better positioned to deploy trustworthy and responsible AI systems.

About the Authors

Marcin Spoczynski is a research scientist at Intel Labs specializing in the security of machine learning systems, with a recent primary focus on ensuring the integrity and provenance of training and RAG pipelines.

Marcela Melara is a research scientist at Intel Labs working on secure and trustworthy distributed systems. Her recent research focuses on developing novel techniques for software supply chain integrity and resilience.

Sebastian Szyller is a research scientist at Intel Labs focusing on security and privacy of machine learning. Recently, he has been working on provenance, robustness, and privacy in generative, multimodal systems.
