Efficient PDF Summarization with CrewAI and Intel® XPU Optimization

Eugenie_Wirz · 07-29-2025

Authors: Rahul Unnikrishnan Nair & Rahila T from Intel Liftoff for AI Startups program

This blog covers PDF summarization with CrewAI, a multi-agent framework that simplifies PDF content extraction and summarization using language models. We demonstrate how to build and run a PDF Summarizer Agent using Intel® XPU-optimized tooling for efficient inference, combining CrewAI’s orchestration, PyPDF2 for text extraction, and a T5-based model for summarization.

What sets this solution apart is Intel® hardware acceleration, which speeds up summarization of large PDF documents. By using Intel® Xe GPUs and Intel® Extension for PyTorch (IPEX), we get faster and more efficient document processing, so long and complex documents can be summarized in a fraction of the time. This hardware optimization makes Intel® XPU a strong fit for PDF summarization pipelines that need both speed and scalability.

In this blog, we’ll walk you through:

Setting up a Python virtual environment for seamless integration.
Installing essential packages for PDF extraction and summarization.
Building custom tools for PDF text extraction and summarization using CrewAI.
Defining agents and tasks in CrewAI to streamline the workflow.
Running the summarization pipeline with Intel-optimized PyTorch models for efficient AI inference.

This solution is lightweight, modular, and optimized for Intel-powered infrastructure, ideal for real-time document summarization use cases across industries like legal, healthcare, and research.

How Intel Powers the PDF Summarization Process

The efficiency of this PDF summarization system comes from Intel® hardware accelerators. By using Intel® Xe GPUs and Intel® Extension for PyTorch (IPEX), we can significantly boost the performance of our AI models, enabling them to process large PDF documents much faster. Here’s how Intel® technology accelerates the process:

Intel® Xe GPUs: These GPUs are designed for compute-heavy workloads like NLP tasks and deep learning inference. Their architecture optimizes matrix and vector operations, crucial for the parallelized computations required for deep learning models used in summarization.
Intel® XPU and IPEX for PyTorch: Intel® Extension for PyTorch (IPEX) optimizes the performance of deep learning models on both Intel® CPUs and GPUs. By accelerating model inference, IPEX helps summarization tasks run quickly and efficiently, even for large PDFs containing vast amounts of data.
Real-time Summarization: With Intel®’s optimized hardware, even large documents, such as legal case files, academic research papers, or business reports, can be summarized in near real time without compromising quality.
Energy Efficiency: Intel®’s hardware is also optimized for energy efficiency, making it ideal for enterprise-grade applications and deployments where computational power and sustainability are both priorities.

Why CrewAI for Document Summarization?

CrewAI is a lightweight, Python-based orchestration library that enables modular AI systems by combining tools, agents, and tasks in configurable workflows. For document summarization tasks:

Multi-Agent Capabilities: Clearly separates responsibilities between PDF parsing and summarization agents.
Custom Tools: Easily define domain-specific logic using user-defined tools.
Intel® XPU Support: Use PyTorch’s xpu backend to efficiently run transformers on Intel® GPUs.
Sequential Task Execution: Enables deterministic processing from raw documents to refined summaries.
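
To make the idea of sequential task execution concrete, here is a minimal conceptual sketch in plain Python. It is not CrewAI’s implementation; run_sequential and the two toy steps are hypothetical names used only to show how each step’s output becomes the next step’s input.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], initial: str) -> str:
    """Feed the output of each step into the next one, in order."""
    value = initial
    for step in steps:
        value = step(value)
    return value


def toy_read(path: str) -> str:
    # Stand-in for a PDF reading step (hypothetical, does no real I/O).
    return f"text extracted from {path}"


def toy_summarize(text: str) -> str:
    # Stand-in for a summarization step: keep only the first three words.
    return " ".join(text.split()[:3])


print(run_sequential([toy_read, toy_summarize], "sample.pdf"))
```

The fixed ordering is what makes the flow predictable: the summarizer never runs before the reader has produced text.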

Step-by-Step Guide to Building the PDF Summarizer

Step 1: Set Up Python Virtual Environment

Start by creating a Python virtual environment and installing the necessary packages:

python -m venv crewAI_env
source crewAI_env/bin/activate  # For Windows: crewAI_env\Scripts\activate
python -m pip install crewai crewai-tools PyPDF2 transformers python-dotenv
python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/test/xpu
python -m pip install ipykernel
python -m ipykernel install --user --name=crewAI_env

In your Jupyter notebook, select the kernel crewAI_env.
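
Before going further, it can help to confirm that PyTorch actually sees an XPU device, since the tools later in this guide call .to("xpu") directly. The sketch below is one way to do that; pick_device is a hypothetical helper name, and the CPU fallback is an assumption for machines without a supported Intel GPU.

```python
import importlib.util


def pick_device() -> str:
    """Return "xpu" when PyTorch reports an Intel XPU device, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return "cpu"
    import torch

    # torch.xpu is only present in builds that ship the XPU backend.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


print("Selected device:", pick_device())
```

If this prints cpu on a machine with an Intel GPU, the installed PyTorch wheel or the GPU driver is the first thing to check.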

Step 2: Load Your OpenAI API Key

CrewAI agents rely on an LLM to plan their work and call tools, and by default they use OpenAI models. Store your API key in a .env file:

OPENAI_API_KEY="your_api_key_here"

Then, load the environment variables in your notebook:

from dotenv import load_dotenv
load_dotenv()
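
A quick check right after loading can catch a missing key before any agent runs. This is a small sketch, not part of CrewAI or python-dotenv; require_env is a hypothetical helper name.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file")
    return value


# Example usage after load_dotenv():
# api_key = require_env("OPENAI_API_KEY")
```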

Step 3: Import Required Libraries

In this step, you’ll need to import libraries like CrewAI, PyPDF2, and Hugging Face’s transformers for summarization:

from crewai import Agent, Task, Crew, Process
from crewai.tools import BaseTool
from PyPDF2 import PdfReader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

Step 4: Define Custom Tools

CrewAI lets you define custom tools for specific tasks like reading PDF files and summarizing text.

PDF Reader Tool: Extracts text from PDF files.
Summarizer Tool: Summarizes text using a pre-trained T5 model.

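class PDFReaderTool(BaseTool):
    name: str = "PDF Reader"
    description: str = "Reads the content of a PDF file and returns the text."
   
    def _run(self, pdf_path: str) -> str:
        reader = PdfReader(pdf_path)
        # Image-only (scanned) pages may yield no text, so fall back to "".
        # Joining with newlines keeps words on adjacent pages from running together.
        return "\n".join((page.extract_text() or "") for page in reader.pages)

pdf_reader_tool = PDFReaderTool()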

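class SummarizerTool(BaseTool):
    name: str = "LLM Summarizer"
    description: str = "Summarize the text provided and return concise summaries."
   
    def _run(self, text: str) -> str:
        model_name = "google-t5/t5-small"
        # Loaded on every call for simplicity; cache these if the tool runs often.
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to("xpu")
        # T5-small accepts up to 512 input tokens; longer text is truncated.
        inputs = tokenizer("summarize: " + text, return_tensors="pt",
                           truncation=True, max_length=512).to("xpu")
        # Set an explicit output cap; the default generation length is very short.
        # 150 is an illustrative value, tune it for your documents.
        summary_ids = model.generate(**inputs, max_new_tokens=150)
        return tokenizer.decode(summary_ids[0], skip_special_tokens=True)

llm_summarizer = SummarizerTool()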
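
Scanned PDFs often produce pages with no extractable text, which leaves the summarizer with little or nothing to work on. The sketch below shows one way to flag such pages before summarizing; empty_page_numbers is a hypothetical helper, and it works on any sequence of per-page strings.

```python
from typing import Iterable, List, Optional


def empty_page_numbers(page_texts: Iterable[Optional[str]]) -> List[int]:
    """Return 1-based numbers of pages whose extracted text is missing or blank."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if not (text or "").strip()
    ]


# Example usage with PyPDF2 (assumes a local file named sample.pdf):
# reader = PdfReader("sample.pdf")
# print(empty_page_numbers(page.extract_text() for page in reader.pages))
```

If most pages are flagged, the document probably needs OCR before it can be summarized.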

Step 5: Define the Agents

Agents in CrewAI are responsible for executing tasks. Here, we define two agents:

Reader Agent: Responsible for extracting text from PDFs.
Summarizer Agent: Handles the summarization of text.

reader_agent = Agent(
    role="Reader",
    goal="Extract text from PDF documents.",
    verbose=True,
    memory=True,
    backstory="You are an expert in extracting text from PDF documents.",
    tools=[pdf_reader_tool],
    allow_delegation=True
)

summarizer_agent = Agent(
    role="Summarizer",
    goal="Summarize the content of documents.",
    verbose=True,
    backstory="You are skilled at summarizing long documents into concise summaries.",
    tools=[llm_summarizer],
    allow_delegation=False
)

Step 6: Create Tasks

Next, we define tasks for each agent. Tasks describe the actions agents should perform.

Read PDF Task: Extract text from the PDF.
Summarize Text Task: Summarize the extracted text.

read_pdf_task = Task(
    description="Read the content of the PDF document located at {pdf_path}",
    expected_output="Text extracted from the PDF document.",
    tools=[pdf_reader_tool],
    agent=reader_agent
)

summarize_text_task = Task(
    description="Summarize the provided text using the T5-small model.",
    expected_output="A concise and accurate summary of the document's content.",
    tools=[llm_summarizer],
    agent=summarizer_agent
)
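
The {pdf_path} placeholder in the first task description is filled in from the inputs passed at kickoff. The effect is similar to Python’s own string formatting, as this small illustration shows; CrewAI’s interpolation may differ in its details.

```python
# Same description string as in read_pdf_task above.
template = "Read the content of the PDF document located at {pdf_path}"

# Same shape as the inputs dictionary passed to crew.kickoff().
inputs = {"pdf_path": "sample.pdf"}

rendered = template.format(**inputs)
print(rendered)
```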

Step 7: Run the Workflow

Now, set up the orchestration using CrewAI’s Crew class and run the workflow:

crew = Crew(
    agents=[reader_agent, summarizer_agent],
    tasks=[read_pdf_task, summarize_text_task],
    process=Process.sequential
)

result = crew.kickoff(inputs={"pdf_path": "sample.pdf"})
print(result)
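
To see how long a run takes on your own hardware, you can wrap the call in a simple timer. This is a minimal sketch using only the standard library; timed is a hypothetical helper name.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return its result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Example usage with the crew defined above:
# result, seconds = timed(crew.kickoff, inputs={"pdf_path": "sample.pdf"})
# print(f"Pipeline finished in {seconds:.1f} s")
```

Keep in mind that the measured time covers the whole pipeline, including the agents’ LLM calls and model loading inside the summarizer tool, not only inference on the XPU.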

Example Output

Given an input PDF of an Intel whitepaper titled “Intel® Xe Graphics for AI Workloads”:

Input Text:

“Intel® Xe graphics architecture represents a revolutionary step in GPU design, delivering optimized performance for AI and machine learning applications. Xe GPUs are built to handle complex data processing tasks, such as neural network training and inference, with parallel computing capabilities that allow for greater throughput and lower latency. The scalable architecture of Intel® Xe offers solutions for a wide range of workloads, from edge computing to large-scale data centers.

In addition to raw computational power, Intel® Xe GPUs are designed with energy efficiency in mind, making them ideal for high-performance computing in data centers. By integrating Intel®’s latest AI technologies, such as Intel® oneAPI and Intel® AI Analytics Toolkit, Xe GPUs provide a unified programming model that simplifies deployment across diverse platforms. With robust support for deep learning frameworks like TensorFlow, PyTorch, and MXNet, Xe GPUs enable developers to accelerate AI workloads without needing to rewrite existing code.

The architecture also incorporates Intel®’s hardware-accelerated AI technologies, including AVX-512 and DL Boost, which further optimize model inference times. These innovations ensure that Xe GPUs provide industry-leading performance for AI applications such as natural language processing (NLP), image recognition, and predictive analytics. Whether used in edge devices for real-time inference or in large AI research labs, Intel® Xe GPUs deliver the speed and flexibility required to support the next generation of AI-driven applications.”

Summary:

“Intel® Xe graphics architecture is optimized for AI and machine learning workloads, offering scalable solutions from edge computing to data centers. With powerful parallel computing capabilities, Xe GPUs accelerate neural network training and inference, while ensuring energy efficiency. Intel®’s AI technologies, including oneAPI and AI Analytics Toolkit, simplify deployment across platforms and support frameworks like TensorFlow and PyTorch. Xe GPUs also feature hardware-accelerated technologies such as AVX-512 and DL Boost for faster model inference, making them ideal for AI applications in NLP, image recognition, and predictive analytics.”

Conclusion

This article demonstrated how to build an AI-powered PDF summarization tool using Python, transformers, and CrewAI agents. By leveraging Intel® XPU for hardware acceleration, you can automate the task of summarizing large volumes of text with remarkable efficiency.

Try it yourself and improve your productivity in dealing with documents!

Key Takeaways

Modular AI System: CrewAI allows for easy integration of custom tools and agents to handle different tasks in the summarization pipeline.
Intel® XPU Optimization: The solution runs efficiently on Intel® hardware, delivering faster and more cost-effective results.
Extensibility: This setup can be easily expanded to support other document-related tasks like classification, translation, or indexing.

Troubleshooting & Common Issues

Here are a few common issues you might encounter during deployment and their solutions:

Issue	Cause	Solution
xpu device not found	PyTorch not installed with xpu support	Reinstall PyTorch from the XPU wheel index used in Step 1, then confirm that torch.xpu.is_available() returns True
Text extraction returns None	PDF contains scanned images instead of text	Use OCR tools like pytesseract
Memory overflow or device crash	Insufficient memory or device compatibility	Use a smaller model or check system resources
API key issues	.env file not properly loaded	Verify the .env file and API key setup
Token limit exceeded	Document too long for a single input prompt	Split document into chunks for summarization
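
For the token-limit case, one simple approach is to split the extracted text into overlapping chunks, summarize each chunk, and then join or re-summarize the partial summaries. The sketch below splits by words as a rough stand-in for tokens; chunk_words and its default sizes are assumptions chosen for illustration, not values from this blog.

```python
from typing import List


def chunk_words(text: str, max_words: int = 300, overlap: int = 30) -> List[str]:
    """Split text into chunks of at most max_words words, sharing overlap words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            # The last chunk already reaches the end of the text.
            break
    return chunks


# Example usage with the summarizer tool defined earlier:
# partial = [llm_summarizer._run(chunk) for chunk in chunk_words(long_text)]
# summary = " ".join(partial)
```

Word counts only approximate token counts, so leave some headroom below the model’s input limit when choosing max_words.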