
Building Efficient Multi-Modal AI Agents with Model Context Protocol (MCP)


Authors:

Rahul Unnikrishnan Nair, Head of Engineering, Intel Liftoff

Sri Raj Aryan Karumuri, Sr. Solutions Engineer, Intel Liftoff

 

Beyond Monolithic LLMs

 

Generative AI is evolving fast and so are the conversations around it. While large language models like GPT-4, Claude, and Llama have taken center stage, they come with real trade-offs: high computational costs, latency, and deployment challenges.
What if there was a more efficient approach? One that leverages smaller, specialized models working together through a standardized protocol to achieve comparable or even superior results for specific tasks?
This blog explores how the Model Context Protocol (MCP) and Intel accelerators (Intel Max Series GPUs) enable the creation of efficient, modular AI agents without relying on heavyweight frameworks. We’ll dive into a practical example: a multi-modal recipe generation system that analyzes food images, identifies ingredients, searches for relevant recipes, and generates customized cooking instructions. 

The food image the recipe generator agent sees

 


Generated Recipe

 

From LLMs to Agent Frameworks: The Evolution of AI Systems

 

The Rise of Large Language Models

 

Large Language Models (LLMs) have revolutionized AI by demonstrating remarkable capabilities across diverse tasks. These models, trained on vast corpora of text data, can generate human-like text, answer questions, translate languages, and even write code. Their key strength lies in their generality - a single model can handle a wide range of tasks without task-specific training.
However, this generality comes at a cost:

  1. Computational Demands: Running state-of-the-art LLMs requires significant computational resources.
  2. Latency Issues: Larger models can introduce higher inference latency.
  3. Deployment Complexity: Deploying massive models in production environments presents challenges.
  4. Black Box Nature: Understanding exactly how these models arrive at specific outputs can be difficult.

 

The Emergence of Agent Frameworks

 

To address some of these limitations and extend LLM capabilities, agent frameworks like LangChain, AutoGPT, and others emerged. These frameworks enable LLMs to:

  1. Access External Tools: Connect to databases, APIs, and other external systems
  2. Maintain Context: Preserve information across multiple interactions
  3. Follow Multi-Step Reasoning: Break complex tasks into manageable steps
  4. Adapt Dynamically: Change strategies based on intermediate results

Agent frameworks typically implement, or help developers implement, patterns like ReAct (Reasoning + Acting), which combines:
- Reasoning: step-by-step thinking through the problem
- Acting: taking concrete actions based on that reasoning
A minimal, framework-free sketch of this loop follows the diagram below.


Diagram 1: ReAct Pattern
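
The sketch below shows the shape of such a loop in plain Python. It is purely illustrative: llm_think and run_tool are hypothetical stand-ins, stubbed here, for a real model call and a real tool dispatcher.

def llm_think(context: str) -> tuple[str, str, str]:
    # Stub: a real implementation would prompt an LLM with the running context
    # and parse its reply into (thought, action, action_input).
    return ("I have enough information to answer.", "finish", "42")

def run_tool(action: str, action_input: str) -> str:
    # Stub: a real implementation would dispatch to a search tool, calculator, etc.
    return f"Result of {action}({action_input})"

def react_loop(question: str, max_steps: int = 5) -> str:
    context = f"Question: {question}"
    for _ in range(max_steps):
        thought, action, action_input = llm_think(context)  # Reasoning step
        if action == "finish":
            return action_input                             # Final answer
        observation = run_tool(action, action_input)        # Acting step
        context += f"\nThought: {thought}\nAction: {action}\nObservation: {observation}"
    return "No answer found within the step limit."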

While agent frameworks significantly extend LLM capabilities, they often:
- Remain tightly coupled to specific LLM providers
- Lack standardization in how tools and context are provided
- Can be complex to configure and maintain
- May introduce additional latency through their orchestration layers
By no means should these frameworks be avoided; they often provide a faster path to developing complex agents. The challenge typically emerges as these systems evolve, potentially becoming monolithic and bound by the specific constraints of the framework.

 

The Model Context Protocol (MCP): A New Paradigm

 

What is MCP?

 

The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to AI models. Think of MCP as the “USB-C port for AI” - just as USB-C provides a standardized way to connect devices to various peripherals, MCP provides a standardized way to connect AI models to different data sources and tools.
MCP was initially developed by Anthropic for their Claude AI assistant but has since been released as an open protocol that any AI system can implement. The protocol defines how AI systems can:

  1. Access Resources: Standardized ways to retrieve information
  2. Use Tools: Execute functions and receive structured results
  3. Follow Prompts: Use templates for common interaction patterns
  4. Sample Text: Generate text completions with specific parameters

 

Key Components of MCP

 

MCP consists of several core components:

  1. MCP Servers: Lightweight programs that expose specific capabilities through the standardized protocol
  2. MCP Clients: Applications that connect to MCP servers and use their capabilities
  3. Transport Layer: Defines how messages are exchanged (typically via Server-Sent Events or stdio)
  4. Message Types: Standardized formats for requests, responses, and notifications


Diagram 2: MCP
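
Concretely, the messages exchanged over the transport layer are JSON-RPC 2.0. A tool invocation, for example, looks roughly like the sketch below (based on the public MCP specification; the field values are illustrative), shown here as Python dictionaries:

# Client -> server: ask the server to execute one of its tools
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",
        "arguments": {"location": "Portland", "unit": "celsius"},
    },
}

# Server -> client: the structured result of the tool call
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text", "text": "Weather in Portland: 22°C, Partly Cloudy"}
        ],
        "isError": False,
    },
}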

 

Why MCP Matters

 

MCP offers several significant advantages over traditional agent frameworks:

  1. Standardization: A common protocol for all AI systems to interact with tools and data
  2. Isolation: Clear separation between AI models and the tools they use
  3. Security: Tools run in isolated environments with explicit permissions
  4. Modularity: Easy to add, remove, or update individual components
  5. Interoperability: Switch between different AI providers without changing tools
  6. Efficiency: Use specialized models for specific tasks rather than one large model for everything

 

FastMCP: A Pythonic Implementation of MCP

 

While the MCP protocol can be implemented directly, frameworks like FastMCP make it much easier to build MCP servers and clients. FastMCP is a high-level, Pythonic framework inspired by FastAPI that simplifies MCP implementation.

 

Key Features of FastMCP

 

FastMCP provides:

  1. Simple Tool Creation: Create tools with Python function decorators
  2. Resource Management: Easily expose data as resources
  3. Prompt Templates: Define reusable interaction patterns
  4. Client Library: Connect to and use MCP servers
  5. Server Composition: Combine multiple servers into unified interfaces
  6. Async Support: Built on modern async Python

Here’s a simple example of creating an MCP tool with FastMCP:

from fastmcp import FastMCP

server = FastMCP("WeatherServer")

@server.tool()
def get_weather(location: str, unit: str = "celsius") -> str:
    """
    Get the current weather for a location.
   
    Args:
        location: City or location name
        unit: Temperature unit (celsius or fahrenheit)
       
    Returns:
        Current weather information
    """
    # Implementation details here
    return f"Weather in {location}: 22°{unit[0].upper()}, Partly Cloudy"

if __name__ == "__main__":
    server.run()
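
On the client side, FastMCP also provides a Client class for calling these tools. Here is a minimal sketch of a caller for the server above; it assumes the server is started with an SSE transport (e.g. server.run("sse"), as in the later examples) and listens on localhost:8000, so adjust the URL to your setup:

import asyncio
from fastmcp import Client

async def main():
    # Connect to the WeatherServer over SSE and invoke its tool
    client = Client("http://localhost:8000/sse")
    async with client:
        result = await client.call_tool(
            "get_weather", {"location": "Portland", "unit": "celsius"}
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(main())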

 

Building a Multi-Modal Recipe Agent with MCP

 

Let's now build the multi-modal recipe agent introduced earlier. To keep the focus on applying the Model Context Protocol and to keep this post manageable in length, the following code snippets are illustrative: they demonstrate the core interactions and structure, with some internal implementation details simplified or stubbed. A full production codebase would naturally be more extensive. The system will:

  1. Analyze food images to identify ingredients
  2. Search for relevant recipes
  3. Generate customized cooking instructions

 

System Architecture Overview

 

Our multi-modal recipe agent consists of three specialized MCP servers and a client orchestrator:


Diagram 3: Recipe Generator Agent

 

Each component has a specific role:

  1. Vision Server: Identifies food items in images using a specialized vision model
  2. Search Server: Searches for recipes based on identified ingredients
  3. LLM Server: Generates customized recipes based on ingredients and search results
  4. Client Orchestrator: Coordinates the workflow between servers

 

Component 1: Vision Server

 

The Vision Server is responsible for analyzing food images and identifying ingredients. It uses Moondream2, one of the best lightweight vision models we have worked with, prompted here for food item detection.


Diagram 4: Vision Server

 

Implementation Details

 

The Vision Server exposes a single tool called identify_food_items that takes an image path and returns a list of identified food items.

from fastmcp import FastMCP
from transformers import AutoProcessor, AutoModelForImageTextToText
from PIL import Image

server = FastMCP("VisionServer", host="0.0.0.0", port=8000)
model_name = "vikhyatk/moondream2"
processor = AutoProcessor.from_pretrained(model_name)
model = AutoModelForImageTextToText.from_pretrained(model_name)

@server.tool()
def identify_food_items(image_path: str) -> str:
    """
    Identify food items in an image.
   
    Args:
        image_path: Path to the image file
       
    Returns:
        String with detected food items
    """
    # Load and process the image
    image = Image.open(image_path).convert("RGB")
   
    # Generate prompt for food detection
    prompt = "What food items can you see in this image? Return a JSON array of strings."
   
    # Process image and prompt
    inputs = processor(text=prompt, images=image, return_tensors="pt")
   
    # Generate response
    outputs = model.generate(**inputs, max_new_tokens=512)
    food_items = processor.decode(outputs[0], skip_special_tokens=True)
   
    return food_items

if __name__ == "__main__":
    server.run("sse")

The Vision Server is optimized for a specific task - food item detection - and doesn’t need the full capabilities of a general-purpose LLM.

 

Component 2: Search Server

 

The Search Server is responsible for finding recipes based on the identified ingredients. It uses DuckDuckGo search to find relevant recipes.


Diagram 5: Search Server

 

Implementation Details

 

The Search Server exposes a search_recipes tool that takes the identified ingredients (as a string) and returns relevant recipe information as JSON.

from fastmcp import FastMCP
import json
import logging
from langchain_community.tools import DuckDuckGoSearchRun

server = FastMCP("SearchServer", host="0.0.0.0", port=8002)
search_tool = DuckDuckGoSearchRun()

@server.tool()
def search_recipes(ingredients: str) -> str:
    """
    Search for recipes based on provided ingredients.
   
    Args:
        ingredients: List of ingredients to search recipes for
       
    Returns:
        JSON string with recipe information
    """
    # Search for recipes
    query = f"recipes with {ingredients} easy homemade"
    search_results = search_web(query)
   
    # Extract recipe names from search results
    recipes = []
    if search_results and len(search_results) > 100:
        for line in search_results.split("\n"):
            if line.strip():  # skip empty lines
                recipes.append(line.strip())
        recipes = recipes[:5]  # keep at most five entries
   
    # Return formatted results
    result = {
        "ingredients": ingredients,
        "recipes": recipes,
        "full_results": search_results[:1000] if search_results else "",
    }
   
    return json.dumps(result)

@server.tool()
def search_web(query: str) -> str:
    """
    Search the web for information using DuckDuckGo.
   
    Args:
        query: The search query
       
    Returns:
        Search results as text
    """
    try:
        search_results = search_tool.invoke(query)
        return search_results
    except Exception as e:
        logging.warning(f"Web search failed: {e}")
        return "No search results found."

if __name__ == "__main__":
    server.run("sse")

The Search Server demonstrates how MCP can integrate with existing tools and libraries like LangChain’s DuckDuckGoSearchRun.

 

Component 3: LLM Server

 

The LLM Server generates customized recipes based on the identified ingredients and search results. It uses a smaller, more efficient language model (Qwen2.5-3B-Instruct) that’s specialized for text generation.


Diagram 6: LLM server

 

Implementation Details

 

The LLM Server exposes a generate_recipe tool that takes ingredients and search results and returns a customized recipe.

from fastmcp import FastMCP
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from typing import Any

server = FastMCP("LLMServer", host="0.0.0.0", port=8001)

# Initialize the model and tokenizer
model_name = "Qwen/Qwen2.5-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Create a text generation pipeline
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

@server.tool()
def generate_recipe(ingredients: Any, search_results: str = "", max_new_tokens: int = 512) -> str:
    """
    Generate a recipe based on provided ingredients and optional search results.
   
    Args:
        ingredients: List of ingredients
        search_results: Optional search results to inform recipe generation
        max_new_tokens: Maximum number of new tokens to generate
       
    Returns:
        Generated recipe text
    """
    # Prepare the prompt
    prompt_parts = [f"Based on these ingredients: {ingredients}"]
   
    if search_results and len(search_results) > 10:
        prompt_parts.append(f"And considering these recipe ideas: {search_results}")
   
    prompt_parts.append("Create a detailed recipe with the following format:")
    prompt_parts.append("Recipe name: [Creative name]")
    prompt_parts.append("Brief description: [Short description]")
    prompt_parts.append("Ingredients: [List of ingredients with quantities]")
    prompt_parts.append("Simple step-by-step instructions: [Numbered steps]")
    prompt_parts.append("Cooking time: [Time in minutes]")
    prompt_parts.append("Servings: [Number of servings]")
   
    prompt = "\n".join(prompt_parts)
   
    # Generate the recipe
    response = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
   
    # Extract and format the generated text
    generated_text = response[0]["generated_text"]
    recipe_text = generated_text[len(prompt):].strip()
   
    return recipe_text

if __name__ == "__main__":
    server.run("sse")

The LLM Server, similar to our ingredients detector, uses a smaller, efficient language model (3B parameters) that’s fine-tuned specifically for instruction following and text generation. This specialized model provides excellent recipe generation capabilities without the computational overhead of a massive general-purpose LLM.

 

Component 4: Client Orchestrator

 

The Client Orchestrator coordinates the workflow between the three specialized servers. It handles user input, manages the sequence of operations, and presents the final output.


Diagram 7: Client Orchestrator

 

Implementation Details

 

The Client Orchestrator uses the MCP client library to connect to and interact with the three specialized servers.

import argparse
import asyncio
import json
import logging
import os
from typing import Any, Dict

from fastmcp import Client

class MultiModalAgent:
    """Multi-Modal Agent for analyzing food images and suggesting recipes"""

    def __init__(self):
        # Server configurations
        self.vision_server_url = os.environ.get("VISION_SERVER_URL", "http://localhost:8000")
        self.llm_server_url = os.environ.get("LLM_SERVER_URL", "http://localhost:8001")
        self.search_server_url = os.environ.get("SEARCH_SERVER_URL", "http://localhost:8002")
       
        # Retry configuration
        self.max_retries = 3
        self.retry_delay = 5  # seconds

    async def suggest_recipe(self, image_path: str) -> Dict[str, Any]:
        """Analyze a food image and suggest recipes"""
        print("🧠 Initializing Multi-Modal Recipe Agent...")
        print(f"️ Analyzing food image file: {image_path}")
       
        # Step 1: Identify food items in the image
        food_items = await self._identify_food_items(image_path)
        print(" Detected food items")
       
        # Step 2: Search for recipes based on the identified ingredients
        recipe_search = await self._get_recipe_suggestions(food_items)
        print(" Found recipe ideas")
       
        # Step 3: Generate a customized recipe
        recipe = await self._generate_recipe(food_items, recipe_search)
        print(" Recipe Suggestion:")
       
        return {
            "food_items": food_items,
            "recipe": recipe
        }

    async def _call_mcp_tool(self, server_url: str, tool_name: str, params: Dict[str, Any]) -> Any:
        """Call an MCP tool with simple retry logic"""
        for attempt in range(self.max_retries + 1):
            try:
                client = Client(f"{server_url}/sse")
                async with client:
                    result = await asyncio.wait_for(
                        client.call_tool(tool_name, params), timeout=60.0
                    )
                    return result
            except Exception as e:
                if attempt < self.max_retries:
                    await asyncio.sleep(self.retry_delay)
        return None

    async def _identify_food_items(self, image_path: str) -> str:
        """Identify food items in image using the Vision Server"""
        result = await self._call_mcp_tool(
            self.vision_server_url, "identify_food_items", {"image_path": image_path}
        )
        return self._extract_text_from_response(result)

    async def _get_recipe_suggestions(self, food_items: str) -> str:
        """Get recipe suggestions using Search Server"""
        search_result = await self._call_mcp_tool(
            self.search_server_url, "search_recipes", {"ingredients": food_items}
        )
        return self._extract_text_from_response(search_result)

    async def _generate_recipe(self, ingredients: str, search_results: str) -> str:
        """Generate a recipe using LLM Server"""
        llm_params = {
            "ingredients": str(ingredients),
            "search_results": str(search_results),
            "max_tokens": 1000,
        }
        recipe_result = await self._call_mcp_tool(
            self.llm_server_url, "generate_recipe", llm_params
        )
        return self._extract_text_from_response(recipe_result)

    def _extract_text_from_response(self, response: Any) -> str:
        """Helper method to extract text content from MCP responses"""
        return str(response)

async def main():
    """Main function to run the multi-modal agent for recipe suggestions"""
    parser = argparse.ArgumentParser(
        description="Multi-Modal Agent for Recipe Suggestions"
    )
    parser.add_argument(
        "--image", type=str, required=True, help="Path to food image file"
    )
    args = parser.parse_args()

    agent = MultiModalAgent()
    result = await agent.suggest_recipe(args.image)
    print(result["recipe"])

if __name__ == "__main__":
    asyncio.run(main())

The Client Orchestrator demonstrates how MCP enables the composition of specialized services into a cohesive workflow. Each server focuses on a specific task, and the orchestrator manages the flow of information between them.
While traditional agent frameworks like those using the ReAct pattern rely on a model with explicit reasoning steps, the MCP-based multi-modal agent above takes a different approach. It distributes intelligence across specialized components, each optimized for its specific modality or task, while still maintaining the core capability of processing and integrating multiple types of data (images and text) into a cohesive output.

 

Why This Architecture Matters: The Power of Specialization

 

The multi-modal recipe agent demonstrates several key advantages of the MCP approach:

 

1. Efficiency Through Specialization

 

Each component in our system is optimized for a specific task:

  • Vision Server: Uses a fast vision model for food item detection
  • Search Server: Focuses on web search and result extraction
  • LLM Server: Uses a small, efficient language model for text generation

This specialization allows us to achieve excellent results with significantly lower computational requirements than using a single massive model for everything.

 

2. Isolation and Security

 

Each server runs in its own isolated environment with clearly defined inputs and outputs. This isolation provides several benefits:

  • Security: Each component has only the permissions it needs
  • Reliability: Issues in one component don’t affect others
  • Maintainability: Components can be updated independently

 

3. Flexibility and Interoperability

 

The MCP architecture makes it easy to:

  • Swap Components: Replace any server with an alternative implementation
  • Add Capabilities: Extend the system with new servers and tools
  • Scale Independently: Allocate resources based on each component’s needs
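
In our recipe agent, for instance, pointing SEARCH_SERVER_URL at a different MCP server that exposes the same search_recipes tool swaps out the entire search backend without touching the orchestrator code.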

 

4. Reduced Latency

 

By using specialized models and efficient communication patterns, the MCP approach can achieve lower end-to-end latency than monolithic systems. Each component does exactly what it needs to do, without the overhead of a massive general-purpose model.

 

Containerization and Deployment

 

One of the key advantages of our MCP-based architecture is the ease of containerization and deployment. Each server can be packaged as a separate Docker container, allowing for independent scaling, updates, and resource allocation.

 

Docker Containerization

 

For our multi-modal recipe agent, we created separate Docker containers for each server and the client orchestrator:


Diagram 8: Container Orchestration

 

Here’s an example Dockerfile for the Vision Server:

FROM python:3.10-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Create cache directory with proper permissions
RUN mkdir -p /app/.cache && \
    chmod -R 777 /app/.cache

COPY servers/vision_server.py .

CMD ["python", "vision_server.py"]

Using Docker Compose, we can easily orchestrate the deployment of all services:

services:
  vision-server:
    build:
      context: .
      dockerfile: docker/Dockerfile.vision
    ports:
      - "8000:8000"
    volumes:
      - ./data:/app/data

  search-server:
    build:
      context: .
      dockerfile: docker/Dockerfile.search
    ports:
      - "8002:8002"

  llm-server:
    build:
      context: .
      dockerfile: docker/Dockerfile.llm
    ports:
      - "8001:8001"

  client:
    build:
      context: .
      dockerfile: docker/Dockerfile.client
    depends_on:
      - vision-server
      - search-server
      - llm-server
    volumes:
      - ./data:/app/data

This containerized approach provides several benefits:

  1. Isolation: Each service runs in its own container with only the dependencies it needs. This isolation is especially critical because MCP servers can also execute arbitrary code; therefore, under no circumstances should these services be run directly on the host system.
  2. Portability: The entire system can be deployed on any platform that supports Docker
  3. Scalability: Individual services can be scaled independently based on demand
  4. Versioning: Each service can be versioned and updated independently

 

Optimizing for Intel GPUs

 

One additional advantage of our modular approach is the ability to optimize each component for specific hardware. In our case, we can leverage Intel® Data Center GPU Max 1100 for efficient inference across all components.

 

Intel Extension for PyTorch (IPEX)

 

Intel Extension for PyTorch (IPEX) is a library that extends PyTorch with optimizations for Intel hardware. It can significantly improve the performance of PyTorch models on Intel CPUs and GPUs.
Here’s how we can modify our LLM Server to use IPEX:

import intel_extension_for_pytorch as ipex
from fastmcp import FastMCP
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from typing import Any

server = FastMCP("LLMServer", host="0.0.0.0", port=8001)

# Initialize the model and tokenizer
model_name = "Qwen/Qwen2.5-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Optimize the model with IPEX
model = ipex.llm.optimize(model)  # optionally pass a dtype such as torch.bfloat16

# Create a text generation pipeline
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

 

Similar optimizations can be applied to the Vision Server and any other components that use PyTorch models.
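
For instance, the Vision Server could apply IPEX's general-purpose ipex.optimize API in the same spirit. The following is a minimal sketch; it assumes an IPEX build with XPU support and an available Intel GPU, and the exact calls may vary by IPEX version:

import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoProcessor, AutoModelForImageTextToText

model_name = "vikhyatk/moondream2"
processor = AutoProcessor.from_pretrained(model_name)
model = AutoModelForImageTextToText.from_pretrained(model_name)

# Move the model to the Intel GPU and apply IPEX's inference optimization
# (bfloat16 is a common choice on Max Series GPUs)
model = model.eval().to("xpu")
model = ipex.optimize(model, dtype=torch.bfloat16)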

 

Lessons Learned: Insights from Building a Multi-Modal MCP Agent

 

Developing our multi-modal recipe agent provided valuable insights that can benefit others building similar systems:

 

1. Simplicity Over Complexity

 

One of our key learnings was the value of simplicity. Initially, we considered using complex agent frameworks like LangChain for the search functionality. However, we found that direct implementation using the DuckDuckGo search library provided better control and reduced dependencies.

# Before: Using LangChain's wrapper
from langchain.agents import Tool
from langchain_community.tools import DuckDuckGoSearchRun
search_tool = DuckDuckGoSearchRun()
# After: Direct implementation with the library
from duckduckgo_search import DDGS
ddgs = DDGS()
results = ddgs.text(query, max_results=10)

The lesson: Always question whether you need the full complexity of a framework or if a simpler, more direct approach would suffice.

 

2. Clear Interfaces Simplify Development

 

The MCP protocol enforces clear interfaces between components, which significantly simplified development and testing. Each server could be developed and tested independently, with well-defined inputs and outputs.
This approach allowed us to:
- Develop components in parallel
- Test components in isolation
- Replace implementations without affecting other parts of the system
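
For example, a server can be exercised on its own with a few lines of client code before it is wired into the orchestrator. Here is a sketch, assuming the Search Server from earlier is running locally on port 8002 with the SSE transport:

import asyncio
from fastmcp import Client

async def test_search_server():
    # Call the Search Server directly, with no vision or LLM servers running
    client = Client("http://localhost:8002/sse")
    async with client:
        result = await client.call_tool(
            "search_recipes", {"ingredients": "tomato, basil, mozzarella"}
        )
        print(result)

if __name__ == "__main__":
    asyncio.run(test_search_server())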

 

3. User Experience Matters

 

Even in a technical system, user experience considerations are important. We found that adding simple visual cues and clear status messages significantly improved the user’s understanding of the system’s operation.

 

4. Containerization Simplifies Deployment and Process-Level Isolation

 

Using Docker containers for each component made deployment and testing much simpler.
We could easily:
- Test different configurations
- Deploy to different environments
- Scale individual components based on demand
The containerized approach also ensured consistency between development and production environments, reducing the “it works on my machine” problem.

 

The Future of Gen AI is Modular

 

The multi-modal recipe agent we’ve explored demonstrates a powerful alternative to monolithic LLM-based systems. By leveraging the Model Context Protocol (MCP) and specialized components, we can create AI systems that are:

  1. More Efficient: Using the right tool for each job
  2. More Secure: Isolating components and limiting permissions
  3. More Flexible: Easily swapping or upgrading components
  4. More Maintainable: Clearly defined interfaces and responsibilities
  5. More Resilient: Handling failures gracefully through proper error management
  6. More User-Friendly: Providing clear feedback and intuitive interactions

This approach represents a shift from the “one massive model for everything” paradigm to a more modular, specialized architecture. As AI continues to evolve, we expect to see more systems adopt this approach, combining the strengths of different models and tools through standardized protocols like MCP.
The future of AI is about building smarter systems, ones made of specialized parts that work together with purpose. MCP provides the standardized “connective tissue” that makes this possible, opening up new possibilities for efficient, powerful AI applications that can run on a variety of hardware, from powerful servers to edge devices.
This modularity extends to diverse architectural strategies. While our recipe agent showcases an MCP-native orchestration for optimal control and leanness, MCP can also serve as a robust foundation for tools within hybrid architectures. In such scenarios, agent development frameworks could manage high-level planning, reasoning (e.g., using ReAct-like patterns), and conversational flow, while relying on MCP for standardized, secure, and efficient access to a rich ecosystem of specialized AI models, data sources, and traditional tools. This allows teams to leverage the strengths of both approaches – sophisticated agentic control from frameworks and a cleanly defined, interoperable service layer via MCP.
By using this modular approach, we can create AI systems that are not only more capable but also more accessible, efficient, and adaptable to specific needs. The multi-modal recipe agent is just one example of what’s possible when we break free from the constraints of monolithic models and embrace the power of specialized, interconnected components.

Try it Yourself on Intel® Tiber™ AI Cloud: You can run this code and explore the performance of the Intel Data Center GPU Max 1100 directly. Intel Tiber™ AI Cloud offers:

  • Free JupyterLab Environment: Get hands-on access to a Max 1100 GPU for training and experimentation by creating an account at cloud.intel.com and launching a GPU-accelerated notebook from the "Training" section.
  • Virtual Machines & Bare Metal: Access single Max 1100 GPU VMs (starting at $0.39/hr/card) or powerful multi-GPU systems connected via high-speed bridges. PoC credits are available for qualifying AI startups via the Intel® Liftoff program. Find more details on the Intel Tiber™ AI Cloud Pricing Page.

 

References

 

  1. Model Context Protocol (MCP): https://modelcontextprotocol.io/
  2. FastMCP: https://gofastmcp.com/
  3. Anthropic Claude: https://www.anthropic.com/claude
  4. Intel Extension for PyTorch: https://github.com/intel/intel-extension-for-pytorch
  5. ReAct: Synergizing Reasoning and Acting in Language Models: https://arxiv.org/abs/2210.03629
  6. DuckDuckGo Search Python Library: https://github.com/deedy5/duckduckgo_search
  7. Moondream2 Vision Model: https://huggingface.co/vikhyatk/moondream2
  8. Qwen2.5-3B-Instruct Model: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct

 

Related resources

 

Intel® Tiber™ AI Cloud - Cloud platform for AI development and deployment
Intel® Gaudi® 2 AI accelerator - High-performance AI training processor designed for deep learning workloads

 
