
Leveraging Intel® Advanced Matrix Extensions (Intel® AMX) in Amazon EC2 C7i for Inference (part 1/2)


Intel® Advanced Matrix Extensions (Intel® AMX) in Amazon EC2 C7i Instances:

Amazon EC2 C7i[i] instances are a new generation of compute-optimized instances powered by custom 4th Gen Intel® Xeon® Scalable processors. These instances are designed to deliver enhanced performance for compute-intensive workloads, and one of their key features is the support for Intel Advanced Matrix Extensions (Intel AMX).

What is Intel AMX?

Intel Advanced Matrix Extensions[ii] (Intel AMX) is an x86 instruction set extension designed to accelerate matrix operations, which are fundamental in deep learning and AI workloads. Intel AMX introduces two main components: a 2-dimensional register file called "tiles" and an accelerator for Tile Matrix Multiplication (TMUL). These components enable efficient handling of matrix multiplications, significantly boosting AI and machine learning performance.
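To make the tile and TMUL idea more concrete, here is a minimal software sketch of what one tile multiply-accumulate computes for INT8 inputs: small 8-bit matrices are multiplied and the products are summed into 32-bit accumulators. It is an illustration only and not how Intel AMX is programmed. The function name `tile_matmul_int8` is hypothetical, the B operand uses plain row-major layout rather than the packed layout the hardware expects, and Python integers do not overflow, so 32-bit wraparound is not modeled. The size limits assume the commonly documented tile shape of up to 16 rows of 64 bytes.

```python
# Illustrative emulation only: real Intel AMX performs this in hardware
# on tile registers. Names and layout here are assumptions for clarity.
TILE_ROWS = 16        # assumed maximum rows in one tile
TILE_ROW_BYTES = 64   # assumed maximum bytes in one tile row
ACC_PER_ROW = TILE_ROW_BYTES // 4  # 32-bit accumulators that fit in one row


def tile_matmul_int8(a, b, c):
    """Accumulate the product a x b into c and return c.

    a is m x k and b is k x n, both holding signed 8-bit values.
    c is m x n and holds the running integer sums.
    """
    m, k, n = len(a), len(b), len(b[0])
    if m > TILE_ROWS or n > ACC_PER_ROW or k > TILE_ROW_BYTES:
        raise ValueError("operands do not fit in a single tile")
    if any(len(row) != k for row in a) or any(len(row) != n for row in b):
        raise ValueError("matrix shapes are inconsistent")
    if any(not -128 <= v <= 127 for row in a + b for v in row):
        raise ValueError("inputs must be signed 8-bit values")
    for i in range(m):
        for j in range(n):
            # Each output element is a dot product added to the accumulator.
            c[i][j] += sum(a[i][p] * b[p][j] for p in range(k))
    return c


if __name__ == "__main__":
    acc = [[0, 0], [0, 0]]
    print(tile_matmul_int8([[1, 2], [3, 4]], [[5, 6], [7, 8]], acc))
```

Because the result is accumulated rather than overwritten, a larger matrix multiplication can be built by calling the same step repeatedly over successive blocks of the inputs.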

Benefits of Intel AMX in C7i Instances:

  1. Performance Improvement: C7i instances leverage Intel AMX to accelerate matrix multiplication operations, which are crucial for CPU-based machine learning (ML) applications. This results in up to 15% better price-performance than previous generation C6i instances.
  2. Enhanced AI Capabilities: Intel AMX supports lower-precision data types such as INT8 and BF16, which are essential for AI inference and training. This allows each core in a C7i instance to perform up to 2,048 INT8 operations or 1,024 BF16 operations per cycle, significantly improving throughput for AI workloads.
  3. Scalability: C7i instances offer larger instance sizes and support up to 128 EBS volume attachments, enabling the processing of larger datasets and scaling of workloads more efficiently.
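The per-cycle figures in item 2 can be reproduced with some simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a measurement. It assumes a tile of 16 rows by 64 bytes, 32-bit accumulators, each multiply-accumulate counted as two operations, and one tile multiply instruction issued every 16 cycles. The function names and the core count and clock frequency in the usage example are hypothetical placeholders, not C7i specifications.

```python
def ops_per_cycle(element_bytes, tile_rows=16, tile_row_bytes=64,
                  cycles_per_instruction=16):
    """Estimate peak operations per cycle per core for one data type.

    All defaults are assumptions stated in the text above.
    """
    k = tile_row_bytes // element_bytes   # input elements per tile row
    n = tile_row_bytes // 4               # 32-bit accumulators per output row
    macs = tile_rows * n * k              # multiply-accumulates per instruction
    return 2 * macs // cycles_per_instruction


def peak_tera_ops(ops_cycle, cores, ghz):
    """Theoretical peak in tera-operations per second (placeholder inputs)."""
    return ops_cycle * cores * ghz * 1e9 / 1e12


if __name__ == "__main__":
    print("INT8 ops/cycle/core:", ops_per_cycle(1))   # 1-byte elements
    print("BF16 ops/cycle/core:", ops_per_cycle(2))   # 2-byte elements
    # Hypothetical example: 96 cores at 2.0 GHz, for illustration only.
    print("Peak INT8 TOPS:", peak_tera_ops(ops_per_cycle(1), 96, 2.0))
```

Under these assumptions the calculation yields 2,048 for INT8 and 1,024 for BF16, matching the figures quoted above. Real throughput depends on memory bandwidth, clock behavior under load, and software efficiency, so the peak estimate is an upper bound only.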

Technical Specifications

Processor: Custom 4th Gen Intel Xeon Scalable processors.

Memory: Latest DDR5 memory, offering more bandwidth compared to DDR4.

Instance Sizes: Up to 192 vCPUs and 384 GiB memory.

Built-in Accelerators: Data Streaming Accelerator (DSA), In-Memory Analytics Accelerator (IAA), and QuickAssist Technology (QAT) for efficient offload and acceleration of data operations.

Amazon EC2 C7i instances, with their support for Intel AMX, provide a powerful platform for running compute-intensive and AI workloads, offering significant performance improvements and scalability options.
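Before relying on Intel AMX on a given instance, it can be useful to confirm that the operating system exposes it. The sketch below assumes a Linux guest with a recent kernel, where the CPU feature flags `amx_tile`, `amx_int8`, and `amx_bf16` appear in `/proc/cpuinfo`. The helper names are hypothetical, and on other operating systems or older kernels the check will simply report nothing.

```python
# Hypothetical helper: reports which AMX-related CPU flags Linux exposes.
AMX_FLAGS = ("amx_tile", "amx_int8", "amx_bf16")


def amx_flags_from_cpuinfo(cpuinfo_text):
    """Return the AMX flags found in the text of /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            present = set(line.split(":", 1)[1].split())
            return [flag for flag in AMX_FLAGS if flag in present]
    return []


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file, returning an empty string if unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    found = amx_flags_from_cpuinfo(read_cpuinfo())
    print("AMX flags found:", ", ".join(found) if found else "none")
```

Seeing the flags only indicates that the processor and kernel advertise the feature. Whether a workload actually uses Intel AMX depends on the framework and libraries it runs on.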

Importance of AI Inference in Retail, Finance, and Healthcare

Artificial Intelligence (AI) inference refers to using a trained AI model to make predictions or decisions based on new, unseen data. This capability is transforming industries by enabling more efficient, accurate, and personalized services. The retail, finance, and healthcare sectors have all seen significant advancements and benefits from AI inference.


In retail, AI is pivotal in delivering personalized and seamless shopping experiences to customers. By leveraging machine learning algorithms and data analytics, retailers can gain valuable insights into customer preferences, behavior patterns, and purchase histories. This allows them to provide highly relevant product recommendations, targeted marketing campaigns, and tailored promotions. Additionally, AI-powered chatbots and virtual assistants offer round-the-clock customer support, addressing queries and guiding customers through their shopping journey. These personalized interactions improve customer satisfaction, foster brand loyalty, and drive sales.

AI is instrumental in streamlining retail operations, enhancing efficiency, and optimizing decision-making processes. Predictive analytics powered by AI enables accurate demand forecasting, inventory management, and dynamic pricing strategies. AI-driven supply chain optimization helps retailers minimize costs, reduce waste, and ensure timely product availability. Furthermore, AI-powered computer vision and sensor technologies aid in monitoring stock levels, automating checkout processes, and preventing theft, improving operational efficiency and profitability. By harnessing the power of AI, retailers can stay ahead of the competition, adapt to changing market dynamics, and unlock new growth opportunities.


In finance, AI inference is revolutionizing how institutions manage risk, detect fraud, and interact with customers. AI models analyze vast amounts of transaction data in real time to identify patterns that may indicate fraudulent activity. This capability allows financial institutions to quickly flag and investigate suspicious transactions, significantly reducing the risk of financial loss.

AI-driven chatbots and virtual assistants use inference to provide personalized customer service, answering queries and offering financial advice tailored to individual user profiles. This not only enhances the customer experience but also boosts operational efficiency.


In the healthcare sector, AI inference plays a critical role in enhancing patient care and operational efficiency. AI models are used to interpret complex medical data to assist in diagnosis and treatment planning. For example, AI systems can analyze X-rays and MRI scans quickly and help radiologists identify signs of diseases such as cancer or neurological disorders early on. This not only speeds up the diagnostic process but can also improve patient outcomes by enabling timely and targeted treatment interventions.

AI inference also supports personalized medicine approaches by predicting individual responses to different treatments based on a patient’s genetic makeup and medical history. AI-driven tools also help manage healthcare resources more effectively by predicting patient admission rates to optimize staffing and bed allocation.

The inference use cases that we will look at with C7i and Intel AMX are:

  • Retail Services to allow for frictionless shopping
  • Financial Services for natural language processing (NLP)
  • Augmented Healthcare to improve clinician productivity

In part 2 of the blog, we will look at these use cases with specific industry examples across retail, finance, and healthcare.


  [i] The Amazon EC2 C7i instance
  [ii] Intel Advanced Matrix Extensions (Intel AMX)


Performance varies by use, configuration and other factors. Learn more at Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure. Your costs and results may vary. Intel technologies may require enabled hardware, software or service activation.

The analysis in this document was done by VLSS and commissioned by Intel. 

About the Author
Mohan Potheri is a Cloud Solutions Architect with more than 20 years in IT infrastructure and in-depth experience in cloud architecture. He currently focuses on educating customers and partners on Intel capabilities and optimizations available on AWS. He is actively engaged with the Intel and AWS Partner communities to develop compelling solutions with Intel and AWS. He is a VMware vExpert (VCDX#98) with extensive knowledge of on-premises and hybrid cloud environments. He also has extensive experience with business-critical applications such as SAP, Oracle, SQL, and Java across UNIX, Linux, and Windows environments. Mohan Potheri is an expert on AI/ML and HPC and has spoken at multiple conferences, including VMworld, GTC, ISC, and other partner events.