Leveraging Arm CPUs for outstanding AI Inference
There is no AI without ‘thinking’, and that thinking needs more than raw math: every AI system needs a head node alongside its accelerators. Accelerators crunch the math that drives AI models, but it is CPUs that underpin the systems that turn that compute into real-world value.
The shift towards inference, which is increasingly agent-based, is redefining how AI systems operate. Agentic workloads are persistent, always-on, and power-constrained – conditions for which power-efficient CPUs are well-suited – giving CPUs a central role in AI datacenters and at the edge.
Guide to Understanding CPU Inference
This comprehensive guide provides a deep dive into processing AI workloads on CPUs and the use cases for which this may be the practical choice. Explore the industries that are already benefiting from CPU inference and learn about real-world examples.
Which AI Workloads Run Best on the CPU?
Always-on or power-constrained inference
From on-device AI use cases, such as virtual assistants and real-time translation, to workloads processed in the cloud, like generating insights from data or personalized content recommendations.
General purpose system-level compute
Leading the orchestration and control of AI systems, including coordinating accelerators as the AI head node in a heterogeneous system, synchronizing AI agents, and managing memory and data movement.
Benefits of the Arm CPU for AI
Designed for power-efficient performance at scale
The rise of agentic AI increases demand for CPUs running continuously within tight power and cost constraints. Arm-based CPUs deliver industry-leading performance per watt, enabling customers to scale core counts and deploy at scale with low total cost of ownership (TCO).
Built for the future of the AI datacenter
Arm is the only architecture enabling the world’s leading hyperscalers to custom build the CPU performance they need with lower integration risk and faster time to deployment. 50% of compute shipped to hyperscalers is now Arm-based.
Open, ecosystem-driven adoption
Arm’s approach is grounded in openness and choice, enabling a broad ecosystem to build differentiated silicon and systems on Arm CPUs. This openness lowers adoption friction and accelerates ecosystem readiness across cloud and edge environments.
Foundation for AI from Edge to Cloud
Two Decades of AI Architecture Innovation
Arm focuses on fast-paced architectural innovation that prepares our vast ecosystem for ever-changing compute requirements and the future of AI. Over two decades, Arm has consistently and proactively evolved the AI capabilities of our CPUs with features such as Neon, Helium, the Scalable Vector Extension (SVE), and the Scalable Matrix Extension 2 (SME2). The latest Armv9 architecture features drive increased compute performance alongside reduced power consumption for AI workloads.
Arm’s partnerships with the leading AI frameworks and operating systems help ensure fast and easy deployment for scaling AI workloads across Arm CPUs. We support key partners with techniques for optimizing new models using quantization, and with open-source software for AI acceleration, such as Arm Kleidi, used by frameworks and independent software vendors. Targeting acceleration at the AI framework level spreads AI performance gains on Arm CPUs most broadly, across billions of AI inference installs for workloads at the edge, on mobile, and in the cloud. Without any extra optimization effort, application developers can expect strong performance for their AI workloads by default on Arm CPUs, thanks to our work across gaming, computer vision, and language models.
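To make the quantization technique mentioned above concrete, here is a minimal, self-contained sketch of int8 affine quantization – the kind of weight transform frameworks apply before dispatching to optimized CPU kernels. The function names are illustrative only and are not part of any Arm or framework API.

```python
# Illustrative int8 affine quantization: map floats to signed 8-bit
# integers via a scale and zero point, then reconstruct approximations.
# Function names here are hypothetical, for explanation only.

def quantize(values, num_bits=8):
    """Map float values to signed int8 with a scale and zero point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Reconstruct approximate floats from the quantized integers."""
    return [(x - zero_point) * scale for x in q]

weights = [0.5, -1.2, 3.3, 0.0]
q, scale, zp = quantize(weights)
approx = dequantize(q, scale, zp)
# Each reconstructed weight differs from the original by less than
# one quantization step (the scale).
print(max(abs(a - b) for a, b in zip(weights, approx)) < scale)
```

Shrinking weights from 32-bit floats to 8-bit integers cuts memory traffic by roughly 4x, which is why quantized models map so well onto integer matrix instructions such as those added by SME2.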
Latest News and Resources
- News and Blogs
- Reports
- White Papers
- Webinar
AI Report for Enterprises
The New Frontier for Edge AI
Smaller models and accelerated compute involving real-time CPU inference are transforming AI at the edge.
AI Report for Enterprises
Arm AI Readiness Index
Our comprehensive analysis of global AI readiness reveals how business leaders across enterprises worldwide are adopting practical use cases for AI inference and turning challenges into opportunities.
AI Innovation on Arm
Silicon Reimagined in the Age of AI
How silicon is evolving to meet the demands of AI, while addressing power efficiency, security, and software challenges.
AI From Cloud to Edge
Why Software is Crucial to Achieving AI’s Full Potential
Discover why software is key to implementing AI and how to accelerate the development of performant and secure AI applications.
Generative AI on Arm
Scale Generative AI With Flexibility and Speed
The race to scale new generative AI capabilities is creating both opportunities for innovation and challenges. Learn how to overcome these challenges to successfully deploy AI everywhere.
AI on Mobile
Redefining Mobile Experiences With AI
Watch this webinar to learn about the Arm platform for AI on smartphones and laptops, including the latest Armv9 CPUs and GPUs and the benefits of running AI on-device.