Leveraging Arm CPUs for outstanding AI Inference
There is no AI without ‘thinking’, and that thinking needs more than raw math: every AI system needs a head node alongside its accelerators. Accelerators crunch the math that drives AI models, but it is CPUs that underpin the systems that turn that compute into real-world value.
The shift towards inference, which is increasingly agent-based, is redefining how AI systems operate. Agentic workloads are persistent, always-on, and power-constrained – conditions for which power-efficient CPUs are well-suited – giving CPUs a central role in AI datacenters and at the edge.
Guide to Understanding CPU Inference
This comprehensive guide provides a deep dive into processing AI workloads on CPUs and the use cases for which this may be the practical choice. Explore the industries that are already benefiting from CPU inference and learn about real-world examples.
Which AI Workloads Run Best on the CPU?
Always-on or power-constrained inference
From on-device AI use cases, such as virtual assistants and real-time translation, to workloads processed in the cloud, like generating insights from data or personalized content recommendations.
General purpose system-level compute
Leading the orchestration and control of AI systems, including coordinating accelerators as the AI head node in a heterogeneous system, synchronizing AI agents, and managing memory and data movement.
Benefits of the Arm CPU for AI
Designed for power-efficient performance at scale
The rise of agentic AI increases demand for CPUs running continuously within tight power and cost constraints. Arm-based CPUs deliver industry-leading performance per watt, enabling customers to scale core counts and deploy at scale with low total cost of ownership (TCO).
Built for the future of the AI datacenter
Arm is the only architecture enabling the world’s leading hyperscalers to custom build the CPU performance they need with lower integration risk and faster time to deployment. 50% of compute shipped to hyperscalers is now Arm-based.
Open, ecosystem-driven adoption
Arm’s approach is grounded in openness and choice, enabling a broad ecosystem to build differentiated silicon and systems on Arm CPUs. This openness lowers adoption friction and accelerates ecosystem readiness across cloud and edge environments.
Foundation for AI from Edge to Cloud
Two Decades of AI Architecture Innovation
Arm focuses on fast-paced architectural innovation that prepares our vast ecosystem for ever-changing compute requirements and the future of AI. Over two decades, Arm has consistently and proactively evolved the AI capabilities of our CPUs with features such as Neon, Helium, the Scalable Vector Extension (SVE), and the Scalable Matrix Extension 2 (SME2). The latest Armv9 architecture features drive increased compute performance alongside reduced power consumption for AI workloads.
Arm’s partnerships with the leading AI frameworks and operating systems help ensure fast and easy deployment for scaling AI workloads across Arm CPUs. We support key partners with techniques for optimizing new models using quantization, and with open-source software for AI acceleration, such as Arm Kleidi, used by frameworks and independent software vendors. Targeting acceleration at the AI framework level spreads AI performance gains on Arm CPUs most broadly, across billions of AI inference installs for workloads at the edge, on mobile, and in the cloud. Without any extra optimization effort, application developers can expect strong performance for their AI workloads by default on Arm CPUs, thanks to our work across gaming, computer vision, and language models.
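To make the quantization technique mentioned above concrete, here is a minimal, self-contained sketch of int8 affine quantization – the kind of weight transform frameworks apply before dispatching to optimized CPU kernels. The function names are illustrative only and are not part of any Arm or framework API.

```python
# Illustrative int8 affine quantization: map floats to signed 8-bit
# integers via a scale and zero point, then reconstruct approximations.
# Function names here are hypothetical, for explanation only.

def quantize(values, num_bits=8):
    """Map float values to signed int8 with a scale and zero point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Reconstruct approximate floats from the quantized integers."""
    return [(x - zero_point) * scale for x in q]

weights = [0.5, -1.2, 3.3, 0.0]
q, scale, zp = quantize(weights)
approx = dequantize(q, scale, zp)
# Each reconstructed weight differs from the original by less than
# one quantization step (the scale).
print(max(abs(a - b) for a, b in zip(weights, approx)) < scale)
```

Shrinking weights from 32-bit floats to 8-bit integers cuts memory traffic by roughly 4x, which is why quantized models map so well onto integer matrix instructions such as those added by SME2.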
Latest News and Resources
- News and Blogs
- Reports
- White Papers
- Webinar
AI Report for Enterprises
The New Frontier for Edge AI
Smaller models and accelerated compute involving real-time CPU inference are transforming AI at the edge.
AI Report for Enterprises
Arm AI Readiness Index
Our comprehensive analysis of global AI readiness reveals how business leaders across enterprises worldwide are adopting practical use cases for AI inference and turning challenges into opportunities.
AI Innovation on Arm
Silicon Reimagined in the Age of AI
How silicon is evolving to meet the demands of AI, while addressing power efficiency, security, and software challenges.
AI From Cloud to Edge
Why Software is Crucial to Achieving AI’s Full Potential
Discover why software is key to implementing AI and how to accelerate the development of performant and secure AI applications.
Generative AI on Arm
Scale Generative AI With Flexibility and Speed
The race to scale new generative AI capabilities is creating both opportunities for innovation and challenges. Learn how to overcome these challenges to successfully deploy AI everywhere.
AI on Mobile
Redefining Mobile Experiences With AI
Watch this webinar to learn about the Arm platform for AI on smartphones and laptops, including the latest Armv9 CPUs and GPUs and the benefits of running AI on-device.