Enabling Generative AI at Scale

The explosion in generative AI is only just beginning. Boston Consulting Group predicts that AI will drive an estimated threefold increase in energy demand, with generative AI alone expected to account for 1% of this, challenging today's electrical grids. Meanwhile, large language models (LLMs) will become more efficient over time, and inference deployed at scale at the edge is expected to grow exponentially. This growth has already started, and to meet the challenges ahead, the technology ecosystem is deploying generative AI on Arm.

The Future of Generative AI is Built on Arm

Efficient Code Generation Enabled by Small Language Models (SLMs)

Small language models (SLMs) offer tailored AI solutions with reduced costs, greater accessibility, and improved efficiency. They are easy to customize and control, making them ideal for a range of applications, such as content and code generation.

Best-in-Class Text Generation on Arm-Based AWS Graviton3 CPUs

Server CPUs, such as the Arm Neoverse-based AWS Graviton processors, provide a performant, cost-effective, and flexible option for developers looking to deploy smaller, more focused LLMs in their generative AI applications.
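As a rough illustration of what this looks like in practice, here is a minimal Python sketch of CPU-only inference using the open-source llama-cpp-python package (one common option, not an Arm-specific requirement). The model path, thread count, and generation parameters are placeholders to adapt to your own model and instance size.

```python
# Minimal sketch: CPU-only text generation with llama-cpp-python on an
# Arm-based server such as an AWS Graviton3 instance. The model file and
# parameter values below are illustrative placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="models/slm-q4_0.gguf",  # hypothetical 4-bit quantized SLM
    n_ctx=2048,      # context window
    n_threads=8,     # match the vCPUs available on the instance
)

output = llm(
    "Write a Python function that reverses a string.",
    max_tokens=128,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```

Quantized weights (such as the hypothetical 4-bit GGUF file above) keep the memory footprint low, which is often what makes smaller, more focused models practical on general-purpose server CPUs.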

Advancing Image and Video Generation with Multimodal LLMs

Transformer models enable AI to become multimodal, processing inputs like speech, images, and text. They are adaptable to tasks beyond image classification and object detection, enabling new and innovative vision capabilities.

Use Cases

Generative AI on Smartphones

Read Blog

Generative AI Starts with the CPU

Inference on Arm CPUs

Arm technology offers an efficient foundation for AI acceleration at scale, enabling generative AI to run on phones and PCs and in data centers. This is the result of two decades of architectural innovation in vector and matrix processing in the Arm CPU architecture.

These investments in innovation have improved accelerated AI compute, while providing security that helps protect valuable models and enabling low-friction deployment for developers.

Explore GenAI on CPU

Heterogeneous Solutions for GenAI Inference

For generative AI to scale at pace, AI must be considered at the platform level, across every compute workload.

Learn more about our leading AI compute platform, which includes our portfolio of CPUs and accelerators, such as GPUs and NPUs.

Explore AI Technologies

Software Collaboration Key for GenAI Innovation

Arm is engaged in several strategic partnerships to fuel AI-based experiences, while providing extensive software libraries and tools, and working on integration with all major operating systems and AI frameworks. Our goal is to help ensure developers can optimize without wasting valuable resources.


Seamless Acceleration for AI Workloads

Discover more about how Arm ensures seamless acceleration for every developer, every model, and every workload. Arm Kleidi makes CPU inference accessible and easy, even for the most demanding generative AI workloads.
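To make the "no code changes" point concrete, the sketch below is ordinary PyTorch CPU inference with nothing Arm-specific in it; the assumption (labeled in the comments) is that you are running an aarch64 build of a framework into which Kleidi kernels have been integrated, in which case the optimized routines are used transparently. The model and tensor shapes are arbitrary placeholders.

```python
# Sketch: unmodified framework code. Assumption: on an Arm platform whose
# framework build integrates Arm Kleidi (KleidiAI) micro-kernels, the
# matrix multiplications below dispatch to the accelerated routines
# automatically; no Arm-specific code is needed in the source.
import torch

# Arbitrary placeholder model: a small stack of linear layers.
model = torch.nn.Sequential(
    torch.nn.Linear(512, 2048),
    torch.nn.GELU(),
    torch.nn.Linear(2048, 512),
).eval()

x = torch.randn(1, 512)
with torch.inference_mode():
    y = model(x)
print(y.shape)  # torch.Size([1, 512])
```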


Run Generative AI Efficiently on Arm

Want advice on running GenAI-enhanced workloads efficiently on Arm? These resources on Hugging Face help you build, deploy, and accelerate a range of models, including large and small language models and natural language processing (NLP) models.
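As a starting point, a minimal Hugging Face transformers sketch for CPU text generation might look like the following; the model identifier is a placeholder for whichever large or small language model you pick from the Hub.

```python
# Sketch: text generation on CPU with the Hugging Face transformers
# pipeline API. The model ID is a placeholder; substitute any generative
# model from the Hugging Face Hub that fits your memory budget.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="your-org/your-small-language-model",  # placeholder model ID
    device=-1,  # -1 selects the CPU
)

result = generator(
    "Explain what a small language model is in one sentence.",
    max_new_tokens=64,
)
print(result[0]["generated_text"])
```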

Explore AI Software
