Arm + PyTorch: Accelerating AI on Arm Everywhere from Cloud to Edge

This groundbreaking collaboration between Arm and the PyTorch team at Meta is democratizing AI innovation for developers, enabling them to seamlessly integrate the newest quantized models into their applications with no additional modifications or optimizations, saving time and resources.
Check out the ExecuTorch Beta release, optimized for Arm everywhere.

Faster PyTorch Inference on Arm in the Cloud
Arm, in collaboration with our partners, enhances PyTorch’s inference performance on Arm Neoverse servers.
- Developers automatically benefit: Arm integrates performance optimizations, libraries, and microkernels directly into the PyTorch framework.
- Expanding collaborations with cloud service providers: Enabling AI developers everywhere.
- Arm is enabling the entire ML stack and workflow: By collaborating with the entire ecosystem, from ML software companies like Databricks to the largest developer platforms like GitHub, we show developers exactly how to build AI workloads on Arm CPUs.
Learn about the Arm reference implementation of a Graviton-optimized chatbot here.

Accelerating Generative AI at the Edge on Arm with ExecuTorch
The collaboration between Arm and the PyTorch team at Meta is making AI accessible to the broadest range of devices and developers.
- Arm compute platform and ExecuTorch framework: Enable smaller, optimized models for faster generative AI at the edge.
- New quantized Llama 3.2 models: Ideal for on-device and edge AI applications on Arm, providing reduced memory footprint and improved accuracy, performance and portability.
- Scale across edge devices: 20 million Arm developers can create and deploy more intelligent AI-based applications more quickly, at scale, across billions of edge devices.
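To see why quantized models reduce the memory footprint so dramatically on-device, consider a generic symmetric int8 scheme (a conceptual sketch only; the actual quantization recipe used for the Llama 3.2 releases is more sophisticated):

```python
# Generic symmetric int8 quantization sketch (illustrative only; not the
# exact scheme used for the quantized Llama 3.2 models).

def quantize_int8(weights):
    """Map float weights to int8 values plus a single scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight now occupies 1 byte instead of 4 (fp32): roughly a 4x
# memory saving, at the cost of a small rounding error per weight.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(max_err, 5))
```

The rounding error per weight is bounded by half the scale factor, which is why int8 quantization can preserve accuracy well while cutting both memory use and bandwidth, the properties the bullet points above highlight for edge deployment.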
PyTorch + Arm: Enabling LLM Integration Everywhere
Here are some resources where you can learn more about the PyTorch and Arm collaboration:
PyTorch Repositories and Developer Resources
Learning Paths:
- Run a Large Language Model (LLM) Chatbot with PyTorch Using KleidiAI on Arm Servers
- Build an Android Chat App with Llama, KleidiAI, ExecuTorch, and XNNPACK
- Optimize MLOps with Arm-Hosted GitHub Runners
- Run Llama 3 on a Raspberry Pi 5 Using ExecuTorch
- Run a Natural Language Processing (NLP) Model from Hugging Face on Arm Servers
- Accelerate Natural Language Processing (NLP) Models from Hugging Face on Arm Servers
- Create and Train a PyTorch Model for Digit Classification
- Use Keras Core with TensorFlow, PyTorch, and JAX Backends
Videos:
- AI Performance Boost with KleidiAI and PyTorch on AWS Graviton 4 Demo | PyTorch Conference 2024
- Generative AI Inference with AWS Graviton Processors | AWS AI Infrastructure Day 2024 on AWS OnAir
- Accelerating LLM Family of Models on Arm Neoverse Based Graviton AWS Processors with KleidiAI
- Edge AI Acceleration with KleidiAI and ExecuTorch Demo | PyTorch Conference 2024
- AI Tech Talk: Bringing PyTorch Models to Cortex-M
Links to PyTorch and Arm Projects:
Blogs:
- PyTorch Grows as the Dominant Open Source Framework for AI and ML: 2024 Year in Review
- Arm Newsroom: Accelerating Generative AI at the Edge on Arm with ExecuTorch Beta Release
- Unleashing the Power of AI on Mobile: LLM Inference for Llama 3.2 Quantized Models with ExecuTorch and KleidiAI
- Arm Accelerates AI From Cloud to Edge with New PyTorch and ExecuTorch Integrations to Deliver Immediate Performance Improvements for Developers
- Accelerated PyTorch Inference with torch.compile on AWS Graviton Processors
- Streamlining MLOps on Arm
- IoT Software Development with GitHub Partnership
- Run Llama3-8b on a Raspberry Pi 5 with ExecuTorch
- Getting Started with PyTorch, ExecuTorch, and Ethos-U85 in Three Easy Steps
- Arm Joins the PyTorch Foundation as a Premier Member