Build an Android Chat App with Llama, Arm KleidiAI, ExecuTorch, and XNNPACK

Sign up using the form to access the on-demand video.

In this code-along, we’ll build a ChatAI application for Android, then optimize and deploy a local large language model (LLM) directly on a mobile device. We’ll use ExecuTorch (a PyTorch framework for running AI models on edge devices), XNNPACK (a floating-point neural network library optimized for Arm), KleidiAI (Arm-optimized kernels for neural network operations), and the Llama 3.2 1B Instruct model.

You’ll learn:

  • How to set up an ExecuTorch development environment
  • How KleidiAI kernels increase neural network performance
  • How quantizing LLMs boosts inference speed
  • How to build and deploy an Android application with a local LLM and inference framework

Register for the session using the form, or explore the learning path at your own pace to start building an Android chat app today. The learning path follows the same workflow we’ll cover in the recording.

Host

Principal SW Engineer – Developer Evangelist

@mhall119 on Discord

Michael is a technology advocate and AI innovator at Arm, dedicated to empowering developers and advancing open ecosystems. He leads efforts to make machine learning tools and frameworks more accessible and efficient across platforms.