Code-Along and Q&A: Build and Run RAG Pipelines on Arm-Based Infrastructure

Join this 1-hour live code-along and Q&A where you’ll build a scalable retrieval-augmented generation (RAG) pipeline using Arm-based cloud infrastructure. Learn how to integrate a vector database, run LLMs efficiently, and serve a working RAG application—all optimized for performance on Arm CPUs.

Please note that attendees will be given access to sandbox environments.

Date: April 22, 2025
Time: 9 a.m. PT | 5 p.m. BST | 6 p.m. CEST
Length: 45 minutes (code-along) + 15 minutes (Q&A)

What you’ll build:

  • A complete RAG pipeline using Hugging Face embeddings and LangChain
  • A vector store using FAISS
  • An app served using Python and Flask, running on Arm-based instances
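To preview the core idea before the session, here is a minimal, dependency-free sketch of the retrieve-then-generate flow at the heart of a RAG pipeline. The document set, the toy bag-of-words "embedding," and the prompt template are all illustrative stand-ins: the code-along itself uses Hugging Face embedding models, a FAISS index, and LangChain in their place.

```python
# Conceptual sketch of RAG retrieval and prompt assembly.
# A real pipeline swaps in Hugging Face sentence embeddings, a FAISS
# vector index, and an LLM call; toy bag-of-words vectors keep this
# example self-contained and runnable anywhere.
import math
from collections import Counter

# Illustrative document set (placeholder for your real corpus).
DOCUMENTS = [
    "Arm-based cloud instances offer strong price-performance for inference.",
    "FAISS stores dense vectors and supports fast similarity search.",
    "LangChain chains together retrievers, prompts, and language models.",
]

def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by similarity to the query and keep the top k."""
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Assemble the augmented prompt that would be sent to the LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    print(build_prompt("How does FAISS similarity search work?"))
```

In the live session, `embed` becomes a Hugging Face embedding model, `retrieve` becomes a FAISS index lookup, and `build_prompt` feeds an LLM behind a Flask endpoint, but the data flow is exactly this: embed, retrieve, augment, generate.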

What you’ll learn:

  • How RAG architecture works and why it improves LLM performance
  • Efficient model serving on Arm using optimized runtimes

Who should join:

  • Developers and ML engineers working with LLMs and GenAI tools
  • Backend engineers building or deploying AI-driven apps
  • Anyone interested in optimizing AI workloads for cost and scale on Arm CPUs

Connect With the Experts

One week after the code-along, join an open Q&A with Arm engineers and Arm ambassadors. Bring your implementation questions, share what you’ve built, and explore advanced use cases, architecture tuning, and tooling options.

Date: April 29, 2025
Time: 9 a.m. PT | 5 p.m. BST | 6 p.m. CEST
Length: 50 minutes
