Join this 1-hour live code-along and Q&A where you’ll build a scalable retrieval-augmented generation (RAG) pipeline using Arm-based cloud infrastructure. Learn how to integrate a vector database, run LLMs efficiently, and serve a working RAG application—all optimized for performance on Arm CPUs.
Please note that we'll provide attendees with access to sandbox environments.
Date: April 22, 2025
Time: 9 a.m. PT | 5 p.m. BST | 6 p.m. CEST
Length: 45 minutes (code-along) + 15 minutes (Q&A)
What you’ll build:
- A complete RAG pipeline using Hugging Face embeddings and LangChain
- A vector store using FAISS
- A web app served with Python and Flask, running on Arm-based instances (see the sketch after this list)
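As a preview, here is a minimal sketch of how those pieces can fit together. It is illustrative only, assuming `flask`, `langchain`, `langchain-huggingface`, and `faiss-cpu` are installed: the embedding model, sample documents, and route name are placeholders, and the session's actual code may differ.

```python
# Illustrative sketch only: the model name, documents, and route below are
# placeholder assumptions, not the exact code from the session.
from flask import Flask, jsonify, request
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

# Embed documents with a Hugging Face sentence-transformer model.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"  # placeholder model
)

# Build an in-memory FAISS vector store from a couple of sample texts.
docs = [
    "Arm-based cloud instances offer strong price-performance for inference.",
    "RAG grounds LLM answers in documents retrieved from a vector store.",
]
vector_store = FAISS.from_texts(docs, embeddings)

app = Flask(__name__)

@app.post("/query")
def query():
    question = request.get_json()["question"]
    # Retrieve the documents most similar to the question.
    hits = vector_store.similarity_search(question, k=2)
    context = "\n".join(doc.page_content for doc in hits)
    # In the full pipeline this context is passed to an LLM together with the
    # question; here we simply return the retrieved context.
    return jsonify({"question": question, "context": context})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```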
What you’ll learn:
- How RAG architecture works and why it improves the quality of LLM responses
- Efficient model serving on Arm using optimized runtimes (see the sketch below)
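To make "optimized runtimes" concrete, here is one hedged example of the general idea: llama.cpp (used here via the llama-cpp-python bindings) includes Arm NEON-optimized CPU kernels, so a quantized model can be served efficiently on Arm instances. The model path and parameters are placeholders, and the session may use a different runtime.

```python
# Hedged example: llama.cpp ships Arm NEON-optimized CPU kernels; the model
# path below is a placeholder, not the model used in the code-along.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/model-q4_0.gguf",  # placeholder: any quantized GGUF model
    n_ctx=4096,    # context window size
    n_threads=8,   # set to the instance's physical core count
)

# Generate a short completion to verify the model is serving correctly.
result = llm.create_completion(
    "Explain retrieval-augmented generation in one sentence.",
    max_tokens=64,
)
print(result["choices"][0]["text"])
```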
Who should join:
- Developers and ML engineers working with LLMs and GenAI tools
- Backend engineers building or deploying AI-driven apps
- Anyone interested in optimizing AI workloads for cost and scale on Arm CPUs
Connect With the Experts
One week after the code-along, join an open Q&A with Arm engineers and Arm ambassadors. Bring your implementation questions, share what you’ve built, and explore advanced use cases, architecture tuning, and tooling options.
Date: April 29, 2025
Time: 9 a.m. PT | 5 p.m. BST | 6 p.m. CEST
Length: 50 minutes