Make your apps run 6x faster on device with SME2 acceleration
Deliver smoother, more responsive user experiences without relying on the cloud. SME2 (Scalable Matrix Extension 2) is a set of matrix-compute CPU instructions built into the latest Arm-based iOS and Android devices, giving users faster, more satisfying AI interactions (up to 6x faster) with zero development effort through Arm KleidiAI.
SME2 puts free performance wins on the table with no code changes needed
- Works automatically when SME2 hardware is detected
- Compatible with iPhone 16/17 and the vivo X300 series, with more devices coming in 2026
- Integrated with major frameworks through KleidiAI
SME2 acceleration across frameworks and devices you already use
SME2 support is already integrated into leading AI frameworks through KleidiAI. Just update to the latest version and confirm SME2 is enabled. No rewrites, no custom kernels, no added complexity.
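Before digging into framework logs, you can check whether the hardware itself exposes SME2. On Arm-based Linux and Android, the kernel advertises CPU features on the `Features` line of `/proc/cpuinfo`; the flag name `sme2` follows the Linux arm64 hwcap naming. A minimal sketch (the helper name is ours, not part of any framework):

```python
def has_sme2(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the kernel reports the 'sme2' CPU feature flag.

    On Arm Linux/Android kernels, the Features line lists hwcap names
    such as 'sve', 'sme', and 'sme2' when the CPU supports them.
    """
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.lower().startswith("features"):
                    return "sme2" in line.split(":", 1)[1].split()
    except OSError:
        pass  # file missing or unreadable: assume no SME2
    return False
```

On devices without SME2 (or on non-Arm hosts) this simply returns `False`, and KleidiAI-enabled frameworks fall back to their portable kernels.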
Arm has delivered free performance gains across Android and iOS devices, instantly improving the everyday user experience.
Supported devices:
- iPhone 16
- iPhone 17
- M4-based devices
- vivo X300 series
- More devices coming throughout 2026

Supported frameworks:
- XNNPACK
- ExecuTorch
- ONNX Runtime
- llama.cpp
- Alibaba MNN
- Alipay xNN
- Angel
- MediaPipe
- OpenCV
- Up to 6x faster AI responses and real-time app experiences
- 3x faster vision and audio processing
- 5x AI performance uplift on Arm Lumex devices
- Lower latency that keeps users in your app longer and sharpens your competitive edge
Real-world use cases accelerated by SME2
Studio-quality beats in 7 seconds
Creators, musicians, DJs, sound designers, and audio enthusiasts can generate effects and music in seconds. SME2 makes real-time audio generation accessible, efficient, and fast on mainstream devices.
- Generate 10 seconds of audio in about 7 seconds
- Reduce inference time from 15.3s to 6.6s
- Shrink model size from 5.2GB to 2.9GB
- Drop peak runtime RAM usage from 6.5GB to 3.6GB
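The figures above translate into concrete ratios. A quick back-of-the-envelope check using only the numbers quoted in the list:

```python
# Inference speedup from the quoted times
speedup = 15.3 / 6.6       # ~2.3x faster inference
# Real-time factor: seconds of audio produced per second of compute
rtf = 10 / 7               # ~1.4x real time
# Footprint reductions, as fractions of the original
size_cut = 1 - 2.9 / 5.2   # ~44% smaller model on disk
ram_cut = 1 - 3.6 / 6.5    # ~45% less peak runtime RAM

print(f"{speedup:.1f}x faster, {rtf:.1f}x real time, "
      f"{size_cut:.0%} smaller, {ram_cut:.0%} less RAM")
# prints: 2.3x faster, 1.4x real time, 44% smaller, 45% less RAM
```

In other words, generation runs faster than real time: each second of compute yields more than a second of audio.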
AI performance is optimized on Arm
Standards-based AI optimization with Arm KleidiAI
KleidiAI is Arm’s standards-driven operator-optimization library, designed to accelerate AI workloads automatically. When SME2-capable hardware is detected, it routes matrix-heavy kernels through SME2 for significant speedups, without code modifications.
Already integrated into major frameworks, KleidiAI delivers optimized implementations for Arm's newest instruction sets. This improves inference latency and efficiency while keeping models portable across the global Arm device ecosystem.
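The routing described above is, at heart, ordinary capability-based dispatch: detect the hardware once, then bind each operator to the fastest available kernel. A minimal sketch of that general pattern in pure Python (the capability check and both kernels are stand-ins for illustration; this is not KleidiAI's actual API):

```python
def detect_sme2() -> bool:
    # Stand-in: a real library queries kernel hwcaps or sysctls here.
    return False

def matmul_reference(a, b):
    """Portable fallback: plain triple-loop matrix multiply."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            out[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return out

def matmul_sme2(a, b):
    """Stand-in for an SME2-tiled kernel: same contract, same results,
    but executed on the matrix engine when real hardware is present."""
    return matmul_reference(a, b)

# Dispatch once at load time; callers never see which kernel runs.
matmul = matmul_sme2 if detect_sme2() else matmul_reference
```

Because both kernels honor the same contract, `matmul([[1, 2]], [[3], [4]])` returns `[[11]]` either way; only the execution engine (and the speed) differs.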
Efficient Arm CPU platform built for AI performance
The newest Arm C1 CPU architecture combines SME2 matrix compute with KleidiAI operator-level optimization to deliver up to 6x faster and 2x more efficient AI inference. Running models directly on the CPU gives developers consistent, low-latency performance without depending on or saturating GPU and NPU pipelines.
The result is real-time responsiveness and fluid user experiences across the full range of Arm-based devices.
FAQs
Do I need to change my code to use SME2?
No, KleidiAI does the work for you. Just update your existing AI framework and SME2 acceleration activates automatically on supported devices.
How do I know if SME2 is enabled?
Run your model and check your framework logs. If SME2 hardware is detected, KleidiAI routes matrix-heavy operations to it automatically for faster inference and lower latency.
Which devices support SME2?
SME2 is already live on the latest Armv9.3+ CPUs, including the iPhone 16 and 17 and the vivo X300 series, with more Arm-based devices rolling out through 2026.
What kind of performance gains can I expect?
Depending on workload, SME2 delivers up to 6x faster inference, 3x better vision and audio throughput, and 2x higher energy efficiency — all without a single line of code change.
How do I get started?
Update your framework, confirm SME2 is active, and explore KleidiAI quick-start guides and benchmarks on developer.arm.com. You’ll see the speed difference instantly.
Stay connected
Subscribe to keep up with the latest news, case studies, and insights.