Embedding AI Voice Control for Automobiles and More
Sensory Inc. is a leading provider of embedded AI technologies, including voice control in automobiles and consumer electronic products. The company recently demonstrated its TrulyNatural and TrulyHandsfree speech recognition and natural language understanding (NLU) platforms running on the Arm Cortex-M55 processor and Arm Ethos-U55 neural processing unit (NPU). This enables advanced wake word, voice control, and NLU capabilities on low-power microcontrollers designed for consumer appliances.
Complex deep learning models run quickly and efficiently for speech recognition.
Memory footprint of a microwave language model reduced from 17 MB to 1.6 MB.
Customizable, accurate speech recognition within the constraints of consumer devices.
Creating an Embedded Voice Assistant
Sensory used its VoiceHub tool to rapidly build a compact wake word and large vocabulary language model tailored for appliance voice control. The TrulyNatural speech recognition and NLU engine was optimized to run efficiently on the Alif Semiconductor Ensemble E3 SoC, which is powered by Arm Cortex-M55 and Arm Ethos-U55. All data stays on device, helping to ensure absolute data privacy for system users. The model can run effectively on the Cortex-M55 alone, but Sensory wanted to increase efficiency by pairing it with the Ethos-U55. Key optimizations included:
- Partitioning workloads across the Cortex-M55 and Ethos-U55 to maximize throughput. Speech recognition runs primarily on the Ethos-U55, with the Cortex-M55 handling text processing and intent determination.
- Accelerating machine learning functions within the Cortex-M55 with Arm Helium to reduce power consumption for always-on wake word detection.
- Optimizing memory usage and network architectures to minimize bandwidth needs for the Cortex-M55 core.
- Reducing model size using Ethos-U55 native support for INT8 quantization and sparsity. This was critical for fitting into the memory-constrained environment of a microcontroller.
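The workload split in the first optimization can be pictured as a two-stage pipeline. The following is only a minimal sketch: the `Stage` type, the stub recognizer, and the keyword-based intent rule are hypothetical stand-ins, not Sensory's API or the TrulyNatural implementation. It shows only which processor the article assigns to each kind of work.

```python
# Illustrative sketch of the stage-to-processor split described in the article.
# Stage, recognize_speech, and determine_intent are hypothetical names.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    name: str
    unit: str  # processor the article assigns this work to
    run: Callable[[object], object]


def recognize_speech(audio_frames):
    # Stand-in for the neural speech recognizer. A real system would run a
    # quantized model here; this stub returns a fixed transcript.
    return "heat for two minutes"


NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "five": 5, "ten": 10}


def determine_intent(transcript):
    # Toy text processing: pick out an action word and a duration in minutes.
    words = transcript.lower().split()
    action = "heat" if "heat" in words else "unknown"
    minutes = next((NUMBER_WORDS[w] for w in words if w in NUMBER_WORDS), 0)
    return {"action": action, "seconds": minutes * 60}


PIPELINE = [
    Stage("speech recognition", "Ethos-U55", recognize_speech),
    Stage("intent determination", "Cortex-M55", determine_intent),
]


def run_pipeline(audio_frames):
    # Pass the output of each stage to the next one, in order.
    data = audio_frames
    for stage in PIPELINE:
        data = stage.run(data)
    return data


intent = run_pipeline(audio_frames=[])
```

In this sketch the heavy neural workload is tagged for the NPU, while the lightweight string handling is tagged for the CPU core, mirroring the division of labor the article describes.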
These software optimizations allowed the advanced TrulyNatural engine to run smoothly on the Arm Cortex-M55 and Ethos-U55 accelerator on Alif’s Ensemble E3 platform. The solution provides customizable, accurate speech recognition and understanding, while adhering to the tight performance constraints of consumer devices.
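To make the INT8 quantization step more concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an illustration under stated assumptions, not Sensory's or Arm's actual toolchain: the weight values are made up, and the article does not say which quantization scheme was used.

```python
# Minimal sketch of symmetric per-tensor INT8 quantization (illustrative only).

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.50, -1.27, 0.03, 0.91, -0.64]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)

# Storage cost: 4 bytes per float32 weight versus 1 byte per INT8 weight.
float32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Storing each weight in one byte instead of four cuts storage by a factor of four, assuming a float32 baseline, which the article does not state. The reported drop from 17 MB to 1.6 MB is larger than that, which is consistent with the article crediting sparsity and network architecture changes alongside quantization.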
- Sensory delivers embedded voice assistants using Arm-based processors for offline, private control.
- Arm Cortex-M55 and Ethos-U55 powered speech models with no need for cloud connectivity.
- Arm quantization and Helium optimizations reduced model size from 17 MB to 1.6 MB.
- Sensory used Arm tools to partition workloads and improve speed, power, and responsiveness.
- Arm enabled Sensory to deliver fast, private voice control within microcontroller constraints.