The advancement of artificial intelligence (AI) has been a great enabler for the Internet of Things (IoT). Given the ability to think for itself, the IoT has shrugged off its original definition as a network of tiny sensors and grown to incorporate a host of more intelligent AIoT (AI+IoT) devices, from smartphones all the way up to autonomous vehicles.
AI has also paved the way for new IoT device categories. Previously passive devices such as surveillance cameras are being transformed into highly valuable IoT video sensors, as cloud-based AI algorithms turn raw footage into structured data that’s ripe for inference.
However, the nature of systems like video sensors means they’re extremely data-heavy. Consider that the Internet today is volumetrically a streaming video service first and foremost: 77 per cent of the world’s downstream bandwidth is taken up with video data, and 15 per cent of that is Netflix streaming alone.
Yet with each video sensor generating potentially thousands of gigabytes of data per day, the downstream data tide is likely to turn as a tsunami of IoT data floods upstream. And as a result, the global network infrastructure that’s long been optimized for downstream distribution is becoming brittle.
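To put a rough number on that volume, the sketch below estimates the daily output of a single camera. The resolution, frame rate and color depth are illustrative assumptions rather than figures from this article, and the calculation covers raw, uncompressed footage only; a compressed stream would be far smaller.

```python
def raw_video_gb_per_day(width, height, fps, bytes_per_pixel=3):
    """Estimate the daily volume of uncompressed video in decimal gigabytes."""
    bytes_per_second = width * height * bytes_per_pixel * fps
    seconds_per_day = 24 * 60 * 60
    return bytes_per_second * seconds_per_day / 1e9


# Assumed example: one 1080p camera at 30 frames per second with 24-bit color.
daily_gb = raw_video_gb_per_day(1920, 1080, 30)
print(f"{daily_gb:,.0f} GB per day")  # prints "16,124 GB per day"
```

Under those assumptions, one sensor produces roughly 16,000 GB of raw footage a day, which is why processing that footage before it leaves the device matters so much.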
In AI we trust
AI and its hunger for data might be the root cause of this new problem, but it’s also going to become a significant part of the solution by enabling intelligence throughout the infrastructure stack.
It’s likely the majority of AI heavy lifting will always be performed in the cloud due to the concentration of compute – especially when it comes to training machine learning (ML) algorithms on historic data. But when it comes to applying that training to real-time inference, decision systems live or die on how quickly decisions can be made. And when data has to travel thousands of miles to a datacenter, there’s no guarantee that by the time it has been received, computed and responded to it will still be useful.
Applications such as safety-critical automation, vehicle autonomy, medical imaging and manufacturing all demand a near-instant response to data that’s mere milliseconds old – and the latency introduced in asking the cloud to process that weight of data on a global scale would likely reduce its value to zero.
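A back-of-envelope sketch shows why distance alone eats into a millisecond budget. It assumes signals travel through optical fiber at roughly 200,000 km per second (about two thirds of the speed of light in a vacuum) and uses hypothetical distances of 2,000 miles to a datacenter and 1 mile to a nearby server. It ignores routing, queuing, radio access and compute time, all of which add further delay.

```python
FIBER_SPEED_KM_PER_S = 200_000  # assumption: approximate signal speed in optical fiber
KM_PER_MILE = 1.609344


def round_trip_ms(distance_miles, processing_ms=0.0):
    """Propagation delay there and back, plus any processing time, in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_ms = distance_km / FIBER_SPEED_KM_PER_S * 1000
    return 2 * one_way_ms + processing_ms


# Hypothetical distances chosen purely for illustration.
for label, miles in [("distant datacenter", 2000), ("nearby server", 1)]:
    print(f"{label}: {round_trip_ms(miles):.2f} ms round trip")
```

Even in this idealized model, the 2,000-mile round trip costs about 32 ms before a single computation is performed, while the one-mile round trip is a small fraction of a millisecond.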
If time is the enemy of real-time inference, it makes sense that we move the intelligence closer to where the data is created. And you can’t get much closer than on the device itself.
Endpoint AI, aka AIoT
Due to their powerful internal hardware, smartphones have long been a fertile test-bed for endpoint AI. A smartphone camera is a prime example: it’s gone from something that takes grainy selfies to being secure enough for biometric authentication and powerful enough for computational photography – adding background blur (or a pair of bunny ears) to selfies in real time.
This technology is finding its way naturally into other IoT devices: Arm is already exploring how AIoT video sensors might use on-device intelligence to identify people within a room in private working spaces. That same technology is certainly powerful enough to enable an image sensor to infer ALPR (automated license plate recognition) data or plot a shopper’s journey around a retail store.
Yet endpoint AI also has its limitations. With the probable exception of autonomous vehicles, the highly compute-intensive training part of ML simply isn’t possible in endpoint devices due to processing and storage restrictions. And data collected by one endpoint device has limited value on its own, no matter how capable that device is at AI-powered analysis. The data still needs to be sent somewhere powerful enough to perform complex AI and with access to other data streams which can be combined to infer meaning.
Which brings us back to the cloud. Or does it? Where else in the network might be capable of the compute required to make complex decisions about data?
One step closer to the AI Edge
In what seems like the blink of an eye, basic devices such as network bridges and switches have given way to powerful edge servers that add datacenter-level hardware into the gateway between endpoint and cloud. If you need any more evidence of the drive to democratize compute you only need to look at the numbers of edge servers now being shipped.
And by happy coincidence (or careful advance planning, depending on who you ask), the powerful new edge servers making their way into 5G base stations are more than powerful enough to perform sophisticated AI processing – not only ML inference but training, too.
We’re calling this the AI Edge. With 3G or 4G it really wasn’t worth enabling a base station to think for itself because the latency was too great, but 5G reduces that latency to practically zero. And because the 5G specification calls for a far greater proliferation of base stations than previous generations of mobile infrastructure, in many cases the edge server hardware sitting at the base of those cell towers will come almost within touching distance of the endpoint device. Latency becomes practically a non-issue for most applications.
At the edge, AI is set to play a dual role. At a network level, it could be used to analyze the flow of data for network prediction and network function management – intelligently distributing that data to wherever it makes most sense at the time, whether that’s the cloud or elsewhere.
But given that edge servers are such capable intelligent thinkers, doesn’t it make sense to put them to work in training and inference on the data itself? The benefits of enabling the AI edge to delve into, and make decisions over, data are abundantly clear: significantly reduced backhaul to the cloud, negligible latency and improved security, reliability and efficiency across the board.
There’s also a data locality interest in moving sensitive data no further than the edge: The more data we move to a centralized location, the more opportunities arise for that data’s integrity to be compromised. The IoT is already struggling to shake its ‘big brother’ image – a large-scale leak of personal video data is the last thing it needs in establishing consumer trust.
Of course, there’s even greater security benefit to that data never leaving the endpoint – but unlike endpoint devices, AI edge servers are not restricted to juggling their own data sets. The most successful AI edge algorithms will combine any number of external sources to create complex pictures of a process, environment or situation.
This is likely to be most effective when we combine layers of AI. An endpoint ALPR camera might possess the AI prowess to pick out license plate numbers from its own data in real time. It would then only need to send the inference of that data – the plates themselves – to the AI edge.
From here, an AI edge server might combine the license plate data from hundreds of ALPR cameras in a given area and compare it with a database of stolen vehicles in order to track a car as it moves across the state, predict the vehicle’s trajectory and suggest roadblock locations to law enforcement.
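The layered ALPR arrangement can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a real ALPR system: the names PlateSighting, endpoint_infer and edge_track_stolen are hypothetical, the plate recognition itself is stubbed out as pre-labeled frames, and the trajectory prediction and roadblock suggestions are left out entirely.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlateSighting:
    camera_id: str
    plate: str
    timestamp: float  # seconds on a clock shared by all cameras


def endpoint_infer(camera_id, frames):
    """Endpoint layer: stand-in for on-device plate recognition.

    Each hypothetical frame is a (timestamp, plate_or_None) pair. Only the
    inferred plates leave the device; the footage itself never does.
    """
    return [PlateSighting(camera_id, plate, ts) for ts, plate in frames if plate]


def edge_track_stolen(sightings, stolen_plates):
    """Edge layer: merge sightings from many cameras and keep stolen plates."""
    tracks = defaultdict(list)
    for s in sorted(sightings, key=lambda s: s.timestamp):
        if s.plate in stolen_plates:
            tracks[s.plate].append((s.timestamp, s.camera_id))
    return dict(tracks)


# Made-up data: two cameras and a one-entry stolen-vehicle list.
cam_a = endpoint_infer("cam-a", [(10.0, "ABC123"), (11.0, None), (12.0, "XYZ789")])
cam_b = endpoint_infer("cam-b", [(40.0, "XYZ789"), (41.0, "DEF456")])
tracks = edge_track_stolen(cam_a + cam_b, stolen_plates={"XYZ789"})
print(tracks)  # prints {'XYZ789': [(12.0, 'cam-a'), (40.0, 'cam-b')]}
```

The design point is the split of responsibilities: each camera reduces its own footage to a handful of small records, and only the edge layer, which sees every camera, can order those records into a track.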
And an AI edge server situated in one of a trucking company’s logistics hubs might combine pre-computed driver and image sensor data from traffic cameras with weather data from nearby weather stations in real time in order to amend routes and predict delivery times. All of that could happen on the AI edge server itself – with only the updated schedules uploaded to the cloud.
A greater opportunity than ever
What does this mean for the industry? Certainly, the days of companies limiting their IT processes to the hardware contained within four walls are numbered. IT professionals will find themselves much more involved in operational efficiency. As heterogeneous compute becomes ubiquitous throughout infrastructure, the skill will be in identifying where it makes most sense to process data, handing off different roles to different layers of AI where necessary, in order to gain the kind of overall insight that drives real business transformation.
As we move towards a world of one trillion IoT devices, we’re facing an infrastructural and architectural challenge greater than ever before, and the technology we need to meet it is constantly evolving. Arm’s perspective is driven by the use cases that we are already helping to define and empower using Arm technology, and I look forward to sharing some of these success stories throughout 2020.