What is AI Inference?

AI inference is achieved through an “inference engine” that applies logical rules to a knowledge base to evaluate and analyze new information. Machine learning has two phases. The first is the training phase, in which intelligence is developed by recording, storing, and labeling information. If, for example, you are training a machine to identify cars, the machine-learning algorithm is fed many labeled images of different cars that the machine can later refer to. The second is the inference phase, in which the machine uses the intelligence gathered and stored during training to understand new data. In this phase, the machine can identify and categorize new images as “cars” despite never having seen them before. In more complex scenarios, inference can be used to augment human decision-making.
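The two phases can be sketched in a few lines of Python. This is only an illustration, not how any particular inference engine works: it assumes a very simple nearest-centroid classifier, and the function names (train, infer), the two features (wheel count and length), and the sample data are all hypothetical.

```python
# Minimal sketch: training phase vs. inference phase with a nearest-centroid classifier.
# The features, labels, and values below are made-up illustrative data.
from math import dist


def train(examples):
    """Training phase: average the labeled feature vectors of each class into a centroid."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def infer(model, features):
    """Inference phase: label a new input by its closest learned centroid."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical features: (number of wheels, length in metres)
training_data = [
    ((4, 4.5), "car"),
    ((4, 4.2), "car"),
    ((2, 1.8), "bicycle"),
    ((2, 1.7), "bicycle"),
]

model = train(training_data)    # done once, ahead of time
print(infer(model, (4, 4.8)))   # an input that was not in the training data
```

Training runs once and produces a compact model. Inference then reuses that model for every new input, which is why the two phases can run on different hardware.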

Why is AI Inference Important?

AI inference is an essential component of artificial intelligence. Without inference, a machine could not apply what it has learned to new data. While machine learning can run on any type of processor, the specific computing capabilities required have become increasingly important. Whether the focus is on highly complex computing, high performance, or high efficiency, computing architectures such as Arm CPUs, GPUs, and NPUs, each tailored to specific workload requirements, are available. To maximize reliability, energy efficiency, and privacy, and to minimize latency, AI inference is increasingly performed at the point and time the data is sensed, captured, or created: at the “edge of input.”


Arm experts discuss what AI inference is and how it turns AI training into actionable inference to augment human decision-making, from the cloud to the edge and to endpoints.
