What is a Convolutional Neural Network (CNN)?
A Convolutional Neural Network (CNN) is a specialized feedforward artificial neural network that efficiently learns spatial or temporal patterns through convolutional filters. It excels at extracting hierarchical features from grid-structured data, especially images, by leveraging parameter sharing and local connectivity.
Why It’s Important
- Efficient parameter usage: Weight sharing and sparse connectivity mean CNNs manage high-dimensional data with fewer parameters
- State-of-the-art in vision: They dominate deep learning approaches in computer vision and image processing, with growing applications in NLP, speech, and time-series data
- Versatile across domains: Beyond image recognition, CNNs are used in video analysis, NLP, medical imaging, recommender systems, and more
- Relevant for edge/IoT: Efficient CNN deployment on power-sensitive Arm CPUs/NPUs (e.g., Cortex-M) enables AI at the edge, such as real-time image recognition
How It Works
- Convolution operation: Kernels traverse the input, computing dot products at each local receptive field; applied across multiple filters, they yield a stack of feature maps
- Striding and padding: The stride defines the step size of filter movement, while zero padding ("valid", "same", or "full") adjusts output dimensions
- Feature hierarchy: Early layers capture basic patterns (edges, textures); deeper layers detect complex structures (objects, parts)
- Pooling: Reduces feature map resolution, improving computational efficiency and mitigating overfitting
- Flattening + FC layers: Convert 2D feature maps into 1D vectors, enabling global context to drive classification, typically via softmax output
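The convolution, stride, and padding mechanics above can be sketched in plain NumPy. This is a minimal illustration, not an optimized implementation; the function name `conv2d` and the example kernel are assumptions for the sketch.

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    """Slide one kernel over a 2D input, computing a dot product at each
    local receptive field (a "valid" convolution after optional zero padding)."""
    if padding:
        image = np.pad(image, padding)  # zero padding on all sides
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # dot product with the receptive field
    return out

# A 3x3 vertical-edge kernel on a 5x5 input; padding=1 gives "same" output size
img = np.arange(25, dtype=float).reshape(5, 5)
edge = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float)
fmap = conv2d(img, edge, stride=1, padding=1)
print(fmap.shape)  # (5, 5)
```

In a real CNN this loop runs for many filters in parallel, producing a stack of feature maps; frameworks implement it with highly optimized matrix routines rather than explicit Python loops.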
Models are trained via backpropagation and gradient descent to adjust convolutional filters and dense layer weights.
Key Components or Features
- Convolutional layers (Conv): slide learnable filters (kernels) across input to generate feature maps. Filters share weights, reducing parameter count significantly (e.g., a shared 5×5 kernel needs only 25 weights, where a fully-connected mapping from a 100×100 input would need 10,000 per output unit)
- Pooling layers: downsample feature maps using operations like max or average pooling to reduce spatial size and overfitting
- Activation functions: introduce nonlinearity, e.g., ReLU (for convolutional layers) or softmax (for classification tasks in fully-connected layers)
- Fully-connected (FC) layers: flatten prior outputs and connect every neuron to perform the final classification (or regression) task
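Putting the components above together, a single forward pass can be sketched end to end in NumPy. The shapes here (an 8×8 input, one 3×3 filter, 2×2 pooling, 3 output classes) are arbitrary choices for illustration, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.standard_normal((8, 8))        # a single-channel 8x8 input
kernel = rng.standard_normal((3, 3))   # one learnable 3x3 filter (9 shared weights)

# Convolutional layer ("valid"): 6x6 feature map
fm = np.array([[np.sum(x[i:i+3, j:j+3] * kernel) for j in range(6)]
               for i in range(6)])

# ReLU activation: introduce nonlinearity
fm = np.maximum(fm, 0.0)

# 2x2 max pooling, stride 2: downsample 6x6 -> 3x3
pooled = fm.reshape(3, 2, 3, 2).max(axis=(1, 3))

# Flatten + fully-connected layer: map 9 features to 3 class scores
w = rng.standard_normal((3, pooled.size))
scores = w @ pooled.ravel()

# Softmax output: normalized class probabilities
probs = np.exp(scores - scores.max())
probs /= probs.sum()
```

Training would then compare `probs` to a label and backpropagate gradients into both `w` and `kernel`, as described above.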
FAQs
How do convolutional filters reduce the number of weights?
Filters are small and reused across the whole input, drastically reducing parameters versus fully connected designs.
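The saving is easy to count. Using the 5×5 kernel and 100×100 input figures mentioned above:

```python
# Fully-connected: every output unit gets its own weight per input pixel
in_pixels = 100 * 100
fc_weights_per_unit = in_pixels    # 10,000 weights for a single output unit

# Convolutional: one 5x5 kernel is reused at every spatial position
conv_weights = 5 * 5               # 25 shared weights, regardless of input size

print(fc_weights_per_unit, conv_weights)  # 10000 25
```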
What’s the difference between stride and padding?
Stride determines how far filters move between applications; padding adds zeros around inputs to control output size (“valid”, “same”, “full”).
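The interaction between stride and padding is captured by one formula: output size = floor((n + 2p − k) / s) + 1, for input size n, kernel size k, padding p, and stride s. A quick sketch (the helper name is an assumption):

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Spatial size of a conv output: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1

# "valid" (p=0) shrinks the map; "same" (p=(k-1)//2, stride 1) preserves it
print(conv_output_size(28, 3))                       # 26
print(conv_output_size(28, 3, padding=1))            # 28
print(conv_output_size(28, 3, stride=2, padding=1))  # 14
```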
Why use pooling layers?
Pooling reduces spatial dimensions, lowers computation and memory use, and helps prevent overfitting.
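For instance, 2×2 max pooling with stride 2 keeps only the largest value in each window, quartering the number of activations:

```python
import numpy as np

fm = np.arange(16, dtype=float).reshape(4, 4)  # a 4x4 feature map

# 2x2 max pooling, stride 2: one maximum per non-overlapping window
pooled = fm.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[ 5.  7.]
#  [13. 15.]]
```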
Can CNNs be used beyond image data?
Yes, CNNs have been applied successfully to text (NLP), audio, time-series, medical diagnostics, and more.
Why use ReLU instead of sigmoid?
ReLU accelerates training and mitigates vanishing gradients by retaining positive activations while zeroing negative values.
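The two activations compare directly in code:

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])

relu = np.maximum(x, 0.0)           # zeroes negatives, passes positives through
sigmoid = 1.0 / (1.0 + np.exp(-x))  # squashes everything into (0, 1)

# ReLU's gradient is 1 for any positive input, so it does not shrink with depth;
# sigmoid's gradient s*(1-s) is at most 0.25 and decays for large |x|, which is
# what drives vanishing gradients in deep stacks.
print(relu)  # [0.  0.  0.  1.5 3. ]
```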
Relevant Resources
Get the guide on mixing numeric representations to optimize CNNs for low-power devices.
Learn how to build efficient neural network apps on embedded devices in this webinar.
Download the Arm open-source toolkit for deploying convolutional neural networks on power-efficient devices.
Related Topics
- Deep Learning: Foundations and learning paradigms using multilayer neural networks.
- Edge AI: Deploying CNN models efficiently on resource-constrained hardware.
- Recurrent Neural Network (RNN): A neural-network architecture that maintains internal memory by looping over sequence data, suited for speech or time-series analysis.