Convolutional Neural Network (CNN)
A Convolutional Neural Network (CNN) is a type of deep learning model specifically designed to process data with a grid-like structure, such as images. CNNs automatically learn to detect features like edges, textures, and shapes through training, making them highly effective for visual recognition tasks.
In additive manufacturing, CNNs are used for defect detection (identifying flaws in printed parts from camera images), surface quality assessment, and topology optimization (predicting optimal material distributions).
How CNNs Work
Unlike traditional neural networks that treat input as a flat list of numbers, CNNs preserve the spatial structure of images. They work by sliding small filters (also called kernels) across the image to detect local patterns.
The key insight is that visual features like edges or corners look the same regardless of where they appear in an image. Because the same filter is slid across the entire image, a CNN responds to a pattern wherever it occurs, a property called translation equivariance; combined with pooling, this makes the network approximately invariant to small translations.
Input Image Filter (3x3) Feature Map
┌───────────┐ ┌───────┐ ┌─────────┐
│ . . . . . │ │ 1 0 -1│ │ │
│ . █ █ . . │ * │ 1 0 -1│ = │ █ . . │
│ . █ █ . . │ │ 1 0 -1│ │ █ . . │
│ . . . . . │ └───────┘ │ │
└───────────┘ (edge detector) └─────────┘
A filter detects vertical edges by finding areas where pixel values change horizontally
Layer Types
Convolutional Layer
The core building block. Each convolutional layer applies multiple filters to the input, producing multiple feature maps. Early layers detect simple features (edges, colors); deeper layers combine these into complex patterns (shapes, objects).
Output(i, j) = Σ_m Σ_n Input(i + m, j + n) × Filter(m, n) + bias
The filter slides across the input; at each position it computes the element-wise products and sums them (strictly speaking this is cross-correlation, which is the operation deep learning frameworks implement under the name "convolution")
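The formula above can be sketched directly in NumPy. This is a minimal "valid" convolution (no padding, stride 1), applied with the vertical-edge filter and the 5×5 image from the diagram earlier; the function name `conv2d` is an illustrative choice, not a library API.

```python
import numpy as np

def conv2d(image, kernel, bias=0.0):
    """Slide the kernel over the image; at each position, multiply
    element-wise and sum (the formula Output(i,j) = ΣΣ Input*Filter + bias)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel) + bias
    return out

# The vertical-edge filter from the diagram above
edge_filter = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])

# A 5x5 image with a bright 2x2 block, as in the diagram
image = np.zeros((5, 5))
image[1:3, 1:3] = 1.0

feature_map = conv2d(image, edge_filter)  # 5x5 input -> 3x3 feature map
```

The feature map responds positively at the block's left edge and negatively at its right edge, exactly the "pixel values change horizontally" behavior described in the diagram caption.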
Pooling Layer
Reduces the spatial size of feature maps, making the network more efficient and less sensitive to small shifts. Max pooling takes the maximum value in each region; average pooling takes the mean.
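A sketch of 2×2 max pooling in NumPy (the helper name `max_pool` is illustrative); swapping `window.max()` for `window.mean()` would give average pooling.

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Keep the largest value in each window, halving the
    spatial dimensions for the default 2x2 window and stride 2."""
    H, W = feature_map.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

fm = np.array([[1., 3., 2., 0.],
               [4., 2., 1., 1.],
               [0., 1., 5., 2.],
               [2., 0., 1., 3.]])
pooled = max_pool(fm)  # 4x4 -> 2x2: [[4, 2], [2, 5]]
```

Because only the maximum in each window survives, shifting a feature by one pixel often leaves the pooled output unchanged, which is what makes the network less sensitive to small shifts.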
Fully Connected Layer
After several convolution and pooling layers, the output is flattened and passed through one or more fully connected layers (like a traditional neural network) to produce the final prediction.
| Layer Type | Purpose | Output |
|---|---|---|
| Convolutional | Detect local features using filters | Feature maps |
| Activation (ReLU) | Add non-linearity | Same size, negative values → 0 |
| Pooling | Reduce spatial dimensions | Smaller feature maps |
| Fully Connected | Combine features for classification | Class probabilities |
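The table above can be traced end to end with a toy forward pass in NumPy. All sizes and the random filters here are arbitrary choices for illustration, not a real architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, k):
    # Valid convolution: slide kernel k over x, multiply and sum
    h, w = x.shape[0] - k.shape[0] + 1, x.shape[1] - k.shape[1] + 1
    return np.array([[np.sum(x[i:i + k.shape[0], j:j + k.shape[1]] * k)
                      for j in range(w)] for i in range(h)])

def max_pool2(x):
    # 2x2 max pooling via reshape: (2h, 2w) -> (h, w)
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).max(axis=(1, 3))

image = rng.random((6, 6))                 # toy grayscale input
kernels = rng.standard_normal((4, 3, 3))   # 4 random 3x3 filters

# Convolutional layer: one feature map per filter (6x6 -> 4x4 each)
feature_maps = np.stack([conv2d(image, k) for k in kernels])
# Activation (ReLU): same size, negative values -> 0
activated = np.maximum(feature_maps, 0)
# Pooling layer: each 4x4 map -> 2x2
pooled = np.stack([max_pool2(f) for f in activated])
# Fully connected layer: flatten (4 * 2 * 2 = 16) and project to 3 classes
flat = pooled.reshape(-1)
W, b = rng.standard_normal((3, 16)), np.zeros(3)
scores = W @ flat + b
probs = np.exp(scores - scores.max())
probs /= probs.sum()                       # softmax -> class probabilities
```

Each stage matches a row of the table: feature maps out of the convolution, same-sized non-negative maps out of ReLU, smaller maps out of pooling, and class probabilities out of the fully connected layer.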
Common Architectures
LeNet-5 (1998)
One of the first successful CNNs, designed by Yann LeCun for handwritten digit recognition. Simple architecture with two convolutional layers.
AlexNet (2012)
Won the ImageNet competition, sparking the deep learning revolution. Introduced ReLU activation and dropout regularization. 8 layers, 60 million parameters.
VGG (2014)
Showed that deeper networks with small (3×3) filters outperform shallower networks with larger filters. VGG-16 has 16 layers.
ResNet (2015)
Introduced "skip connections" that allow training very deep networks (50-152+ layers) without degradation. Winner of ImageNet 2015.
U-Net (2015)
Designed for image segmentation (labeling each pixel). Uses an encoder-decoder structure. Widely used in topology optimization for predicting material distributions.
Applications in Additive Manufacturing
Defect Detection
CNNs analyze images from cameras monitoring the print process, detecting layer defects, warping, delamination, and surface irregularities in real time. Studies report 86-90% accuracy for common defect types.
Topology Optimization
U-Net and similar architectures predict optimal material distributions from boundary conditions and loads. Instead of running iterative FEA simulations that take hours, a trained CNN produces comparable results in seconds, with reported accuracies around 99% relative to the iterative solution.
Surface Quality Assessment
CNNs classify surface roughness levels from microscope images, enabling automated quality grading without manual measurement.
Training a CNN
Training a CNN involves:
- Data collection: Gather labeled images (e.g., defective vs. good parts)
- Data augmentation: Artificially expand dataset by rotating, flipping, scaling images
- Forward pass: Input flows through layers to produce prediction
- Loss calculation: Compare prediction to true label (e.g., cross-entropy loss)
- Backpropagation: Calculate gradients of loss with respect to all weights
- Weight update: Adjust weights using optimizer (e.g., Adam, SGD)
- Repeat: Process many batches over multiple epochs
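Steps 3-7 of the loop above can be sketched in NumPy. To keep the example short, the convolutional and pooling layers are omitted and a single fully connected softmax classifier stands in for the whole network, trained on synthetic two-class data (the data and all hyperparameters here are arbitrary stand-ins).

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for data collection: 8 "flattened image" features per sample,
# with class-1 samples shifted away from class-0 samples
X = np.vstack([rng.standard_normal((50, 8)),
               rng.standard_normal((50, 8)) + 2.0])
y = np.array([0] * 50 + [1] * 50)

W, b = np.zeros((8, 2)), np.zeros(2)   # weights of one fully connected layer
lr = 0.1                                # learning rate

for epoch in range(200):                # repeat over multiple epochs
    # Forward pass: class scores -> softmax probabilities
    scores = X @ W + b
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    # Loss calculation: cross-entropy against the true labels
    loss = -np.log(probs[np.arange(len(y)), y]).mean()
    # Backpropagation: gradient of the loss w.r.t. W and b
    grad = probs.copy()
    grad[np.arange(len(y)), y] -= 1
    grad /= len(y)
    dW, db = X.T @ grad, grad.sum(axis=0)
    # Weight update: plain SGD (Adam would add adaptive step sizes)
    W -= lr * dW
    b -= lr * db

accuracy = ((X @ W + b).argmax(axis=1) == y).mean()
```

In a real CNN the backpropagation step also flows gradients through the pooling, ReLU, and convolutional layers, but the forward/loss/gradient/update cycle is identical.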
See Also
- Machine Learning — Overview of ML concepts
- GAN — Generative Adversarial Networks
- VAE — Variational Autoencoders
- Design Optimization — CNNs for topology optimization
References
- LeCun, Y., et al. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
- Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. NeurIPS.
- Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. MICCAI.
- Rade, J., et al. (2023). Deep learning-based 3D multigrid topology optimization. Engineering Applications of Artificial Intelligence.