Convolutional Neural Network (CNN)
A Convolutional Neural Network (CNN) is a type of deep learning model specifically designed to process data with a grid-like structure, such as images. CNNs automatically learn to detect features like edges, textures, and shapes through training, making them highly effective for visual recognition tasks.
In additive manufacturing, CNNs are used for defect detection (identifying flaws in printed parts from camera images), surface quality assessment, and topology optimization (predicting optimal material distributions).
How CNNs Work
Unlike traditional neural networks that treat input as a flat list of numbers, CNNs preserve the spatial structure of images. They work by sliding small filters (also called kernels) across the image to detect local patterns.
The key insight is that visual features like edges or corners look the same regardless of where they appear in an image. Because the same filter is slid across the entire image, a CNN responds to a pattern wherever it occurs, a property called translation equivariance; combined with pooling, this makes the network approximately invariant to small translations.
Input Image Filter (3x3) Feature Map
┌───────────┐ ┌───────┐ ┌─────────┐
│ . . . . . │ │ 1 0 -1│ │ │
│ . █ █ . . │ * │ 1 0 -1│ = │ █ . . │
│ . █ █ . . │ │ 1 0 -1│ │ █ . . │
│ . . . . . │ └───────┘ │ │
└───────────┘ (edge detector) └─────────┘
A filter detects vertical edges by finding areas where pixel values change horizontally
Layer Types
Convolutional Layer
The core building block. Each convolutional layer applies multiple filters to the input, producing multiple feature maps. Early layers detect simple features (edges, colors); deeper layers combine these into complex patterns (shapes, objects).
Output(i, j) = Σ_m Σ_n Input(i + m, j + n) × Filter(m, n) + bias
The filter slides across the input; at each position it computes the element-wise products and sums them (strictly speaking this is cross-correlation, which is the operation deep learning frameworks implement under the name "convolution")
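The formula above can be sketched directly in NumPy. This is a minimal "valid" convolution (no padding, stride 1), applied with the vertical-edge filter and the 5×5 image from the diagram earlier; the function name `conv2d` is an illustrative choice, not a library API.

```python
import numpy as np

def conv2d(image, kernel, bias=0.0):
    """Slide the kernel over the image; at each position, multiply
    element-wise and sum (the formula Output(i,j) = ΣΣ Input*Filter + bias)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel) + bias
    return out

# The vertical-edge filter from the diagram above
edge_filter = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])

# A 5x5 image with a bright 2x2 block, as in the diagram
image = np.zeros((5, 5))
image[1:3, 1:3] = 1.0

feature_map = conv2d(image, edge_filter)  # 5x5 input -> 3x3 feature map
```

The feature map responds positively at the block's left edge and negatively at its right edge, exactly the "pixel values change horizontally" behavior described in the diagram caption.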
Pooling Layer
Reduces the spatial size of feature maps, making the network more efficient and less sensitive to small shifts. Max pooling takes the maximum value in each region; average pooling takes the mean.
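A sketch of 2×2 max pooling in NumPy (the helper name `max_pool` is illustrative); swapping `window.max()` for `window.mean()` would give average pooling.

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Keep the largest value in each window, halving the
    spatial dimensions for the default 2x2 window and stride 2."""
    H, W = feature_map.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

fm = np.array([[1., 3., 2., 0.],
               [4., 2., 1., 1.],
               [0., 1., 5., 2.],
               [2., 0., 1., 3.]])
pooled = max_pool(fm)  # 4x4 -> 2x2: [[4, 2], [2, 5]]
```

Because only the maximum in each window survives, shifting a feature by one pixel often leaves the pooled output unchanged, which is what makes the network less sensitive to small shifts.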
Fully Connected Layer
After several convolution and pooling layers, the output is flattened and passed through one or more fully connected layers (like a traditional neural network) to produce the final prediction.
| Layer Type | Purpose | Output |
|---|---|---|
| Convolutional | Detect local features using filters | Feature maps |
| Activation (ReLU) | Add non-linearity | Same size, negative values → 0 |
| Pooling | Reduce spatial dimensions | Smaller feature maps |
| Fully Connected | Combine features for classification | Class probabilities |
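The table above can be traced end to end with a toy forward pass in NumPy. All sizes and the random filters here are arbitrary choices for illustration, not a real architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, k):
    # Valid convolution: slide kernel k over x, multiply and sum
    h, w = x.shape[0] - k.shape[0] + 1, x.shape[1] - k.shape[1] + 1
    return np.array([[np.sum(x[i:i + k.shape[0], j:j + k.shape[1]] * k)
                      for j in range(w)] for i in range(h)])

def max_pool2(x):
    # 2x2 max pooling via reshape: (2h, 2w) -> (h, w)
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).max(axis=(1, 3))

image = rng.random((6, 6))                 # toy grayscale input
kernels = rng.standard_normal((4, 3, 3))   # 4 random 3x3 filters

# Convolutional layer: one feature map per filter (6x6 -> 4x4 each)
feature_maps = np.stack([conv2d(image, k) for k in kernels])
# Activation (ReLU): same size, negative values -> 0
activated = np.maximum(feature_maps, 0)
# Pooling layer: each 4x4 map -> 2x2
pooled = np.stack([max_pool2(f) for f in activated])
# Fully connected layer: flatten (4 * 2 * 2 = 16) and project to 3 classes
flat = pooled.reshape(-1)
W, b = rng.standard_normal((3, 16)), np.zeros(3)
scores = W @ flat + b
probs = np.exp(scores - scores.max())
probs /= probs.sum()                       # softmax -> class probabilities
```

Each stage matches a row of the table: feature maps out of the convolution, same-sized non-negative maps out of ReLU, smaller maps out of pooling, and class probabilities out of the fully connected layer.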
Common Architectures
LeNet-5 (1998)
One of the first successful CNNs, designed by Yann LeCun for handwritten digit recognition. Simple architecture with two convolutional layers.
AlexNet (2012)
Won the ImageNet competition, sparking the deep learning revolution. Introduced ReLU activation and dropout regularization. 8 layers, 60 million parameters.
VGG (2014)
Showed that deeper networks with small (3×3) filters outperform shallower networks with larger filters. VGG-16 has 16 layers.
ResNet (2015)
Introduced "skip connections" that allow training very deep networks (50-152+ layers) without degradation. Winner of ImageNet 2015.
U-Net (2015)
Designed for image segmentation (labeling each pixel). Uses an encoder-decoder structure. Widely used in topology optimization for predicting material distributions.
Applications in Additive Manufacturing
Defect Detection
CNNs analyze images from cameras monitoring the print process, detecting layer defects, warping, delamination, and surface irregularities in real time. Studies report 86-90% accuracy for common defect types.
Topology Optimization
U-Net and similar architectures predict optimal material distributions from boundary conditions and loads. Instead of running iterative FEA simulations that take hours, a trained CNN produces comparable results in seconds, with reported accuracies around 99% relative to the iterative solution.
Surface Quality Assessment
CNNs classify surface roughness levels from microscope images, enabling automated quality grading without manual measurement.
Training a CNN
Training a CNN involves:
- Data collection: Gather labeled images (e.g., defective vs. good parts)
- Data augmentation: Artificially expand dataset by rotating, flipping, scaling images
- Forward pass: Input flows through layers to produce prediction
- Loss calculation: Compare prediction to true label (e.g., cross-entropy loss)
- Backpropagation: Calculate gradients of loss with respect to all weights
- Weight update: Adjust weights using optimizer (e.g., Adam, SGD)
- Repeat: Process many batches over multiple epochs
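Steps 3-7 of the loop above can be sketched in NumPy. To keep the example short, the convolutional and pooling layers are omitted and a single fully connected softmax classifier stands in for the whole network, trained on synthetic two-class data (the data and all hyperparameters here are arbitrary stand-ins).

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for data collection: 8 "flattened image" features per sample,
# with class-1 samples shifted away from class-0 samples
X = np.vstack([rng.standard_normal((50, 8)),
               rng.standard_normal((50, 8)) + 2.0])
y = np.array([0] * 50 + [1] * 50)

W, b = np.zeros((8, 2)), np.zeros(2)   # weights of one fully connected layer
lr = 0.1                                # learning rate

for epoch in range(200):                # repeat over multiple epochs
    # Forward pass: class scores -> softmax probabilities
    scores = X @ W + b
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    # Loss calculation: cross-entropy against the true labels
    loss = -np.log(probs[np.arange(len(y)), y]).mean()
    # Backpropagation: gradient of the loss w.r.t. W and b
    grad = probs.copy()
    grad[np.arange(len(y)), y] -= 1
    grad /= len(y)
    dW, db = X.T @ grad, grad.sum(axis=0)
    # Weight update: plain SGD (Adam would add adaptive step sizes)
    W -= lr * dW
    b -= lr * db

accuracy = ((X @ W + b).argmax(axis=1) == y).mean()
```

In a real CNN the backpropagation step also flows gradients through the pooling, ReLU, and convolutional layers, but the forward/loss/gradient/update cycle is identical.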
See Also
- Machine Learning — Overview of ML concepts
- GAN — Generative Adversarial Networks
- VAE — Variational Autoencoders
- Design Optimization — CNNs for topology optimization
References
- LeCun, Y., et al. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
- Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. NeurIPS.
- Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. MICCAI.
- Rade, J., et al. (2023). Deep learning-based 3D multigrid topology optimization. Engineering Applications of Artificial Intelligence.