SVM (Support Vector Machine)
Support Vector Machine (SVM) is a powerful supervised learning algorithm that finds the optimal hyperplane to separate different classes of data. SVMs are particularly effective in high-dimensional spaces, even when the number of dimensions exceeds the number of samples.
In additive manufacturing, SVMs are used for defect classification, process monitoring, and quality prediction. They excel when the decision boundary between good and defective parts is complex but can be captured with the right kernel function.
Core Concept
SVM works by finding a hyperplane (a line in 2D, a plane in 3D, or a hyperplane in higher dimensions) that best separates data points of different classes. The "best" hyperplane is the one with the maximum margin—the largest distance to the nearest data points from each class.
[Diagram: Class A (○) and Class B (●) on either side of a maximum-margin hyperplane; the support vectors are the points lying on the margin boundaries, and the margin is the gap between them.]
Maximum Margin
The margin is the distance between the hyperplane and the nearest data points. SVM maximizes this margin, which provides:
- Better generalization: Larger margins tend to produce lower test error
- Robustness: Small perturbations in data are less likely to cross the boundary
- Unique solution: The optimization problem is convex with a global optimum
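Formally, if the nearest points satisfy y_i(w · x_i + b) = 1, the margin equals 2/‖w‖, so maximizing the margin is the convex problem below (the standard hard-margin formulation, following Cortes & Vapnik 1995):

```latex
% Hard-margin SVM primal: minimizing ||w|| maximizes the margin 2/||w||
\min_{\mathbf{w},\,b}\ \frac{1}{2}\lVert\mathbf{w}\rVert^2
\quad\text{subject to}\quad
y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b)\ \ge\ 1,\qquad i = 1,\dots,n
```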
Soft Margin SVM
When data isn't perfectly separable, soft margin SVM allows some misclassifications. The regularization parameter C controls the trade-off:
- Large C: Stricter margin, fewer misclassifications, risk of overfitting
- Small C: Wider margin, more misclassifications tolerated, often better generalization
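In the standard soft-margin formulation, a slack variable ξᵢ measures how far point i violates the margin, and C prices the total violation:

```latex
% Soft-margin SVM primal: slack variables allow margin violations,
% and C sets the penalty per unit of violation
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\ \frac{1}{2}\lVert\mathbf{w}\rVert^2
  + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b)\ \ge\ 1 - \xi_i,\quad \xi_i \ge 0
```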
Kernel Trick
When data isn't linearly separable, the kernel trick maps data to a higher-dimensional space where a linear separator exists—without explicitly computing the transformation.
Linear Kernel
K(x, y) = x · y
Best for linearly separable data or high-dimensional sparse data (like text).
RBF (Radial Basis Function) Kernel
K(x, y) = exp(-γ||x - y||²)
Most popular kernel. Maps to infinite-dimensional space. Parameter γ controls the "reach" of each training point.
Polynomial Kernel
K(x, y) = (γ·x·y + r)^d
Captures feature interactions up to degree d. Good for problems with known polynomial relationships.
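All three kernels are plain functions of two feature vectors, so they can be written directly. A minimal NumPy sketch of the formulas above (function names and parameter defaults are illustrative):

```python
import numpy as np

def linear_kernel(x, y):
    # K(x, y) = x . y
    return np.dot(x, y)

def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    return np.exp(-gamma * np.sum((x - y) ** 2))

def polynomial_kernel(x, y, gamma=1.0, r=1.0, d=3):
    # K(x, y) = (gamma * x . y + r)^d
    return (gamma * np.dot(x, y) + r) ** d

x = np.array([1.0, 2.0])
y = np.array([2.0, 0.5])
print(linear_kernel(x, y), rbf_kernel(x, y), polynomial_kernel(x, y))
```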
In an AM setting with temperature and pressure as inputs, defective and good builds may not be linearly separable. An RBF kernel can find a circular or irregular boundary in the original 2D space by implicitly mapping to higher dimensions.
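A minimal scikit-learn sketch of this idea, using synthetic concentric-ring data as a stand-in for two process parameters such as temperature and pressure (the dataset and hyperparameter values are illustrative, not tuned):

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 2D data where the classes form concentric rings:
# no straight line separates them, but an RBF kernel can.
X, y = make_circles(n_samples=300, noise=0.1, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
print(f"support vectors per class: {clf.n_support_}")
```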
Support Vector Regression (SVR)
SVR adapts the SVM concept to regression. Instead of maximizing the margin between classes, SVR fits an ε-insensitive tube around the regression function, extending ε above and below it:
- Points inside the tube have zero loss
- Points outside are penalized linearly by distance
- The tube width ε controls the trade-off between accuracy and model complexity
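A minimal scikit-learn sketch of SVR on synthetic one-dimensional data (the target function and hyperparameter values are illustrative):

```python
import numpy as np
from sklearn.svm import SVR

# Noisy 1D regression problem; points inside the epsilon-tube
# around the fitted function incur zero loss.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(80, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=80)

model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)

# Only points on or outside the tube become support vectors.
print(f"support vectors: {len(model.support_)} of {len(X)}")
print(f"prediction at x=2.5: {model.predict([[2.5]])[0]:.3f}")
```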
Key Hyperparameters
| Parameter | Effect | Tuning Tip |
|---|---|---|
| C | Regularization strength | Grid search over [0.1, 1, 10, 100] |
| γ (RBF) | Kernel width | Grid search; often 1/n_features |
| ε (SVR) | Tube width for regression | Depends on noise level in data |
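These tips map directly onto scikit-learn's GridSearchCV. A minimal sketch using the C grid from the table (the γ grid and dataset are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Scale inside the pipeline so each CV fold is scaled independently.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, f"CV accuracy: {search.best_score_:.2f}")
```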
Applications in Additive Manufacturing
SVMs classify parts as defective or acceptable based on process parameters. Hassan et al. (2024) describe SVM as a baseline classifier for quality control, though one often outperformed by ensemble methods on complex datasets.
Common Use Cases
- Binary classification: Good/bad part quality from sensor readings
- Multi-class classification: Defect type identification (porosity, warping, delamination)
- Regression (SVR): Predicting tensile strength, surface roughness
- Anomaly detection: One-class SVM for detecting unusual process conditions
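For the anomaly-detection case, a minimal sketch with scikit-learn's OneClassSVM on synthetic, already-standardized "sensor" data (all values are illustrative):

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Train only on "normal" process conditions (standardized readings);
# departures from that region are flagged at prediction time.
rng = np.random.default_rng(0)
normal = rng.normal(size=(100, 2))

detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
detector.fit(normal)

# +1 = consistent with the training data, -1 = flagged as anomalous
print(detector.predict([[0.1, -0.2], [4.0, 4.0]]))
```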
Advantages for AM
- Works with small datasets: Common in AM where experiments are expensive
- Effective in high dimensions: Performance holds up even with many process parameters
- Robust to overfitting: Margin maximization provides regularization
Limitations
- Scaling required: SVM is sensitive to feature scales—normalize inputs
- Kernel selection: Choosing the right kernel requires experimentation
- Not probabilistic: Doesn't naturally output confidence scores, though Platt scaling can add calibrated probabilities (see the sketch after this list)
- Slower on large datasets: Kernel SVM training time grows roughly quadratically or worse with sample size
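On the probabilistic point: scikit-learn can bolt Platt-scaled probability estimates onto an SVC at extra training cost; a minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# probability=True fits a Platt-scaling calibration on top of the
# decision function (extra internal cross-validation at fit time).
clf = SVC(kernel="rbf", probability=True, random_state=0)
clf.fit(X, y)
print(clf.predict_proba(X[:3]))  # per-class probability estimates
```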
See Also
- Machine Learning
- Ensemble Methods — Often compared with SVM
- Defect Detection — SVM for quality control
References
- Cortes, C. & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297.
- Burges, C.J.C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2), 121-167.
- Hassan, M., et al. (2024). A review of AI for optimization of 3D printing. Composites Part C.