LSTM (Long Short-Term Memory)

LSTM
  Type:         Recurrent Neural Network
  Introduced:   1997 (Hochreiter & Schmidhuber)
  Purpose:      Sequential/time-series data
  Key feature:  Gated memory cells
  AM uses:      Energy monitoring, process prediction

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture designed to learn long-term dependencies in sequential data. Unlike standard RNNs, LSTMs can remember information for long periods thanks to their gated memory cell structure, making them ideal for time-series analysis.

In additive manufacturing, LSTMs excel at analyzing time-series sensor data—temperature profiles, energy consumption patterns, and process signals—where understanding temporal context is crucial for prediction and classification.

Contents
  1. The Problem LSTMs Solve
  2. Architecture
  3. The Three Gates
  4. Applications in Additive Manufacturing
  5. Key Use Cases
  6. See Also
  7. References

The Problem LSTMs Solve

Standard RNNs suffer from the vanishing gradient problem: when learning from long sequences, gradients shrink exponentially during backpropagation through time, making it practically impossible to learn long-term dependencies. For example, if print quality depends on a temperature setting from 1,000 timesteps earlier, a standard RNN cannot capture that relationship.

LSTM solution: Gated memory cells allow gradients to flow unchanged through time (the "constant error carousel"), enabling learning of dependencies spanning hundreds or thousands of timesteps.
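
The key point is that the cell state is updated additively rather than by repeated matrix multiplication. In the standard formulation (notation as in the Architecture diagram below; fₜ and iₜ are the forget- and input-gate activations, c̃ₜ the candidate values, and ⊙ element-wise multiplication):

    cₜ = fₜ ⊙ cₜ₋₁ + iₜ ⊙ c̃ₜ

Along this path the gradient ∂cₜ/∂cₜ₋₁ is simply fₜ, so as long as the forget gate stays close to 1 the error signal can travel across many timesteps without shrinking.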

Architecture

                    ┌───────────────────────────────────┐
                    │             LSTM Cell             │
    ┌───────┐       │  ┌─────┐  ┌─────┐  ┌─────┐        │       ┌───────┐
    │ cₜ₋₁  │──────▶│──│  ×  │──│  +  │──│     │────────│──────▶│  cₜ   │
    └───────┘       │  └──▲──┘  └──▲──┘  └──┬──┘        │       └───────┘
                    │     │        │        │           │
                    │  Forget    Input    Output        │
                    │   Gate     Gate      Gate         │
    ┌───────┐       │     │        │        │           │       ┌───────┐
    │ hₜ₋₁  │──────▶│─────┴────────┴────────┴───────────│──────▶│  hₜ   │
    └───────┘       │                                   │       └───────┘
                    │           ▲                       │
    ┌───────┐       │           │                       │
    │  xₜ   │──────▶│───────────┘                       │
    └───────┘       └───────────────────────────────────┘

            cₜ = cell state (long-term memory)
            hₜ = hidden state (short-term output)
            xₜ = input at time t
            

The Three Gates

Forget Gate

Decides what information to discard from the cell state. Outputs values between 0 (forget completely) and 1 (keep completely) for each element in the cell state.
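
In the standard formulation, the forget gate is a sigmoid over the previous hidden state and the current input, where W_f and b_f are the gate's learned weight matrix and bias and σ is the logistic sigmoid:

    fₜ = σ(W_f · [hₜ₋₁, xₜ] + b_f)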

Input Gate

Decides what new information to store. Has two parts: a sigmoid layer that decides which values to update, and a tanh layer that creates candidate values.
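
In the standard formulation, with learned parameters W_i, b_i (sigmoid part) and W_c, b_c (tanh part):

    iₜ = σ(W_i · [hₜ₋₁, xₜ] + b_i)
    c̃ₜ = tanh(W_c · [hₜ₋₁, xₜ] + b_c)

These combine with the forget gate to update the cell state: cₜ = fₜ ⊙ cₜ₋₁ + iₜ ⊙ c̃ₜ.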

Output Gate

Decides what to output based on the cell state. The cell state is passed through tanh and multiplied by the output gate's sigmoid to produce the hidden state.
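
In the standard formulation, with learned parameters W_o and b_o:

    oₜ = σ(W_o · [hₜ₋₁, xₜ] + b_o)
    hₜ = oₜ ⊙ tanh(cₜ)

The minimal NumPy sketch below ties the three gates together into one forward step. It implements the standard equations above; the function name lstm_step and the packing of all four gate pre-activations into a single weight matrix W are choices made here for brevity, not any particular library's API.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, c_prev, W, b):
        """One LSTM forward step (standard, non-peephole formulation).

        x_t    : input at time t, shape (input_dim,)
        h_prev : previous hidden state hₜ₋₁, shape (hidden_dim,)
        c_prev : previous cell state cₜ₋₁, shape (hidden_dim,)
        W      : stacked gate weights, shape (4 * hidden_dim, hidden_dim + input_dim)
        b      : stacked gate biases, shape (4 * hidden_dim,)
        """
        n = h_prev.shape[0]
        z = W @ np.concatenate([h_prev, x_t]) + b  # all four gate pre-activations at once
        f = sigmoid(z[0 * n:1 * n])                # forget gate: what to discard
        i = sigmoid(z[1 * n:2 * n])                # input gate: what to store
        o = sigmoid(z[2 * n:3 * n])                # output gate: what to expose
        g = np.tanh(z[3 * n:4 * n])                # candidate values
        c_t = f * c_prev + i * g                   # new cell state (long-term memory)
        h_t = o * np.tanh(c_t)                     # new hidden state (short-term output)
        return h_t, c_t

    # Example with hidden_dim = 8, input_dim = 3:
    # rng = np.random.default_rng(0)
    # h, c = lstm_step(rng.normal(size=3), np.zeros(8), np.zeros(8),
    #                  rng.normal(size=(32, 11)) * 0.1, np.zeros(32))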

Applications in Additive Manufacturing

Energy Stage Classification (98.2% Accuracy):
Hassan et al. (2024) report LSTM networks achieving 98.2% accuracy in classifying energy consumption stages (printing, standby, preheating) across PLA, ABS, and PETG materials. The temporal patterns in power consumption are distinctive for each stage.
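
To illustrate how such a stage classifier can be set up, the sketch below stacks a single LSTM layer and a softmax over the three stages using the Keras API. The window length, feature count, layer size, and training settings are illustrative assumptions, not the configuration reported by Hassan et al. (2024).

    import tensorflow as tf

    TIMESTEPS = 200   # power-signal samples per window (assumed)
    FEATURES = 1      # e.g. instantaneous power draw
    N_STAGES = 3      # printing, standby, preheating

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(TIMESTEPS, FEATURES)),
        tf.keras.layers.LSTM(64),                                # summarizes the temporal pattern
        tf.keras.layers.Dense(N_STAGES, activation="softmax"),   # stage probabilities
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # x: (n_windows, TIMESTEPS, FEATURES) array of power readings
    # y: (n_windows,) array of integer stage labels in {0, 1, 2}
    # model.fit(x, y, epochs=20, validation_split=0.2)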

Key Use Cases

See Also

References