How do surrogate models work in MatCraft?

Optimization
surrogate
mlp
neural-network
machine-learning

Surrogate models are lightweight machine learning models that approximate expensive-to-evaluate functions. In MatCraft, they replace costly physics simulations or physical experiments during the optimization loop, enabling the optimizer to evaluate thousands of candidate compositions in seconds.

The Role of Surrogates

In a typical materials optimization pipeline without surrogates, each evaluation might require:

  • A DFT calculation (hours to days per composition)
  • A molecular dynamics simulation (minutes to hours)
  • A physical experiment (days to weeks)

A trained surrogate model can approximate these evaluations in milliseconds, enabling the CMA-ES optimizer to explore the design space efficiently.

MLP Surrogate (Default)

MatCraft's default surrogate is a multi-layer perceptron (MLP) neural network:

python
from materia.surrogate import MLPSurrogate

surrogate = MLPSurrogate(
    hidden_layers=[64, 64],     # Two hidden layers with 64 neurons each
    activation="relu",          # ReLU activation function
    learning_rate=0.001,        # Adam optimizer learning rate
    epochs=200,                 # Training epochs
    validation_split=0.2,       # 20% held out for validation
    early_stopping_patience=20, # Stop if validation loss plateaus
)

The MLP takes component values as input and predicts objective values as output. For multi-objective problems, a separate output head is trained for each objective.
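For readers without MatCraft at hand, the same configuration maps closely onto scikit-learn's `MLPRegressor`, which can serve as a rough stand-in for experimentation. This is a sketch, not MatCraft's implementation: the scikit-learn parameter names below are the analogous knobs, and the toy data is invented for illustration. (Note one difference: `MLPRegressor` handles multi-output targets with a shared network rather than separate output heads.)

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))   # toy data: 3 component values per candidate
y = (X ** 2).sum(axis=1)         # toy single-objective target

# Hyperparameters analogous to the MLPSurrogate config above
model = MLPRegressor(
    hidden_layer_sizes=(64, 64),  # two hidden layers, 64 neurons each
    activation="relu",
    solver="adam",
    learning_rate_init=0.001,
    max_iter=200,
    early_stopping=True,          # holds out validation data internally
    validation_fraction=0.2,      # 20% held out for validation
    n_iter_no_change=20,          # stop if validation score plateaus
    random_state=0,
)
model.fit(X, y)
pred = model.predict(X[:5])
print(pred.shape)  # (5,)
```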

Training Pipeline

  1. Normalization: Input features are standardized to zero mean and unit variance. Objectives are min-max scaled.
  2. Training: The MLP is trained using the Adam optimizer with mean squared error loss. Early stopping prevents overfitting on small datasets.
  3. Uncertainty estimation: MatCraft uses MC Dropout or an ensemble of models to estimate prediction uncertainty. This uncertainty is critical for the active learning acquisition function.
  4. Retraining: After each active learning iteration adds new data, the surrogate is retrained from scratch (not fine-tuned) to avoid catastrophic forgetting.
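Steps 1 and 3 can be sketched together with a small ensemble. The code below is illustrative only, not MatCraft's internals: inputs are standardized, the objective is min-max scaled, and prediction uncertainty is estimated as the spread across independently seeded ensemble members.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 2))        # toy composition data
y = np.sin(3 * X[:, 0]) + X[:, 1]     # toy objective

# Step 1: standardize inputs, min-max scale the objective
x_scaler = StandardScaler().fit(X)
y_scaler = MinMaxScaler().fit(y.reshape(-1, 1))
Xs = x_scaler.transform(X)
ys = y_scaler.transform(y.reshape(-1, 1)).ravel()

# Step 3: ensemble of independently seeded models for uncertainty
ensemble = [
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=300,
                 random_state=seed).fit(Xs, ys)
    for seed in range(5)
]

X_query = x_scaler.transform(rng.uniform(size=(4, 2)))
preds = np.stack([m.predict(X_query) for m in ensemble])  # shape (5, 4)
mean = preds.mean(axis=0)  # surrogate estimate per query point
std = preds.std(axis=0)    # disagreement = uncertainty for acquisition
print(mean.shape, std.shape)
```

Retraining from scratch (step 4) then amounts to rebuilding this ensemble on the enlarged dataset after each active learning iteration.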

When Surrogates Struggle

  • Very small datasets (<10 points): The surrogate may not have enough data to learn meaningful patterns. Consider starting with a space-filling design (Latin Hypercube).
  • Highly discontinuous objectives: MLPs assume some smoothness. If your objective has sharp phase transitions, increase network depth or consider a Gaussian Process surrogate.
  • Extrapolation: Surrogates are less reliable outside the range of training data. MatCraft's acquisition function accounts for this by penalizing high-uncertainty regions.
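For the small-dataset case, a Latin hypercube starting design can be generated with SciPy's quasi-Monte Carlo module (this uses `scipy.stats.qmc` directly; the bounds shown are illustrative, not MatCraft defaults):

```python
from scipy.stats import qmc

# 12 space-filling starting compositions over 3 components
sampler = qmc.LatinHypercube(d=3, seed=0)
unit_sample = sampler.random(n=12)      # points in the unit cube [0, 1)^3

# Scale to per-component bounds (illustrative ranges)
lower = [0.0, 0.1, 0.0]
upper = [0.5, 0.9, 1.0]
points = qmc.scale(unit_sample, lower, upper)
print(points.shape)  # (12, 3)
```

Each component's range is divided into 12 equal strata with exactly one sample per stratum, which gives far better coverage than 12 purely random points.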
