
How do I tune hyperparameters for better optimization results?

Tags: Optimization, hyperparameters, tuning, configuration

MatCraft's defaults work well for most problems, but tuning key hyperparameters can improve convergence speed and solution quality. Here is a guide to the most impactful settings.

Surrogate Model Hyperparameters

Hidden Layers

```yaml
surrogate:
  hidden_layers: [64, 64]  # Default
```

  • Small datasets (<50 points): Use [32, 32] or even [32] to avoid overfitting.
  • Large datasets (>200 points): Scale up to [128, 128] or [128, 64, 32].
  • High-dimensional inputs (>10 parameters): Add a wider first layer, e.g., [128, 64].
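The sizing guidance above can be captured in a small helper. This is a sketch of the heuristic, not a MatCraft API; the function name and thresholds are illustrative:

```python
def suggest_hidden_layers(n_points, n_params):
    """Heuristic sketch (not a MatCraft function): pick a hidden_layers
    setting from dataset size and input dimensionality, following the
    rules of thumb above."""
    if n_points < 50:
        layers = [32, 32]      # small data: shrink to avoid overfitting
    elif n_points > 200:
        layers = [128, 128]    # large data: scale up
    else:
        layers = [64, 64]      # the default
    if n_params > 10:
        layers[0] = max(layers[0], 128)  # widen the first layer for high-dim inputs
    return layers
```

For example, 100 points with 12 input parameters would map to [128, 64].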

Learning Rate and Epochs

```yaml
surrogate:
  learning_rate: 0.001  # Default. Decrease to 0.0005 for noisy data.
  epochs: 200           # Default. Increase to 500 for complex objectives.
  early_stopping_patience: 20  # Prevents overfitting
```
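To make the patience setting concrete, here is the generic early-stopping pattern it refers to (a sketch of the standard technique, not MatCraft's internals): training stops once the validation loss has failed to improve for `patience` consecutive epochs.

```python
def train_with_patience(val_losses, patience=20):
    """Generic early stopping (illustrative, not MatCraft internals).
    `val_losses` stands in for one validation loss per epoch; returns
    the epoch of the best model, stopping once no improvement has been
    seen for `patience` epochs."""
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # patience exhausted: stop training
    return best_epoch
```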

Ensemble Size

Using an ensemble of surrogates provides better uncertainty estimates for the acquisition function:

```yaml
surrogate:
  ensemble_size: 5  # Train 5 independent MLPs (default: 1 with MC Dropout)
```

Ensembles are more expensive to train but significantly improve active learning performance on small datasets.
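To see where the uncertainty estimate comes from, here is a minimal sketch: each ensemble member predicts at the candidate points, and the spread (standard deviation) across members is the uncertainty fed to the acquisition function. The numbers are illustrative:

```python
import numpy as np

# Each row stands in for one trained MLP's predictions at three
# candidate points (illustrative values, not real model output).
ensemble_preds = np.array([
    [0.9, 1.2, 2.0],
    [1.1, 1.0, 2.4],
    [1.0, 1.1, 1.6],
    [0.8, 1.3, 2.2],
    [1.2, 0.9, 1.8],
])

mu = ensemble_preds.mean(axis=0)    # surrogate prediction per candidate
sigma = ensemble_preds.std(axis=0)  # uncertainty = disagreement across members
```

Candidates where the members disagree most get the largest sigma, which is exactly where an exploration-driven acquisition function will sample.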

Optimizer Hyperparameters

CMA-ES Settings

```yaml
optimizer:
  sigma0: 0.3           # Start broad; decrease to 0.1 if you have a good prior
  population_size: 20   # Increase for noisy objectives or high dimensions
```

  • sigma0: The initial step size relative to parameter ranges. Use 0.3 for exploratory searches, 0.1 when you have a rough idea of the optimal region.
  • population_size: Larger populations are more robust but slower. The default of 4 + floor(3 * ln(n_params)) is usually good. Double it for very noisy problems.
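The default population-size formula quoted above is easy to evaluate when deciding whether to override it:

```python
import math

def default_population_size(n_params):
    """CMA-ES default population size, 4 + floor(3 * ln(n_params)),
    as quoted above."""
    return 4 + math.floor(3 * math.log(n_params))
```

For a 10-parameter problem this gives 10; doubling it to 20 (as in the config above) is the suggested adjustment for very noisy objectives.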

Acquisition Function

```yaml
acquisition:
  type: expected_improvement  # Best default
  # type: upper_confidence_bound
  # exploration_weight: 2.0   # Higher = more exploration
```

  • EI is the best default for most problems.
  • UCB with a high exploration_weight (2.0-5.0) is useful when you suspect the design space has multiple local optima and want to explore more broadly.
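For reference, here are the textbook forms of the two acquisition functions, computed from the surrogate's mean and uncertainty at one candidate. This is the standard definition, not necessarily MatCraft's exact implementation:

```python
import math

def _phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def _Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def expected_improvement(mu, sigma, best, minimize=True):
    """Textbook EI at one candidate, given surrogate mean `mu`,
    uncertainty `sigma`, and the best objective value seen so far."""
    if sigma == 0.0:
        return 0.0
    imp = (best - mu) if minimize else (mu - best)
    z = imp / sigma
    return imp * _Phi(z) + sigma * _phi(z)

def upper_confidence_bound(mu, sigma, exploration_weight=2.0):
    """UCB for maximization: mean plus a weighted uncertainty bonus.
    A larger exploration_weight favors uncertain regions."""
    return mu + exploration_weight * sigma
```

Note how `exploration_weight` directly scales the uncertainty bonus, which is why raising it to 2.0-5.0 pushes the search toward unexplored regions.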

Batch Size

```yaml
campaign:
  batch_size: 5  # Candidates per iteration
```

  • Cheap evaluations (simulations, analytic models): Use batch size 1 for maximum sample efficiency.
  • Expensive evaluations (lab experiments): Use batch size 5-10 to maximize parallelism.
  • Larger batches reduce the number of iterations but may explore less efficiently per evaluation.

Quick Tuning Checklist

  1. Start with defaults and run a baseline campaign.
  2. Check the convergence plot. If the surrogate error is high, increase network size or ensemble count.
  3. If convergence is slow, increase sigma0 or switch to UCB with higher exploration weight.
  4. If the surrogate overfits (training error low, validation error high), reduce network size or increase early stopping patience.
