
How do I tune hyperparameters for better optimization results?

Tags: Optimization, hyperparameters, tuning, configuration

MatCraft's defaults work well for most problems, but tuning key hyperparameters can improve convergence speed and solution quality. Here is a guide to the most impactful settings.

Surrogate Model Hyperparameters

Hidden Layers

```yaml
surrogate:
  hidden_layers: [64, 64]  # Default
```

  • Small datasets (<50 points): Use [32, 32] or even [32] to avoid overfitting.
  • Large datasets (>200 points): Scale up to [128, 128] or [128, 64, 32].
  • High-dimensional inputs (>10 parameters): Add a wider first layer, e.g., [128, 64].
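The sizing guidance above can be captured in a small helper. This is a sketch of the heuristic, not a MatCraft API; the function name and thresholds are illustrative:

```python
def suggest_hidden_layers(n_points, n_params):
    """Heuristic sketch (not a MatCraft function): pick a hidden_layers
    setting from dataset size and input dimensionality, following the
    rules of thumb above."""
    if n_points < 50:
        layers = [32, 32]      # small data: shrink to avoid overfitting
    elif n_points > 200:
        layers = [128, 128]    # large data: scale up
    else:
        layers = [64, 64]      # the default
    if n_params > 10:
        layers[0] = max(layers[0], 128)  # widen the first layer for high-dim inputs
    return layers
```

For example, 100 points with 12 input parameters would map to [128, 64].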

Learning Rate and Epochs

```yaml
surrogate:
  learning_rate: 0.001  # Default. Decrease to 0.0005 for noisy data.
  epochs: 200           # Default. Increase to 500 for complex objectives.
  early_stopping_patience: 20  # Prevents overfitting
```
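To make the patience setting concrete, here is the generic early-stopping pattern it refers to (a sketch of the standard technique, not MatCraft's internals): training stops once the validation loss has failed to improve for `patience` consecutive epochs.

```python
def train_with_patience(val_losses, patience=20):
    """Generic early stopping (illustrative, not MatCraft internals).
    `val_losses` stands in for one validation loss per epoch; returns
    the epoch of the best model, stopping once no improvement has been
    seen for `patience` epochs."""
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # patience exhausted: stop training
    return best_epoch
```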

Ensemble Size

Using an ensemble of surrogates provides better uncertainty estimates for the acquisition function:

```yaml
surrogate:
  ensemble_size: 5  # Train 5 independent MLPs (default: 1 with MC Dropout)
```

Ensembles are more expensive to train but significantly improve active learning performance on small datasets.
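To see where the uncertainty estimate comes from, here is a minimal sketch: each ensemble member predicts at the candidate points, and the spread (standard deviation) across members is the uncertainty fed to the acquisition function. The numbers are illustrative:

```python
import numpy as np

# Each row stands in for one trained MLP's predictions at three
# candidate points (illustrative values, not real model output).
ensemble_preds = np.array([
    [0.9, 1.2, 2.0],
    [1.1, 1.0, 2.4],
    [1.0, 1.1, 1.6],
    [0.8, 1.3, 2.2],
    [1.2, 0.9, 1.8],
])

mu = ensemble_preds.mean(axis=0)    # surrogate prediction per candidate
sigma = ensemble_preds.std(axis=0)  # uncertainty = disagreement across members
```

Candidates where the members disagree most get the largest sigma, which is exactly where an exploration-driven acquisition function will sample.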

Optimizer Hyperparameters

CMA-ES Settings

```yaml
optimizer:
  sigma0: 0.3           # Start broad; decrease to 0.1 if you have a good prior
  population_size: 20   # Increase for noisy objectives or high dimensions
```

  • sigma0: The initial step size relative to parameter ranges. Use 0.3 for exploratory searches, 0.1 when you have a rough idea of the optimal region.
  • population_size: Larger populations are more robust but slower. The default of 4 + floor(3 * ln(n_params)) is usually good. Double it for very noisy problems.
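The default population-size formula quoted above is easy to evaluate when deciding whether to override it:

```python
import math

def default_population_size(n_params):
    """CMA-ES default population size, 4 + floor(3 * ln(n_params)),
    as quoted above."""
    return 4 + math.floor(3 * math.log(n_params))
```

For a 10-parameter problem this gives 10; doubling it to 20 (as in the config above) is the suggested adjustment for very noisy objectives.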

Acquisition Function

```yaml
acquisition:
  type: expected_improvement  # Best default
  # type: upper_confidence_bound
  # exploration_weight: 2.0   # Higher = more exploration
```

  • EI is the best default for most problems.
  • UCB with a high exploration_weight (2.0-5.0) is useful when you suspect the design space has multiple local optima and want to explore more broadly.
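For reference, here are the textbook forms of the two acquisition functions, computed from the surrogate's mean and uncertainty at one candidate. This is the standard definition, not necessarily MatCraft's exact implementation:

```python
import math

def _phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def _Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

def expected_improvement(mu, sigma, best, minimize=True):
    """Textbook EI at one candidate, given surrogate mean `mu`,
    uncertainty `sigma`, and the best objective value seen so far."""
    if sigma == 0.0:
        return 0.0
    imp = (best - mu) if minimize else (mu - best)
    z = imp / sigma
    return imp * _Phi(z) + sigma * _phi(z)

def upper_confidence_bound(mu, sigma, exploration_weight=2.0):
    """UCB for maximization: mean plus a weighted uncertainty bonus.
    A larger exploration_weight favors uncertain regions."""
    return mu + exploration_weight * sigma
```

Note how `exploration_weight` directly scales the uncertainty bonus, which is why raising it to 2.0-5.0 pushes the search toward unexplored regions.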

Batch Size

```yaml
campaign:
  batch_size: 5  # Candidates per iteration
```

  • Cheap evaluations (simulations, analytic models): Use batch size 1 for maximum sample efficiency.
  • Expensive evaluations (lab experiments): Use batch size 5-10 to maximize parallelism.
  • Larger batches reduce the number of iterations but may explore less efficiently per evaluation.

Quick Tuning Checklist

  1. Start with defaults and run a baseline campaign.
  2. Check the convergence plot. If the surrogate error is high, increase network size or ensemble count.
  3. If convergence is slow, increase sigma0 or switch to UCB with higher exploration weight.
  4. If the surrogate overfits (training error low, validation error high), reduce network size or increase early stopping patience.
