
How do I import experimental data into MatCraft?

MatCraft supports several data import formats and methods. Imported data is the foundation for surrogate model training, so it is worth checking column names, units, and values before you import.

Supported Formats

  • CSV (recommended for tabular data)
  • JSON (for structured or nested data)
  • Excel (.xlsx) via the dashboard upload
  • Programmatic import via the Python SDK or REST API

CSV Import

The simplest approach is a CSV file with a header row, one column per component, and one column per objective:

csv
polymer_concentration,additive_loading,crosslinker_ratio,water_flux,salt_rejection
0.15,0.05,0.03,45.2,92.1
0.20,0.08,0.05,38.7,95.3
0.25,0.10,0.04,32.1,97.0

Import via CLI:

bash
materia data import --material mem-001 --file measurements.csv
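
Before running the import, it can help to confirm that the CSV header matches the column names you expect. The following is a minimal sketch using only Python's standard library; it is not part of MatCraft. `EXPECTED_COLUMNS` and `check_header` are hypothetical names, and the column set is assumed from the example CSV above.

```python
import csv
import io

# Assumed column names, copied from the example CSV above.
EXPECTED_COLUMNS = {
    "polymer_concentration",
    "additive_loading",
    "crosslinker_ratio",
    "water_flux",
    "salt_rejection",
}


def check_header(handle, expected):
    """Return (missing, unexpected) column names for an open CSV handle."""
    reader = csv.reader(handle)
    header = {name.strip() for name in next(reader)}
    return sorted(expected - header), sorted(header - expected)


# In-memory sample whose flux column still carries a lab-export name.
sample = io.StringIO(
    "polymer_concentration,additive_loading,crosslinker_ratio,flux_lmh,salt_rejection\n"
    "0.15,0.05,0.03,45.2,92.1\n"
)
missing, unexpected = check_header(sample, EXPECTED_COLUMNS)
print("missing:", missing)
print("unexpected:", unexpected)
```

For a real file, pass `open("measurements.csv", newline="")` instead of the in-memory sample.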

Python SDK

Use the Python SDK for programmatic import, especially when you need to transform data exported from other tools first:

python
import pandas as pd
from materia import Material
from materia.io import import_data

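# Load raw measurements exported from the lab
df = pd.read_csv("lab_results.csv")
# Rename columns if needed so they match the names in your material definition
df = df.rename(columns={"flux_lmh": "water_flux", "rejection_pct": "salt_rejection"})

# Load the material definition, then import the measurements against it
material = Material.from_yaml("my_material.yaml")
import_data(material, df)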

Data Validation

MatCraft validates imported data against your material definition:

  • Bounds checking: Values outside component bounds are flagged as warnings (not rejected, since real measurements can exceed expected ranges).
  • Missing values: Rows with missing objective values are accepted but excluded from surrogate training. Missing component values cause the row to be rejected.
  • Duplicates: Exact duplicate rows are detected and removed, with a warning.
  • Type coercion: String values in numeric columns are automatically parsed where possible.
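
To see in advance how a dataset would fare under the rules listed above, you can approximate them locally with pandas. This is a rough sketch, not MatCraft's own validator. `BOUNDS`, `OBJECTIVES`, and `precheck` are hypothetical names, the bounds are made up for illustration, and treating unparseable strings as missing values is an assumption.

```python
import pandas as pd

# Hypothetical bounds and objective names, for illustration only.
# In practice these would come from your own material definition.
BOUNDS = {
    "polymer_concentration": (0.10, 0.30),
    "additive_loading": (0.00, 0.10),
    "crosslinker_ratio": (0.01, 0.05),
}
OBJECTIVES = ["water_flux", "salt_rejection"]


def precheck(df, bounds, objectives):
    """Apply the four checks locally and return (cleaned_frame, report)."""
    components = list(bounds)
    out = df.copy()

    # Type coercion: parse strings as numbers; unparseable values become NaN
    # (an assumption of this sketch).
    for col in components + objectives:
        out[col] = pd.to_numeric(out[col], errors="coerce")

    # Duplicates: drop exact duplicate rows, keeping the first occurrence.
    n_duplicates = int(out.duplicated().sum())
    out = out.drop_duplicates()

    # Missing component values: drop the row.
    missing_component = out[components].isna().any(axis=1)
    n_rejected = int(missing_component.sum())
    out = out[~missing_component]

    # Missing objective values: keep the row, but count it as untrainable.
    n_untrainable = int(out[objectives].isna().any(axis=1).sum())

    # Bounds checking: count out-of-range rows without dropping them.
    outside = pd.Series(False, index=out.index)
    for col, (low, high) in bounds.items():
        outside |= (out[col] < low) | (out[col] > high)

    report = {
        "duplicates_removed": n_duplicates,
        "rows_rejected": n_rejected,
        "rows_without_objectives": n_untrainable,
        "rows_out_of_bounds": int(outside.sum()),
    }
    return out.reset_index(drop=True), report


# Small sample with one duplicate, one missing component,
# one missing objective, and one out-of-bounds value.
raw = pd.DataFrame({
    "polymer_concentration": ["0.15", "0.15", "0.20", None, "0.35"],
    "additive_loading": [0.05, 0.05, 0.08, 0.10, 0.04],
    "crosslinker_ratio": [0.03, 0.03, 0.05, 0.04, 0.02],
    "water_flux": [45.2, 45.2, None, 32.1, 30.0],
    "salt_rejection": [92.1, 92.1, 95.3, 97.0, 98.0],
})
clean, report = precheck(raw, BOUNDS, OBJECTIVES)
print(report)
```

The report only counts affected rows; inspect `clean` or the original frame to see which rows triggered each check.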

Best Practices

  • Start with at least 10 data points; 20 to 50 is ideal for design spaces with 3 to 5 dimensions.
  • Include points spread across the design space, not just near known optima.
  • If you have data from different experimental batches, include a batch column. MatCraft can account for batch effects in surrogate training.
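
The first two points can be checked quickly before import. The sketch below counts the rows and reports, per component, what fraction of the allowed range the measurements span. It is a crude indicator only (it does not detect clustering inside the range) and is not a MatCraft feature. `BOUNDS` and `coverage_report` are hypothetical names, and the bounds are made up for illustration.

```python
import pandas as pd

# Hypothetical component bounds, for illustration only.
BOUNDS = {
    "polymer_concentration": (0.10, 0.30),
    "additive_loading": (0.00, 0.10),
    "crosslinker_ratio": (0.01, 0.05),
}


def coverage_report(df, bounds, min_points=10):
    """Return (enough_points, fraction of each bounded range spanned by the data)."""
    fractions = {}
    for col, (low, high) in bounds.items():
        spanned = df[col].max() - df[col].min()
        fractions[col] = float(spanned / (high - low))
    return len(df) >= min_points, fractions


# The three rows from the CSV example (components only).
df = pd.DataFrame({
    "polymer_concentration": [0.15, 0.20, 0.25],
    "additive_loading": [0.05, 0.08, 0.10],
    "crosslinker_ratio": [0.03, 0.05, 0.04],
})
enough, fractions = coverage_report(df, BOUNDS)
print("enough points:", enough)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%} of range covered")
```

A low fraction for a component suggests the measurements sit in a narrow band of its range, so the surrogate would have little information elsewhere.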
