MatCraft uses Celery with Redis as the message broker to run optimization tasks asynchronously. This architecture keeps the API responsive while computationally intensive surrogate training and optimization run in the background.
Materials optimization involves CPU-intensive tasks such as surrogate training, CMA-ES runs, and candidate evaluation, each taking anywhere from seconds to minutes per iteration.
Running these in the API request-response cycle would cause timeouts. Celery offloads them to dedicated worker processes.
```
API Server                     Redis                       Celery Worker
     │                           │                            │
     │ POST /campaigns/run       │                            │
     │──────────────────────────>│                            │
     │ task_id = "abc-123"       │                            │
     │<──────────────────────────│                            │
     │                           │ pick up task               │
     │                           │───────────────────────────>│
     │                           │                            │ Train surrogate
     │                           │ publish progress event     │
     │                           │<───────────────────────────│
     │ WebSocket push to client  │                            │
     │<──────────────────────────│                            │
     │                           │                            │ Run CMA-ES
     │                           │ publish progress event     │
     │                           │<───────────────────────────│
     │ WebSocket push to client  │                            │
     │<──────────────────────────│                            │
     │                           │ task complete              │
     │                           │<───────────────────────────│
```

Worker configuration in your .env or Celery config:
```python
# celery_config.py
broker_url = "redis://localhost:6379/0"
result_backend = "redis://localhost:6379/1"
task_serializer = "json"
result_serializer = "json"
accept_content = ["json"]
task_time_limit = 3600          # Hard limit: 1 hour per task
task_soft_time_limit = 3000     # Soft limit: 50 minutes (raises SoftTimeLimitExceeded in the task)
worker_prefetch_multiplier = 1  # One task at a time per worker process
worker_concurrency = 4          # 4 parallel worker processes
```

```bash
# Single worker with 4 processes
celery -A materia.tasks worker --loglevel=info --concurrency=4

# GPU worker for surrogate training (single process, GPU-bound)
celery -A materia.tasks worker --loglevel=info --concurrency=1 \
    --queues=gpu_training -n gpu-worker@%h

# Priority queues
celery -A materia.tasks worker --loglevel=info \
    --queues=high_priority,default,low_priority
```

| Task | Queue | Typical Duration | Priority |
|------|-------|------------------|----------|
| train_surrogate | default (or gpu_training) | 1-30s | High |
| run_cmaes_optimization | default | 1-10s | High |
| evaluate_candidates | default | 0.1s - 1h | Medium |
| compute_pareto_front | default | 0.1-5s | Low |
| export_results | low_priority | 1-60s | Low |
| cleanup_old_checkpoints | low_priority | 1-10s | Low |
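The queue assignments in the table can be expressed as Celery routing configuration. A sketch, where the `materia.tasks.*` dotted paths are assumptions about the real module layout:

```python
# celery_config.py (continued): route tasks to the queues from the table.
# The "materia.tasks.*" paths are assumed; match them to the real modules.
task_routes = {
    "materia.tasks.train_surrogate": {"queue": "gpu_training"},
    "materia.tasks.export_results": {"queue": "low_priority"},
    "materia.tasks.cleanup_old_checkpoints": {"queue": "low_priority"},
}
# Anything without an explicit route lands on the default queue.
task_default_queue = "default"
```

With this in place, the GPU worker started with `--queues=gpu_training` only ever receives surrogate-training tasks.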
Use Flower for a web-based Celery monitoring dashboard:
```bash
pip install flower
celery -A materia.tasks flower --port=5555
```

This provides real-time visibility into task queues, worker status, task history, and performance metrics at http://localhost:5555.
Add more workers to handle higher campaign throughput:
```bash
# Run 3 worker instances
docker compose up -d --scale worker=3
```

Each worker independently pulls tasks from Redis, so scaling is straightforward. Apart from the Redis broker, the only shared state is the PostgreSQL database.