multiverse¶

Reproducible benchmarking for multimodal single-cell integration.

Multiverse is an MLOps platform for academic single-cell integration studies. It sits between two notebook sessions, the one in which you curate a dataset and the one in which you interpret the resulting embeddings, and replaces the brittle scripts in between with a registry, a containerized runner, and a Streamlit interface. The goal is to make a benchmark easier to run and the resulting Methods section easier to write.

Two Audiences, One Platform¶

The documentation is organized around two distinct audiences. Some pages are useful to both; where that is the case, terminology is introduced gently.

For Bioinformaticians¶

You work in AnnData, MuData, Scanpy, and Jupyter. You want to compare integration models on your data without becoming a Docker expert.

Start at Getting Started for an end-to-end walkthrough, then consult Data Preparation for recipes. The Models Glossary and Evaluation Metrics reference pages describe what each model assumes and what each metric measures.

For MLOps and Platform Engineers¶

You will be running, deploying, or extending the platform. You may have no biological background, and that is fine! The system is a typed Python application built around a SQLite registry, a Docker-based runner, MLflow, Optuna, and a Streamlit front end.

Start at Architecture for the system map, then read Runner & Orchestration for the execution model and Model Container Contract for the I/O boundary. Adding a Model and the Developer Guide cover extension work.

What Multiverse Does¶

Researcher concern	Platform responsibility
Biological question and dataset curation	Dataset registration, omics-compatibility checks
Batch and cell-type metadata	Metric eligibility gating and clear warnings
Model choice and hyperparameters	Container-isolated, parallel execution with seed enforcement
Comparing embeddings and metrics	Results tables, artifacts, MLflow tracking, Optuna sweeps
Writing a reproducible Methods section	`run_manifest.yaml`, `job_spec.json`, metrics, logs, provenance
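
As an illustration of the "metric eligibility gating and clear warnings" row, the idea can be sketched in a few lines of Python. This is a minimal sketch, not Multiverse's actual implementation: the metric names, the `batch` and `cell_type` column names, and the `eligible_metrics` helper are all hypothetical and chosen only to show the pattern.

```python
import warnings

# Hypothetical mapping: metric name -> metadata columns it needs.
# Neither the metric names nor the column names are taken from Multiverse.
REQUIRED_COLUMNS = {
    "batch_mixing": {"batch"},
    "label_conservation": {"cell_type"},
    "batch_aware_label_score": {"batch", "cell_type"},
}


def eligible_metrics(obs_columns):
    """Return the metrics whose required columns are all present.

    Metrics with missing metadata are skipped, and a warning names
    the columns that would have to be added to enable them.
    """
    present = set(obs_columns)
    eligible = []
    for metric, needed in sorted(REQUIRED_COLUMNS.items()):
        missing = needed - present
        if missing:
            warnings.warn(
                f"Skipping {metric}: missing metadata column(s) {sorted(missing)}"
            )
        else:
            eligible.append(metric)
    return eligible
```

Under these assumptions, a dataset annotated with cell types but no batch column would keep only the label-based metric and produce one warning for each skipped metric, rather than failing midway through a run.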

Quick Start¶

Prerequisites: Python 3.12+, uv, Docker with Compose v2.

make bootstrap      # install dev deps, create SQLite registry, register built-in models
make services-up    # optional: start MLflow (:25000) and Optuna Dashboard (:28080)
make setup          # optional: install GUI/local-runner extras
make gui            # launch the Streamlit GUI (:28501)

Open http://localhost:28501 and follow the Getting Started tutorial. For headless use, run `uv run multiverse --help` and `uv run multiverse run --manifest run_manifest.yaml --output store/artifacts/run_output`.
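
To launch the headless run from a Python script instead of a shell, the command above can be wrapped with `subprocess`. This is a minimal sketch: the `--manifest` and `--output` flags are the ones quoted above, while the helper names `build_run_command` and `run_benchmark` are hypothetical and not part of Multiverse.

```python
import shlex
import subprocess


def build_run_command(manifest, output_dir):
    """Assemble the headless invocation as an argv list (no shell involved)."""
    return [
        "uv", "run", "multiverse", "run",
        "--manifest", str(manifest),
        "--output", str(output_dir),
    ]


def run_benchmark(manifest, output_dir):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    cmd = build_run_command(manifest, output_dir)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_benchmark("run_manifest.yaml", "store/artifacts/run_output")
```

Passing the arguments as a list avoids shell quoting issues if a manifest or output path contains spaces.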

Documentation Map (Diátaxis)¶

Type	Page	Audience
Tutorial	Getting Started	Bio
How-to	Data Preparation	Bio
How-to	Data Registration	Bio
How-to	Benchmarking	Bio
How-to	Adding a Model	Ops
Reference	Models Glossary	Bio
Reference	Evaluation Metrics	Bio
Reference	GUI	Bio / Ops
Reference	Run Manifest	Bio / Ops
Reference	Model Container Contract	Ops
Reference	Model Registration	Ops
Reference	Runner	Ops
Explanation	Architecture	Ops
Explanation	Observability	Ops
Explanation	Developer Guide	Ops
Process	Contributing	All

License¶

Distributed under the MIT License. See LICENSE for details.