
multiverse

Reproducible benchmarking for multimodal single-cell integration.

Multiverse is an MLOps platform for academic single-cell integration studies. It sits between two notebook sessions, the one in which you curate a dataset and the one in which you interpret the resulting embeddings, and replaces the brittle scripts in between with a registry, a containerized runner, and a Streamlit interface. The goal is to make a benchmark easier to run and the resulting Methods protocol easier to write.

Two Audiences, One Platform

The documentation is organized around two distinct audiences. Some pages are useful to both; where that is the case, terminology is introduced gently.

For Bioinformaticians

You work in AnnData, MuData, Scanpy, and Jupyter. You want to compare integration models on your data without becoming a Docker user.

Start at Getting Started for an end-to-end walkthrough, then consult Data Preparation for recipes. The Models Glossary and Evaluation Metrics reference pages describe what each model assumes and what each metric measures.

For MLOps and Platform Engineers

You will be running, deploying, or extending the platform. You may have no biological background, and that is fine! The system is a typed Python application built around a SQLite registry, a Docker-based runner, MLflow, Optuna, and a Streamlit front end.

Start at Architecture for the system map, then read Runner & Orchestration for the execution model and Model Container Contract for the I/O boundary. Adding a Model and the Developer Guide cover extension work.

What Multiverse Does

| Researcher concern | Platform responsibility |
| --- | --- |
| Biological question and dataset curation | Dataset registration, omics-compatibility checks |
| Batch and cell-type metadata | Metric eligibility gating and clear warnings |
| Model choice and hyperparameters | Container-isolated, parallel execution with seed enforcement |
| Comparing embeddings and metrics | Results tables, artifacts, MLflow tracking, Optuna sweeps |
| Writing a reproducible Methods section | run_manifest.yaml, job_spec.json, metrics, logs, provenance |

Quick Start

Prerequisites: Python 3.12+, uv, Docker with Compose v2.
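
The prerequisites above can be checked before running any make target. The following is a minimal sketch, not part of Multiverse itself: the function names are hypothetical, and it assumes that a successful `docker compose version` call is a good enough signal that the Compose v2 plugin is installed.

```python
import shutil
import subprocess
import sys


def python_ok(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.12 or newer."""
    return tuple(version_info[:2]) >= (3, 12)


def tool_on_path(name: str) -> bool:
    """Return True when an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def compose_v2_available() -> bool:
    """Return True when `docker compose version` exits successfully.

    Assumption: Compose v2 is exposed as the `docker compose` subcommand,
    so the legacy standalone `docker-compose` binary is not probed.
    """
    if not tool_on_path("docker"):
        return False
    try:
        result = subprocess.run(
            ["docker", "compose", "version"],
            capture_output=True,
            text=True,
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def check_prerequisites() -> dict[str, bool]:
    """Collect one pass/fail flag per prerequisite listed in Quick Start."""
    return {
        "python>=3.12": python_ok(),
        "uv": tool_on_path("uv"),
        "docker compose v2": compose_v2_available(),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok' if ok else 'MISSING':8s}{name}")
```

Run it with the interpreter you intend to use for the project, since the Python version check inspects the running interpreter rather than whatever uv would select.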

make bootstrap      # install dev deps, create SQLite registry, register built-in models
make services-up    # optional: start MLflow (:25000) and Optuna Dashboard (:28080)
make setup          # optional: install GUI/local-runner extras
make gui            # launch the Streamlit GUI (:28501)
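
After the targets above have run, it can be useful to confirm that something is actually listening on each port. The sketch below only tests TCP reachability, using the port numbers from the comments above; the `localhost` host and the helper names are assumptions, and an open port does not prove that the expected service is the one answering.

```python
import socket

# Port numbers come from the Quick Start comments; the labels are just display names.
SERVICES = {
    "MLflow": 25000,
    "Optuna Dashboard": 28080,
    "Streamlit GUI": 28501,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True when a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def service_status(host: str = "localhost") -> dict[str, bool]:
    """Map each service label to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in service_status().items():
        print(f"{name}: {'listening' if up else 'not reachable'}")
```

MLflow and the Optuna Dashboard are marked optional above, so "not reachable" for those two is expected when `make services-up` was skipped.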

Open http://localhost:28501 and follow the Getting Started tutorial. For headless use, start from the CLI help, then run a manifest:

uv run multiverse --help
uv run multiverse run --manifest run_manifest.yaml --output store/artifacts/run_output
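
For scripted or scheduled runs, the headless invocation above can be wrapped in a few lines of Python. This is a sketch under stated assumptions: `build_run_command` and `run_headless` are hypothetical helpers, not part of the Multiverse package, and the only CLI flags used are the `--manifest` and `--output` flags shown above.

```python
import subprocess
from pathlib import Path


def build_run_command(manifest, output) -> list[str]:
    """Build the argv list for the headless run shown in Quick Start."""
    return [
        "uv", "run", "multiverse", "run",
        "--manifest", str(manifest),
        "--output", str(output),
    ]


def run_headless(manifest, output) -> int:
    """Run one manifest and return the CLI's exit code.

    The manifest path is checked up front so a typo fails fast here,
    before uv or the CLI is started.
    """
    manifest = Path(manifest)
    if not manifest.is_file():
        raise FileNotFoundError(f"manifest not found: {manifest}")
    completed = subprocess.run(build_run_command(manifest, output), check=False)
    return completed.returncode


if __name__ == "__main__":
    code = run_headless("run_manifest.yaml", "store/artifacts/run_output")
    raise SystemExit(code)
```

Passing the command as a list rather than a shell string avoids quoting problems when manifest or output paths contain spaces.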

Documentation Map (Diátaxis)

| Type | Page | Audience |
| --- | --- | --- |
| Tutorial | Getting Started | Bio |
| How-to | Data Preparation | Bio |
| How-to | Data Registration | Bio |
| How-to | Benchmarking | Bio |
| How-to | Adding a Model | Ops |
| Reference | Models Glossary | Bio |
| Reference | Evaluation Metrics | Bio |
| Reference | GUI | Bio / Ops |
| Reference | Run Manifest | Bio / Ops |
| Reference | Model Container Contract | Ops |
| Reference | Model Registration | Ops |
| Reference | Runner | Ops |
| Explanation | Architecture | Ops |
| Explanation | Observability | Ops |
| Explanation | Developer Guide | Ops |
| Process | Contributing | All |

License

Distributed under the MIT License. See LICENSE for details.