Streamlit GUI

The Streamlit app at multiverse/gui.py is the primary researcher interface. It exposes registration, benchmark planning, execution, results browsing, and analysis dashboards.

Layout

| Tab | Query parameter | Purpose |
| --- | --- | --- |
| Registry | ?tab=registry | Inspect, register, and refresh datasets and models. |
| Configure | ?tab=configure | Pick dataset x model pairs and set hyperparameters. |
| Run | ?tab=run | Submit and monitor benchmark execution. |
| Results | ?tab=results | Browse completed runs, metrics, logs, and artifacts. |
| Analysis | ?tab=analysis | Embedded MLflow and Optuna views for cross-run analysis. |
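
The query parameters above suggest that a tab can be selected by URL. The following is a minimal sketch of how such a lookup might work; `TABS`, `DEFAULT_TAB`, and `resolve_tab` are hypothetical names, and the fallback to the Registry tab is an assumption rather than documented behavior of multiverse/gui.py.

```python
# Hypothetical sketch: these names are illustrative, not taken from multiverse/gui.py.
TABS = ("registry", "configure", "run", "results", "analysis")
DEFAULT_TAB = "registry"  # assumption: fall back to the first tab


def resolve_tab(query_params):
    """Return the tab key named by ?tab=..., or the default if missing or unknown."""
    raw = query_params.get("tab", DEFAULT_TAB)
    # Some query-parameter mappings return a list of values; take the first one.
    if isinstance(raw, (list, tuple)):
        raw = raw[0] if raw else DEFAULT_TAB
    key = str(raw).strip().lower()
    return key if key in TABS else DEFAULT_TAB
```

In a Streamlit app, a dict-like object such as `st.query_params` could be passed to this function; a plain `dict` works the same way for testing.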

Registry

The Registry tab creates and refreshes dataset/model rows in the local SQLite index. Registration paths are hardened: path escapes are rejected, and elevated Docker options in model manifests require explicit opt-in.
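
To illustrate what rejecting a path escape can look like, here is a minimal sketch using only the standard library. The function name `resolve_inside` and the idea of a single allowed root directory are assumptions made for illustration; the actual hardening in the registry code may differ.

```python
from pathlib import Path


def resolve_inside(root, candidate):
    """Resolve candidate against root and refuse anything that lands outside root.

    Hypothetical helper: the real registry may apply different or additional rules.
    """
    root_resolved = Path(root).resolve()
    # Joining with an absolute candidate discards root, so the check below
    # also covers absolute paths that point elsewhere.
    target = (root_resolved / candidate).resolve()
    if not target.is_relative_to(root_resolved):  # requires Python 3.9+
        raise ValueError(f"path escapes registration root: {candidate!r}")
    return target
```

Resolving both paths before comparing them means `..` segments and symlinks are normalized first, so a string such as `../outside.txt` cannot pass a naive prefix check.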

Configure

Configure builds run_manifest.yaml from selected compatible dataset/model pairs. The compatibility matrix is computed from registered dataset omics and model requirements. Hyperparameter forms are rendered from model JSON schemas.
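
As a rough illustration of the compatibility matrix, the sketch below marks a pair as compatible when every omics layer the model requires is present in the dataset. That subset rule, the function name, and the example omics labels are assumptions; the document states only that the matrix is computed from dataset omics and model requirements.

```python
def compatibility_matrix(datasets, models):
    """Map each (dataset, model) pair to True when the pair is compatible.

    datasets: dataset name -> set of omics layers the dataset provides
    models:   model name   -> set of omics layers the model requires
    Assumed rule: compatible when required omics are a subset of provided omics.
    """
    return {
        (dataset, model): required <= provided
        for dataset, provided in datasets.items()
        for model, required in models.items()
    }


def compatible_pairs(datasets, models):
    """Return the compatible (dataset, model) pairs in sorted order."""
    matrix = compatibility_matrix(datasets, models)
    return sorted(pair for pair, ok in matrix.items() if ok)
```

For example, with a dataset that provides `{"rna", "atac"}` and another that provides only `{"rna"}`, a model requiring both layers would be offered for the first dataset only.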

Run

The Run tab uses the in-process mvd controller. It does not spawn multiverse.runner.cli as a subprocess and does not own Docker containers directly.

  • Launch Run parses the manifest, submits jobs through the kernel/client boundary, and records mvd attempt IDs.
  • Cancel Run calls the kernel cancellation verb; it does not kill a local process handle.
  • The status table renders kernel states such as RUNNING, PROMOTING, ARTIFACT_SUCCESS, FAILED, and CANCELLED.
  • The event panel shows state transitions observed from kernel queries. Container logs are kept as artifact files once the run produces or preserves a workspace.
  • Evaluate Experiment resolves the latest launch cohort, keeps only members whose readiness is ready, builds or reuses the multiverse-evaluate Docker image, and runs evaluation in that container. The GUI remains a thin host process and does not import muon, scanpy, or scib-metrics.
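
The readiness filter described in the last bullet can be sketched as follows. The member record shape (a dict with `member_id` and `readiness` keys) and both function names are hypothetical; only the readiness value `ready` comes from the text.

```python
def ready_members(cohort):
    """Return the cohort members whose readiness is exactly "ready".

    Assumed shape: each member is a dict with "member_id" and "readiness" keys.
    """
    return [member for member in cohort if member.get("readiness") == "ready"]


def evaluate_disabled(cohort):
    """The Evaluate action has nothing to run when no member is ready."""
    return not ready_members(cohort)
```

Keeping the filter as a pure function makes it easy to show not-ready members in the same table as evaluated ones, since the full cohort is never mutated.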

Evaluation writes launch-scoped artifacts under <output-dir>/.multiverse/launches/<launch_id>/: eval_config.json, one evaluations/<member_id>.json file per evaluated member, evaluation_report.json, and scIB plots under plots/dataset_<dataset_slug>/. The GUI rebuilds the report from the full cohort plus live readiness so not-ready members appear in the same comparison table as evaluated members.
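
The launch-scoped layout above can be expressed as a small path builder. This is a sketch for orientation only: `launch_paths` is a hypothetical helper, and only the directory and file names are taken from the description.

```python
from pathlib import Path


def launch_paths(output_dir, launch_id, member_ids, dataset_slugs):
    """Build the expected evaluation artifact paths for one launch.

    Hypothetical helper: it only assembles paths and does not touch the filesystem.
    """
    base = Path(output_dir) / ".multiverse" / "launches" / launch_id
    return {
        "eval_config": base / "eval_config.json",
        "evaluation_report": base / "evaluation_report.json",
        # One JSON file per evaluated member.
        "evaluations": {m: base / "evaluations" / f"{m}.json" for m in member_ids},
        # scIB plots are grouped per dataset.
        "plots": {s: base / "plots" / f"dataset_{s}" for s in dataset_slugs},
    }
```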

Results

Results reads from SQLite for fast listing and from artifact directories for durable run evidence. A successful run is trustworthy only when its artifact manifest and checksum sidecar verify.
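
To make the verification rule concrete, the sketch below checks a manifest against a checksum sidecar and then checks each listed file against the manifest. The file names (`manifest.json`, `manifest.json.sha256`), the manifest shape (a `files` mapping of relative path to SHA-256 hex digest), and the choice of SHA-256 are all assumptions made for illustration; the real artifact format is not specified here.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_run(run_dir, manifest_name="manifest.json"):
    """Return True only if the sidecar matches the manifest and every listed file matches.

    Assumed layout: <manifest_name>.sha256 holds the manifest digest as its first token.
    """
    run_dir = Path(run_dir)
    manifest = run_dir / manifest_name
    sidecar = run_dir / (manifest_name + ".sha256")
    if not manifest.is_file() or not sidecar.is_file():
        return False
    tokens = sidecar.read_text().split()
    if not tokens or tokens[0].lower() != sha256_of(manifest):
        return False
    entries = json.loads(manifest.read_text())["files"]
    return all(
        (run_dir / rel).is_file() and sha256_of(run_dir / rel) == expected
        for rel, expected in entries.items()
    )
```

Checking the sidecar first means a tampered manifest is rejected before any of its entries are trusted.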

Analysis

MLflow and Optuna are comparison/projection surfaces. They are not the source of scientific truth. If MLflow is offline, a run can still reach ARTIFACT_SUCCESS; sync can be retried later with multiverse mlflow-sync.

Common Issues

| Symptom | Likely cause | What to do |
| --- | --- | --- |
| Dataset missing from Configure | Registry cache stale. | Open Registry and refresh. |
| Launch fails before container start | Manifest references stale rows or Docker is unavailable. | Regenerate the manifest or run multiverse doctor. |
| Run is FAILED | Container exit, validation refusal, or Docker error. | Inspect the event panel, failure_reason, and preserved logs. |
| Evaluation button is disabled | No cohort member has readiness ready. | Wait for ARTIFACT_SUCCESS, inspect readiness reasons, or fix missing artifacts/datasets. |
| Evaluation row is evaluation_failed | Dataset preprocessing or scIB failed for that member. | Open .multiverse/launches/<launch_id>/evaluations/<member_id>.json for the structured reason. |
| MLflow panel is empty | Projection service is offline. | Start services or sync later; artifact bundles remain authoritative. |
| SQLite listing looks stale | Index drift. | Run multiverse rebuild-index. |

Telemetry

multiverse/gui_telemetry.py records anonymous usage counters into a local file. Telemetry is opt-in and disabled by default. There is no remote endpoint.
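
An opt-in local counter of this kind might look like the sketch below. The JSON file format, the `record_event` name, and the `enabled` flag are hypothetical and are not taken from multiverse/gui_telemetry.py; the sketch only mirrors the stated properties: off by default, local file, no network calls.

```python
import json
from pathlib import Path


def record_event(counter_file, event, enabled=False):
    """Increment a named counter in a local JSON file, but only when opted in.

    Hypothetical sketch: returns True if the counter was written, False otherwise.
    """
    if not enabled:  # disabled by default: nothing is read or written
        return False
    path = Path(counter_file)
    counts = json.loads(path.read_text()) if path.exists() else {}
    counts[event] = counts.get(event, 0) + 1
    path.write_text(json.dumps(counts, sort_keys=True))
    return True
```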