Command Line Interface

vcp

VCP CLI - A command-line interface (CLI) to the Chan Zuckerberg Initiative’s Virtual Cells Platform (VCP)

vcp [OPTIONS] COMMAND [ARGS]...

benchmarks

View and run benchmarks available on the Virtual Cells Platform

vcp benchmarks [OPTIONS] COMMAND [ARGS]...

get

Fetch and display benchmark results with metrics.

Shows benchmarks from the VCP as well as locally cached results from previous user benchmark runs. Use filters to select by model, dataset, or task. Results include detailed performance metrics for each benchmark.

vcp benchmarks get [OPTIONS]

Options

-b, --benchmark-key <benchmark_key>: Retrieve by benchmark key (exact match). Mutually-exclusive with filter options.

-m, --model-filter <model_filter>: Filter by model key (substring match with ‘*’ wildcards, e.g. ‘scvi*v1’)

-d, --dataset-filter <dataset_filter>: Filter by dataset key (substring match with ‘*’ wildcards, e.g. ‘tsv2*liver’)

-t, --task-filter <task_filter>: Filter by task key (substring match with ‘*’ wildcards, e.g. ‘label*pred’)

-f, --format <format>

Output format

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--user-runs, --no-user-runs: Include or exclude locally cached results from previous user benchmark runs (default: include).
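
The filter options above are described as substring matches with ‘*’ wildcards. One plausible reading, sketched below with Python's fnmatch module, is that the pattern may match anywhere inside the key. This is an assumption about the matching rule, not the CLI's confirmed implementation; the helper matches_filter, the case-insensitive comparison, and the sample keys are all hypothetical.

```python
from fnmatch import fnmatchcase


def matches_filter(key: str, pattern: str) -> bool:
    """Return True if `pattern` (with '*' wildcards) matches anywhere in `key`.

    Assumptions: a "substring match" means the pattern is implicitly wrapped
    in '*' on both sides, and matching ignores case. The real CLI may differ.
    """
    return fnmatchcase(key.lower(), f"*{pattern.lower()}*")


# Hypothetical model keys, for illustration only.
keys = ["SCVI-v1-homo_sapiens", "SCVI-v1-mus_musculus", "other-model-v2"]
selected = [k for k in keys if matches_filter(k, "scvi*v1")]
print(selected)
```

Under these assumptions, a pattern such as ‘scvi*v1’ selects every key that contains "scvi" followed later by "v1".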

list

List available model, dataset and task benchmark combinations.

Shows benchmarks from the VCP as well as locally cached results from previous user benchmark runs. You can filter results by dataset, model, or task using glob patterns.

vcp benchmarks list [OPTIONS]

Options

-b, --benchmark-key <benchmark_key>: Retrieve by benchmark key. Mutually-exclusive with filter options.

-m, --model-filter <model_filter>: Filter by model key (substring match with ‘*’ wildcards, e.g. ‘scvi*v1’).

-d, --dataset-filter <dataset_filter>: Filter by dataset key (substring match with ‘*’ wildcards, e.g. ‘tsv2*liver’).

-t, --task-filter <task_filter>: Filter by task key (substring match with ‘*’ wildcards, e.g. ‘label*pred’).

-f, --format <format>

Output format

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--user-runs, --no-user-runs: Include or exclude locally cached results from previous user benchmark runs (default: include).

run

Run a benchmark task on a cell representation, which can be provided in one of the following ways: 1) generate a cell representation by performing model inference on a specified dataset using a specified model, 2) use a previously computed cell representation (skips model inference), or 3) have the task generate a baseline cell representation that is computed from a specified dataset.

Use vcp benchmarks run <task> --help to see all available options for that task.

vcp benchmarks run [OPTIONS] COMMAND [ARGS]...

Options

-b, --benchmark-key <benchmark_key>: Run a benchmark using the model, dataset, and task of a VCP-published benchmark (run vcp benchmarks list for available benchmark keys).

batch_integration

Task for evaluating batch integration quality.

This task computes metrics to assess how well different batches are integrated in the embedding space while preserving biological signals.

Specify one of --model-key, --cell-representation, or --compute-baseline to generate or provide the benchmarked cell representation to the task.

Specify one of --dataset-key or --user-dataset to specify the associated dataset file(s) that contain ground truth data needed by the task for evaluation. These dataset options may be specified multiple times for multi-dataset tasks.

If --model-key is specified, dataset(s) will provide the input data to the model. If --compute-baseline is specified, dataset(s) will be used to compute a baseline cell representation. If --cell-representation is specified, a dataset is only used if task-specific option arguments reference ground truth data within the dataset.

vcp benchmarks run batch_integration [OPTIONS]

Options

-m, --model-key <model_key>: Model key (e.g. SCVI-v1-homo_sapiens; run vcp benchmarks list for available model keys).

-d, --dataset-key <dataset_key>: Dataset key from czbenchmarks datasets (e.g., tsv2_blood; run czbenchmarks list datasets for available dataset keys). Can be used multiple times.

-u, --user-dataset <user_dataset>: Path to a user-provided .h5ad file. Provide as a JSON string with keys: ‘dataset_class’, ‘organism’, and ‘path’. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.

-c, --cell-representation <cell_representation>: Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.

-B, --compute-baseline: Compute baseline for comparison. Cannot be used with --model-key or --cell-representation.

-r, --random-seed <random_seed>: Set a random seed for reproducibility.

-n, --no-cache: Disable caching. Forces all steps to run from scratch.

-f, --format <format>

Output format (default: table).

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--use-gpu, --no-use-gpu: Enable GPU support for model inference (default: enabled).

--batch-labels <batch_labels>: Batch labels for each cell (e.g. obs.batch from an AnnData object). Supports AnnData reference syntax (e.g. ‘@obs:batch’).

--labels <labels>: Ground truth labels for metric calculation (e.g. obs.cell_type from an AnnData object). Supports AnnData reference syntax (e.g. ‘@obs:cell_type’).

--baseline-n-top-genes <baseline_n_top_genes>: Number of highly variable genes for PCA baseline. [Default: 3000] [Required: False]

--baseline-n-pcs <baseline_n_pcs>: Number of principal components for PCA baseline. [Default: 50] [Required: False]

--baseline-obsm-key <baseline_obsm_key>: AnnData .obsm key to store the baseline PCA embedding. [Default: emb] [Required: False]
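
Because --user-dataset takes a JSON string, hand-quoting it on the command line is error-prone. A minimal sketch, assuming you prefer to generate the argument programmatically: json.dumps produces valid JSON and shlex.join adds the shell quoting. The file path and the obs column names (batch, cell_type) are the illustrative values used in this reference and may not exist in your data; the command is only printed, not executed.

```python
import json
import shlex

# JSON payload with the three keys described for --user-dataset.
user_dataset = json.dumps(
    {
        "dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset",
        "organism": "HUMAN",
        "path": "~/mydata.h5ad",
    }
)

argv = [
    "vcp", "benchmarks", "run", "batch_integration",
    "--compute-baseline",
    "--user-dataset", user_dataset,
    "--batch-labels", "@obs:batch",   # AnnData reference syntax
    "--labels", "@obs:cell_type",
]

# Print a copy-pasteable command line with the JSON safely quoted.
print(shlex.join(argv))
```

Note that a shell does not expand ‘~’ inside a quoted JSON string, so the path reaches the CLI exactly as written.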

clustering

Task for evaluating clustering performance against ground truth labels.

This task performs clustering on embeddings and evaluates the results using multiple clustering metrics (ARI and NMI).

Specify one of --model-key, --cell-representation, or --compute-baseline to generate or provide the benchmarked cell representation to the task.

Specify one of --dataset-key or --user-dataset to specify the associated dataset file(s) that contain ground truth data needed by the task for evaluation. These dataset options may be specified multiple times for multi-dataset tasks.

If --model-key is specified, dataset(s) will provide the input data to the model. If --compute-baseline is specified, dataset(s) will be used to compute a baseline cell representation. If --cell-representation is specified, a dataset is only used if task-specific option arguments reference ground truth data within the dataset.

vcp benchmarks run clustering [OPTIONS]

Options

-m, --model-key <model_key>: Model key (e.g. SCVI-v1-homo_sapiens; run vcp benchmarks list for available model keys).

-d, --dataset-key <dataset_key>: Dataset key from czbenchmarks datasets (e.g., tsv2_blood; run czbenchmarks list datasets for available dataset keys). Can be used multiple times.

-u, --user-dataset <user_dataset>: Path to a user-provided .h5ad file. Provide as a JSON string with keys: ‘dataset_class’, ‘organism’, and ‘path’. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.

-c, --cell-representation <cell_representation>: Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.

-B, --compute-baseline: Compute baseline for comparison. Cannot be used with --model-key or --cell-representation.

-r, --random-seed <random_seed>: Set a random seed for reproducibility.

-n, --no-cache: Disable caching. Forces all steps to run from scratch.

-f, --format <format>

Output format (default: table).

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--use-gpu, --no-use-gpu: Enable GPU support for model inference (default: enabled).

--obs <obs>: Cell metadata DataFrame (e.g. the obs from an AnnData object). [Default: None] [Required: True] Supports AnnData reference syntax (e.g. ‘@obs’).

--input-labels <input_labels>: Ground truth labels for metric calculation (e.g. obs.cell_type from an AnnData object). Supports AnnData reference syntax (e.g. ‘@obs:cell_type’).

--use-rep <use_rep>: Data representation to use for clustering (e.g. the ‘X’ or obsm[‘X_pca’] from an AnnData object). [Default: X] [Required: False] Supports AnnData reference syntax (e.g. ‘X’).

--n-iterations <n_iterations>: Number of iterations for the Leiden algorithm. [Default: 2] [Required: False]

--flavor <flavor>

Algorithm for Leiden community detection. [Default: igraph] [Required: False]

Options: leidenalg | igraph

--key-added <key_added>: Key in AnnData.obs where cluster assignments are stored. [Default: leiden] [Required: False]

--baseline-n-top-genes <baseline_n_top_genes>: Number of highly variable genes for PCA baseline. [Default: 3000] [Required: False]

--baseline-n-pcs <baseline_n_pcs>: Number of principal components for PCA baseline. [Default: 50] [Required: False]

--baseline-obsm-key <baseline_obsm_key>: AnnData .obsm key to store the baseline PCA embedding. [Default: emb] [Required: False]

cross-species_integration

Task for evaluating cross-species integration quality.

This task computes metrics to assess how well different species’ data are integrated in the embedding space while preserving biological signals. It operates on multiple datasets from different species.

Specify one of --model-key, --cell-representation, or --compute-baseline to generate or provide the benchmarked cell representation to the task.

Specify one of --dataset-key or --user-dataset to specify the associated dataset file(s) that contain ground truth data needed by the task for evaluation. These dataset options may be specified multiple times for multi-dataset tasks.

If --model-key is specified, dataset(s) will provide the input data to the model. If --compute-baseline is specified, dataset(s) will be used to compute a baseline cell representation. If --cell-representation is specified, a dataset is only used if task-specific option arguments reference ground truth data within the dataset.

vcp benchmarks run cross-species_integration [OPTIONS]

Options

-m, --model-key <model_key>: Model key (e.g. SCVI-v1-homo_sapiens; run vcp benchmarks list for available model keys).

-d, --dataset-key <dataset_key>: Dataset key from czbenchmarks datasets (e.g., tsv2_blood; run czbenchmarks list datasets for available dataset keys). Can be used multiple times.

-u, --user-dataset <user_dataset>: Path to a user-provided .h5ad file. Provide as a JSON string with keys: ‘dataset_class’, ‘organism’, and ‘path’. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.

-c, --cell-representation <cell_representation>: Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.

-B, --compute-baseline: Compute baseline for comparison. Cannot be used with --model-key or --cell-representation.

-r, --random-seed <random_seed>: Set a random seed for reproducibility.

-n, --no-cache: Disable caching. Forces all steps to run from scratch.

-f, --format <format>

Output format (default: table).

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--use-gpu, --no-use-gpu: Enable GPU support for model inference (default: enabled).

--labels <labels>: List of ground truth labels for each species dataset (e.g., cell types). Can be specified multiple times. Supports AnnData reference syntax (e.g. ‘@obs:cell_type’).

--organism-list <organism_list>: List of organisms corresponding to each dataset for cross-species evaluation. Can be specified multiple times.

cross-species_label_prediction

Task for cross-species label prediction evaluation.

This task evaluates cross-species transfer by training classifiers on one species and testing on another species. It computes accuracy, F1, precision, recall, and AUROC for multiple classifiers (Logistic Regression, KNN, Random Forest).

The task can optionally aggregate cell-level embeddings to sample/donor level before running classification.

Specify one of --model-key, --cell-representation, or --compute-baseline to generate or provide the benchmarked cell representation to the task.

Specify one of --dataset-key or --user-dataset to specify the associated dataset file(s) that contain ground truth data needed by the task for evaluation. These dataset options may be specified multiple times for multi-dataset tasks.

If --model-key is specified, dataset(s) will provide the input data to the model. If --compute-baseline is specified, dataset(s) will be used to compute a baseline cell representation. If --cell-representation is specified, a dataset is only used if task-specific option arguments reference ground truth data within the dataset.

vcp benchmarks run cross-species_label_prediction [OPTIONS]

Options

-m, --model-key <model_key>: Model key (e.g. SCVI-v1-homo_sapiens; run vcp benchmarks list for available model keys).

-d, --dataset-key <dataset_key>: Dataset key from czbenchmarks datasets (e.g., tsv2_blood; run czbenchmarks list datasets for available dataset keys). Can be used multiple times.

-u, --user-dataset <user_dataset>: Path to a user-provided .h5ad file. Provide as a JSON string with keys: ‘dataset_class’, ‘organism’, and ‘path’. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.

-c, --cell-representation <cell_representation>: Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.

-B, --compute-baseline: Compute baseline for comparison. Cannot be used with --model-key or --cell-representation.

-r, --random-seed <random_seed>: Set a random seed for reproducibility.

-n, --no-cache: Disable caching. Forces all steps to run from scratch.

-f, --format <format>

Output format (default: table).

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--use-gpu, --no-use-gpu: Enable GPU support for model inference (default: enabled).

--labels <labels>: List of ground truth labels for each species dataset (e.g., cell types). Can be specified multiple times. Supports AnnData reference syntax (e.g. ‘@obs:cell_type’).

--organisms <organisms>: List of organisms corresponding to each dataset for cross-species evaluation. Can be specified multiple times.

--sample-ids <sample_ids>: Optional list of sample/donor IDs for aggregation, one per dataset. Can be specified multiple times. Supports AnnData reference syntax (e.g. ‘None’).

--aggregation-method <aggregation_method>

Method to aggregate cells with the same sample_id. [Default: mean] [Required: False]

Options: none | mean | median

--n-folds <n_folds>: Number of cross-validation folds for intra-species evaluation. [Default: 5] [Required: False]

embedding

Task for evaluating cell representation quality using labeled data.

This task computes quality metrics for cell representations using ground truth labels. Currently supports silhouette score evaluation.

Specify one of --model-key, --cell-representation, or --compute-baseline to generate or provide the benchmarked cell representation to the task.

Specify one of --dataset-key or --user-dataset to specify the associated dataset file(s) that contain ground truth data needed by the task for evaluation. These dataset options may be specified multiple times for multi-dataset tasks.

If --model-key is specified, dataset(s) will provide the input data to the model. If --compute-baseline is specified, dataset(s) will be used to compute a baseline cell representation. If --cell-representation is specified, a dataset is only used if task-specific option arguments reference ground truth data within the dataset.

vcp benchmarks run embedding [OPTIONS]

Options

-m, --model-key <model_key>: Model key (e.g. SCVI-v1-homo_sapiens; run vcp benchmarks list for available model keys).

-d, --dataset-key <dataset_key>: Dataset key from czbenchmarks datasets (e.g., tsv2_blood; run czbenchmarks list datasets for available dataset keys). Can be used multiple times.

-u, --user-dataset <user_dataset>: Path to a user-provided .h5ad file. Provide as a JSON string with keys: ‘dataset_class’, ‘organism’, and ‘path’. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.

-c, --cell-representation <cell_representation>: Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.

-B, --compute-baseline: Compute baseline for comparison. Cannot be used with --model-key or --cell-representation.

-r, --random-seed <random_seed>: Set a random seed for reproducibility.

-n, --no-cache: Disable caching. Forces all steps to run from scratch.

-f, --format <format>

Output format (default: table).

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--use-gpu, --no-use-gpu: Enable GPU support for model inference (default: enabled).

--input-labels <input_labels>: Ground truth labels for metric calculation (e.g. obs.cell_type from an AnnData object). Supports AnnData reference syntax (e.g. ‘@obs:cell_type’).

--baseline-n-top-genes <baseline_n_top_genes>: Number of highly variable genes for PCA baseline. [Default: 3000] [Required: False]

--baseline-n-pcs <baseline_n_pcs>: Number of principal components for PCA baseline. [Default: 50] [Required: False]

--baseline-obsm-key <baseline_obsm_key>: AnnData .obsm key to store the baseline PCA embedding. [Default: emb] [Required: False]

label_prediction

Task for predicting labels from embeddings using cross-validation.

Evaluates multiple classifiers (Logistic Regression, KNN) using k-fold cross-validation. Reports standard classification metrics.

Specify one of --model-key, --cell-representation, or --compute-baseline to generate or provide the benchmarked cell representation to the task.

Specify one of --dataset-key or --user-dataset to specify the associated dataset file(s) that contain ground truth data needed by the task for evaluation. These dataset options may be specified multiple times for multi-dataset tasks.

If --model-key is specified, dataset(s) will provide the input data to the model. If --compute-baseline is specified, dataset(s) will be used to compute a baseline cell representation. If --cell-representation is specified, a dataset is only used if task-specific option arguments reference ground truth data within the dataset.

vcp benchmarks run label_prediction [OPTIONS]

Options

-m, --model-key <model_key>: Model key (e.g. SCVI-v1-homo_sapiens; run vcp benchmarks list for available model keys).

-d, --dataset-key <dataset_key>: Dataset key from czbenchmarks datasets (e.g., tsv2_blood; run czbenchmarks list datasets for available dataset keys). Can be used multiple times.

-u, --user-dataset <user_dataset>: Path to a user-provided .h5ad file. Provide as a JSON string with keys: ‘dataset_class’, ‘organism’, and ‘path’. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.

-c, --cell-representation <cell_representation>: Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.

-B, --compute-baseline: Compute baseline for comparison. Cannot be used with --model-key or --cell-representation.

-r, --random-seed <random_seed>: Set a random seed for reproducibility.

-n, --no-cache: Disable caching. Forces all steps to run from scratch.

-f, --format <format>

Output format (default: table).

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--use-gpu, --no-use-gpu: Enable GPU support for model inference (default: enabled).

--labels <labels>: Ground truth labels for prediction (e.g. obs.cell_type from an AnnData object) Supports AnnData reference syntax (e.g. ‘@obs:cell_type’).

--n-folds <n_folds>: Number of folds for stratified cross-validation. [Default: 5] [Required: False]

--min-class-size <min_class_size>: Minimum number of samples required for a class to be included in evaluation. [Default: 10] [Required: False]
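
The effect of --min-class-size can be pictured as a filter applied before cross-validation: cells whose label occurs fewer times than the threshold are left out of the evaluation. The sketch below is an illustration under that reading, not the task's actual code; the function filter_small_classes, the inclusive threshold, and the toy labels are assumptions.

```python
from collections import Counter


def filter_small_classes(labels, min_class_size=10):
    """Return indices of cells whose class has at least `min_class_size` members.

    Assumption: the threshold is inclusive (a class with exactly
    `min_class_size` cells is kept).
    """
    counts = Counter(labels)
    return [i for i, label in enumerate(labels) if counts[label] >= min_class_size]


# Toy labels: two classes meet the default threshold of 10, one does not.
labels = ["T cell"] * 12 + ["B cell"] * 10 + ["mast cell"] * 3
kept = filter_small_classes(labels)

print(len(kept), sorted({labels[i] for i in kept}))
```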

perturbation_expression_prediction

Task for evaluating perturbation-induced expression predictions against their ground truth values. This is done by calculating metrics derived from predicted and ground truth log fold change values for each condition. Currently, Spearman rank correlation is supported.

The following arguments are required and must be supplied by the task input class (PerturbationExpressionPredictionTaskInput) when running the task. These parameters are described below for documentation purposes:

predictions_adata (ad.AnnData): The AnnData object containing model predictions.
dataset_adata (ad.AnnData): The AnnData object from SingleCellPerturbationDataset.
pred_effect_operation (Literal["difference", "ratio"]): How to compute the predicted effect between treated and control mean predictions over genes. Default is "ratio".
- "ratio" uses $\log\left(\frac{\text{mean}(\text{treated}) + \varepsilon}{\text{mean}(\text{control}) + \varepsilon}\right)$ when means are positive.
- "difference" uses $\text{mean}(\text{treated}) - \text{mean}(\text{control})$ and is generally safe across scales (probabilities, z-scores, raw expression).
gene_index (Optional[pd.Index]): The index of the genes in the predictions AnnData.
cell_index (Optional[pd.Index]): The index of the cells in the predictions AnnData.
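
The two pred_effect_operation modes can be illustrated for a single gene with a few lines of Python. This is a minimal sketch of the formulas stated above, not the task's implementation: the function predicted_effect is hypothetical, the value of the stabilizing constant eps is an assumption (the source does not give it), and the logarithm is taken to be the natural log.

```python
import math


def predicted_effect(treated_mean, control_mean, operation="ratio", eps=1e-8):
    """Predicted effect for one gene from treated and control mean predictions.

    Assumptions: `eps` is an arbitrary small constant and `log` is the
    natural logarithm; the real task may use different values.
    """
    if operation == "ratio":
        # Log ratio of means; only meaningful when both means are positive.
        return math.log((treated_mean + eps) / (control_mean + eps))
    if operation == "difference":
        # Plain difference of means; works on any scale.
        return treated_mean - control_mean
    raise ValueError(f"unknown operation: {operation!r}")


# A gene whose mean predicted expression doubles under perturbation.
print(predicted_effect(4.0, 2.0, "ratio"))       # close to log(2)
print(predicted_effect(4.0, 2.0, "difference"))
```

The ratio mode reports relative change and the difference mode reports absolute change, which is why the difference is the safer choice when predictions can be zero or negative.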

Specify one of --model-key, --cell-representation, or --compute-baseline to generate or provide the benchmarked cell representation to the task.

Specify one of --dataset-key or --user-dataset to specify the associated dataset file(s) that contain ground truth data needed by the task for evaluation. These dataset options may be specified multiple times for multi-dataset tasks.

If --model-key is specified, dataset(s) will provide the input data to the model. If --compute-baseline is specified, dataset(s) will be used to compute a baseline cell representation. If --cell-representation is specified, a dataset is only used if task-specific option arguments reference ground truth data within the dataset.

vcp benchmarks run perturbation_expression_prediction [OPTIONS]

Options

-m, --model-key <model_key>: Model key (e.g. SCVI-v1-homo_sapiens; run vcp benchmarks list for available model keys).

-d, --dataset-key <dataset_key>: Dataset key from czbenchmarks datasets (e.g., tsv2_blood; run czbenchmarks list datasets for available dataset keys). Can be used multiple times.

-u, --user-dataset <user_dataset>: Path to a user-provided .h5ad file. Provide as a JSON string with keys: ‘dataset_class’, ‘organism’, and ‘path’. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.

-c, --cell-representation <cell_representation>: Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.

-B, --compute-baseline: Compute baseline for comparison. Cannot be used with --model-key or --cell-representation.

-r, --random-seed <random_seed>: Set a random seed for reproducibility.

-n, --no-cache: Disable caching. Forces all steps to run from scratch.

-f, --format <format>

Output format (default: table).

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--use-gpu, --no-use-gpu: Enable GPU support for model inference (default: enabled).

--adata <adata>: AnnData object from SingleCellPerturbationDataset containing perturbation data and metadata. [Default: None] [Required: True]

--pred-effect-operation <pred_effect_operation>

Method to compute predicted effect: ‘difference’ (mean(treated) - mean(control)) or ‘ratio’ (log ratio of means). [Default: ratio] [Required: False]

Options: difference | ratio

--gene-index <gene_index>: Optional gene index for predictions to align model predictions with dataset genes.

--cell-index <cell_index>: Optional cell index for predictions to align model predictions with dataset cells.

sequential_organization

Task for evaluating sequential consistency in embeddings.

This task computes sequential quality metrics for embeddings using time point labels. Evaluates how well embeddings preserve sequential organization between cells.

Specify one of --model-key, --cell-representation, or --compute-baseline to generate or provide the benchmarked cell representation to the task.

Specify one of --dataset-key or --user-dataset to specify the associated dataset file(s) that contain ground truth data needed by the task for evaluation. These dataset options may be specified multiple times for multi-dataset tasks.

If --model-key is specified, dataset(s) will provide the input data to the model. If --compute-baseline is specified, dataset(s) will be used to compute a baseline cell representation. If --cell-representation is specified, a dataset is only used if task-specific option arguments reference ground truth data within the dataset.

vcp benchmarks run sequential_organization [OPTIONS]

Options

-m, --model-key <model_key>: Model key (e.g. SCVI-v1-homo_sapiens; run vcp benchmarks list for available model keys).

-d, --dataset-key <dataset_key>: Dataset key from czbenchmarks datasets (e.g., tsv2_blood; run czbenchmarks list datasets for available dataset keys). Can be used multiple times.

-u, --user-dataset <user_dataset>: Path to a user-provided .h5ad file. Provide as a JSON string with keys: ‘dataset_class’, ‘organism’, and ‘path’. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.

-c, --cell-representation <cell_representation>: Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.

-B, --compute-baseline: Compute baseline for comparison. Cannot be used with --model-key or --cell-representation.

-r, --random-seed <random_seed>: Set a random seed for reproducibility.

-n, --no-cache: Disable caching. Forces all steps to run from scratch.

-f, --format <format>

Output format (default: table).

Options: table | json

--fit, --full: Column display for table format (default: fit). Use --full to show full column content; pair with a pager like ‘less -S’ for horizontal scrolling. Only applies to --format=table.

--use-gpu, --no-use-gpu: Enable GPU support for model inference (default: enabled).

--obs <obs>: Cell metadata DataFrame (e.g. the obs from an AnnData object). [Default: None] [Required: True] Supports AnnData reference syntax (e.g. ‘@obs’).

--input-labels <input_labels>: Ground truth labels for metric calculation (e.g. obs.cell_type from an AnnData object). Supports AnnData reference syntax (e.g. ‘@obs:cell_type’).

--k <k>: Number of nearest neighbors for k-NN based metrics. [Default: 15] [Required: False]

--normalize: Whether to normalize the embedding for k-NN based metrics.

--adaptive-k: Whether to use an adaptive number of nearest neighbors for k-NN based metrics.

--baseline-n-top-genes <baseline_n_top_genes>: Number of highly variable genes for PCA baseline. [Default: 3000] [Required: False]

--baseline-n-pcs <baseline_n_pcs>: Number of principal components for PCA baseline. [Default: 50] [Required: False]

--baseline-obsm-key <baseline_obsm_key>: AnnData .obsm key to store the baseline PCA embedding. [Default: emb] [Required: False]

system-check

Display system hardware information for running VCP benchmarks.

Shows current system specifications including RAM, GPUs, CUDA version, and Docker availability compared against baseline requirements.

vcp benchmarks system-check [OPTIONS]

Options

-v, --verbose: Display detailed NVIDIA GPU diagnostic information for troubleshooting.

cache

Manage vcp-cli cache and upload state.

The cache is used for embeddings, benchmark results, uploads (to allow resuming upon interruption), and new version checks.

Examples:

vcp cache info                    # Show cache information
vcp cache clear --type uploads    # Clear upload states
vcp cache history --limit 5       # Show last 5 uploads
vcp cache clean-uploads -m model1 -v 1.0  # Clean specific upload

vcp cache [OPTIONS] COMMAND [ARGS]...

clean-uploads

Clean specific upload states from cache.

vcp cache clean-uploads [OPTIONS]

Options

-m, --model <model>: Clean uploads for specific model

-v, --version <version>: Clean uploads for specific version

-f, --force: Skip confirmation prompt

clear

Clear cache data.

vcp cache clear [OPTIONS]

Options

-t, --type <type>

Type of cache to clear

Options: all | uploads | benchmarks | version

-f, --force: Skip confirmation prompt

history

Show upload history from cache.

vcp cache history [OPTIONS]

Options

-l, --limit <limit>: Number of recent uploads to show

-v, --verbose: Show detailed information

info

Show cache information and statistics.

vcp cache info [OPTIONS]

config

Print the current configuration.

vcp config [OPTIONS]

Options

-c, --config <config>: Path to config file

-f, --format <format>

Output format

Options: yaml | json

--show-secrets, --hide-secrets: Show sensitive information

-i, --init: Initialize a configuration if one does not exist

data

Data-related commands

vcp data [OPTIONS] COMMAND [ARGS]...

Data usage workflow

1. List searchable fields: vcp data metadata-list

2. Survey datasets based on a searchable field: vcp data summary $FIELD

3. Search for datasets: vcp data search '$FIELD:$VALUE'

3.a … Restrict search: vcp data search '$FIELD:$VALUE_1 AND $FIELD:$VALUE_2'

3.b … Expand search: vcp data search '($FIELD:$VALUE_1 OR $FIELD:$VALUE_2) AND ($FIELD:$VALUE_3)'

4. Describe a single dataset: vcp data describe $DATASET_ID

5. Download:

5.a … a single dataset: vcp data download --id $DATASET_ID

5.b … multiple datasets matching search: vcp data search $QUERY --download
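
The search steps in the workflow combine FIELD:VALUE terms with AND and OR. A small sketch of composing such a query and quoting it for the shell is shown below. The helper functions and the field names (tissue, organism) are hypothetical; run vcp data metadata-list to see the fields that actually exist. The command is printed rather than executed.

```python
import shlex


def term(field, value):
    """One FIELD:VALUE search term."""
    return f"{field}:{value}"


def any_of(*terms):
    """Parenthesized OR group, which expands the search."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*parts):
    """AND combination, which restricts the search."""
    return " AND ".join(parts)


# Hypothetical fields and values, for illustration only.
query = all_of(
    any_of(term("tissue", "liver"), term("tissue", "blood")),
    any_of(term("organism", "human")),
)

# shlex.join wraps the query in single quotes so the shell passes it as one TERM.
print(shlex.join(["vcp", "data", "search", query]))
```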

credentials

Get the credentials for a specific dataset by id. If you do not know the id, first use the search command to find the id.

vcp data credentials [OPTIONS] DATASET_ID

Arguments

DATASET_ID: Required argument

describe

Describe a dataset with comprehensive metadata in tabular format.

Displays:

• Basic Information (name, version, license, owner, DOI)
• Biological Metadata (assays, organisms, tissues, diseases, etc.)
• Distribution/Assets (download locations and formats)

vcp data describe [OPTIONS] DATASET_ID

Options

--full: Show the complete DatasetRecord as pretty-printed JSON and exit.

--raw: Show the raw returned record.

Arguments

DATASET_ID: Required argument

download

Download dataset(s) by ID or search query. At least one of --id or --query is required.

vcp data download [OPTIONS]

Options

--id <dataset_id>: Dataset ID to download a single dataset

-q, --query <query>: Search query to download multiple datasets

-o, --outdir <outdir>: Directory to write the files.

-e, --exact: Use exact match when passing --query

Examples:

Download a SINGLE dataset by its exact ID

vcp data download --id $DATASET_ID

Download MULTIPLE datasets matching a search query

vcp data download --query $QUERY

… equivalent to vcp data search $QUERY --download
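When the dataset IDs are already known, one download call per ID can be scripted. This is a sketch under stated assumptions: run and download_all are hypothetical helper names (not part of vcp), and the sketch defaults to a dry run that only prints each command.

```shell
# Hypothetical helpers, not part of vcp.

# Run a command, or only print it when DRY_RUN=1 (the default here).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Download each dataset ID into OUTDIR, one vcp call per ID.
download_all() {
  outdir=$1
  shift
  for id in "$@"; do
    run vcp data download --id "$id" --outdir "$outdir"
  done
}
```

Calling download_all ./data "$ID_1" "$ID_2" prints the two commands. Setting DRY_RUN=0 before the call executes them instead.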

metadata-list

List available metadata fields for searching datasets.

vcp data metadata-list [OPTIONS]

preview

Generate a Neuroglancer preview URL for a dataset with zarr files.

DATASET_ID: The ID of the dataset to preview

Note: Preview is only available for microscopy datasets that contain zarr files. Use 'vcp data describe DATASET_ID' to check available file formats.

vcp data preview [OPTIONS] DATASET_ID

Options

--open: Automatically open the preview URL in your browser

Arguments

DATASET_ID: Required argument

search

Search for authorized datasets by TERM.

TERM can be a single word or a phrase in single quotes (e.g., microscopy or 'brain tissue').

vcp data search [OPTIONS] TERM

Options

--download: Download every dataset returned

--full: Show full details for each dataset as a small table

-o, --outdir <outdir>: Directory for downloaded files (used with --download).

--exact: Match term exactly (no partial matches)

--latest-version, --all-versions: Specifies whether all versions or just the latest version of each dataset (if multiple versions exist) should be returned. Defaults to --latest-version.

Arguments

TERM: Required argument

Examples:

Use single quotes (') around TERM that contains spaces

vcp data search 'caudate lobe of liver'

Search by domain/topic

vcp data search domain:transcriptomics

vcp data search domain:microscopy

Search all datasets on a biological field (see below for details)

vcp data search 'assay:Slide-seqV2'

Search for ontology terms; escape colons with a backslash (\:)

vcp data search 'assay_ontology_term_id:EFO\:0030062'

Search for exact match to search TERM

vcp data search 'caudate lobe of liver' --exact

Download all datasets matching search TERM:

vcp data search 'tissue:hard palate' --download

… equivalent to: vcp data download --query 'tissue:hard palate'

Combine --exact and --download:

vcp data search 'bed nucleus of stria terminalis' --exact --download

… equivalent to: vcp data download --query 'bed nucleus of stria terminalis' --exact

HINT: use vcp data metadata-list to see all searchable fields and their explanations.

HINT: use vcp data summary $FIELD to get a count of field values.
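Ontology term IDs contain colons, which the examples above say to escape with a backslash. A small helper can do this before the ID is placed in a FIELD:VALUE query. This is a sketch; escape_colons is a hypothetical name, not a vcp command.

```shell
# Hypothetical helper, not part of vcp:
# escape every colon in an ontology term ID with a backslash.
escape_colons() {
  printf '%s\n' "$1" | sed 's/:/\\:/g'
}

term_id=$(escape_colons 'EFO:0030062')

# Print the search command with the escaped ID for review.
printf "vcp data search 'assay_ontology_term_id:%s'\n" "$term_id"
```

Only the value is escaped; the colon that separates the field name from the value stays as-is.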

summary

Summarize counts of matched datasets against a specified FIELD.

vcp data summary [OPTIONS] {domain|namespace|assay|assay_ontology_term_id|tissue|tissue_ontology_term_id|organism|organism_ontology_term_id|disease|disease_ontology_term_id|tissue_type|cell_type|development_stage|development_stage_ontology_term_id}

Options

-q, --query <query>: Search query to filter datasets.

--latest-version, --all-versions: Specifies whether all versions or just the latest version of each dataset (if multiple versions exist) should be counted. Defaults to --latest-version.

Arguments

FIELD: Required argument

Examples:

Summarize by a Cross Modality field

vcp data summary assay

vcp data summary assay_ontology_term_id

Filter Summary by an additional search query

vcp data summary assay --query brain

logout

Logout from the Virtual Cells Platform

vcp logout [OPTIONS]

Options

-c, --config <config>: Path to configuration file

-v, --verbose: Enable verbose output

model

Manage models in the Virtual Cells Platform.

Available commands:

• init                - Initialize a new model project
• list                - List available models
• download            - Download a specific model version
• submit              - Submit model metadata to the VCP Model Hub
• stage               - Stage model data files to Contributions Store
• status              - Query and display the status of all model submissions
• validate-metadata   - Validate metadata for a model
• assist              - Check model submission status and get step-by-step guidance

vcp model [OPTIONS] COMMAND [ARGS]...

assist

Check model submission status and get guidance for VCP model commands.

This command helps you understand where you are in the model submission process and what the next steps should be.

Examples:
• vcp model assist                    # Check current directory status
• vcp model assist --work-dir ./my-model  # Check specific work directory
• vcp model assist --step init        # Get init step guidance
• vcp model assist --step metadata    # Get metadata step guidance
• vcp model assist --step stage       # Get stage step guidance
• vcp model assist --step submit      # Get submit step guidance

vcp model assist [OPTIONS]

Options

--work-dir <work_dir>: Path to the model repository work directory (defaults to current directory)

--step <step>

Get detailed guidance for a specific workflow step

Options:: init | status | metadata | weights | package | stage | submit

-c, --config <config>: Path to config file

download

Download a specific version of a model from the model hub.

vcp model download [OPTIONS]

Options

--model <model>: Required. Name of the model to download

--version <version>: Required. Version of the model to download

--output <output>: Directory to save the downloaded model (default: current directory)

-c, --config <config>: Path to config file

-v, --verbose: Show detailed debug information

--variant <variant>: Variant name (e.g., homo_sapiens, mus_musculus)

--max-workers <max_workers>: Maximum number of concurrent downloads (default: 4)

init

Initialize a new model in the VCP Model Hub API.

Runs in interactive mode by default when no parameters are provided. Use --work-dir to specify where to initialize the model repository.

vcp model init [OPTIONS]

Options

--model-name <model_name>: Name of the model to initialize.

--model-version <model_version>: Version of the model to initialize.

--license-type <license_type>: License type for the model (e.g., MIT, Apache-2.0).

--work-dir <work_dir>: Path to the model repository work directory where the model will be initialized.

--data-file <data_file>: JSON file containing template data to prefill answers.

--interactive: Run in interactive mode to prompt for required parameters (default when no parameters provided).

--workflow-help: Show workflow guidance and next steps after initialization.

-v, --verbose: Enable verbose output for debugging and detailed information.

--debug: Enable debug output (includes sensitive information - use with caution).

--debug-file <debug_file>: Write debug output to file (automatically masks sensitive data for safe sharing).

--iteration: Indicate this is an iteration on a previously submitted model.

--skip-git: Skip git operations (clone, pull, branch creation) for faster initialization.

list

List available models and their versions from the VCP Model Hub API.

vcp model list [OPTIONS]

Options

--format <format>

Output format (table or json)

Options:: table | json

-c, --config <config>: Path to config file

-v, --verbose: Show detailed debug information

stage

Stage model files for upload to the VCP Model Hub.

This command uploads files to S3 using the new model_data path structure:
- Files are uploaded to: s3://bucket/model/version/model_data/
- Creates .ptr pointer files with metadata for each uploaded file
- Deletes original files after successful upload
- Automatically resumes from previous failed upload attempts (use --no-resume to start fresh)
- Filters out hidden files (starting with .)

The upload process:
1. Scan directory and create initial pointer files
2. Get presigned URLs for each file
3. Upload files sequentially with immediate metadata updates
4. Delete original files after successful upload

Work Directory:
- Use --work-dir to specify the location of the model configuration file
- If not provided, checks current directory first, then prompts for work directory
- The command will look for model_data directory within the work directory structure

Examples:
- vcp model stage --work-dir /path/to/model/repo
- vcp model stage --model my-model --version v1.0.0 --data-path ./model_data
- vcp model stage --work-dir /path/to/repo --verbose

vcp model stage [OPTIONS]

Options

--model <model>: Name of the model

--version <version>: Version of the model

--data-path <data_path>: Directory path containing model files

--work-dir <work_dir>: Path to the model repository work directory where model configuration is located

-c, --config <config>: Path to config file

-v, --verbose: Show detailed debug information

--max-retries <max_retries>: Maximum number of retry attempts per file (default: 3)

--no-resume: Start fresh without resuming from previous upload attempt

--clean-state: Clean previous upload state and start fresh

--batch-upload: [DEPRECATED] Batch upload is now the default behavior

--interactive: Run in interactive mode to prompt for required parameters.

--skip-packaging: Stage metadata only without packaging the model

status

Query and display the status of all model submissions from the VCP Model Hub.

This command shows the current status of all models that have been initialized or submitted to the VCP Model Hub.

Examples:
• vcp model status                    # Show all submissions in table format
• vcp model status --format json     # Show all submissions in JSON format
• vcp model status --verbose         # Show detailed debug information

vcp model status [OPTIONS]

Options

--format <format>

Output format (table or json)

Options:: table | json

-c, --config <config>: Path to config file

-v, --verbose: Show detailed debug information

--work-dir <work_dir>: Path to the model repository work directory where model configuration is located

submit

Submit model for review with comprehensive validation.

This command performs the following operations:
1. Validates that init command was run (model configuration file exists)
2. Validates metadata files format and required fields
3. Validates that stage command was run (only .ptr files in model_data)
4. Validates that no large files (>5GB) remain
5. Submits model data to VCP Model Hub API
6. Creates submission for model review
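Before submitting, the "only .ptr files in model_data" condition can be checked locally. The sketch below is an approximation of that check, not the validation vcp itself performs; non_pointer_files is a hypothetical helper name, and the weights.bin.ptr naming in the usage note is an assumption for illustration.

```shell
# Hypothetical helper, not part of vcp:
# list files under a model_data directory that are not .ptr pointer files.
# An empty result suggests staging has replaced every file with a pointer.
non_pointer_files() {
  find "$1" -type f ! -name '*.ptr' | sort
}
```

For example, non_pointer_files ./model_data prints nothing when the directory holds only files such as weights.bin.ptr, and prints the path of each remaining original file otherwise.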

The command reads submission data from:

- model_card_docs/model_card_metadata.yaml file in the repository

Expected model card metadata structure:
- model_display_name: Model name for submission
- model_version: Version (vX.X.X or YYYY-MM-DD format)
- licenses.name: License type
- repository_link: Model repository URL
- authors: List of authors with name field
- model_description: Detailed model description
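A quick local pre-check of the metadata file can catch obvious omissions before running the validation commands. This is a rough sketch, not the validator used by vcp: valid_model_version and missing_card_keys are hypothetical helper names, each X in vX.X.X is assumed to be one or more digits, and the key check only looks for the top-level names listed above (licenses rather than licenses.name) at the start of a line.

```shell
# Hypothetical helpers, not part of vcp.

# Accept vX.X.X or YYYY-MM-DD, the two formats listed for model_version.
# Assumption: each X is one or more digits.
valid_model_version() {
  printf '%s\n' "$1" |
    grep -Eq '^(v[0-9]+\.[0-9]+\.[0-9]+|[0-9]{4}-[0-9]{2}-[0-9]{2})$'
}

# Print each listed top-level key that is absent from the metadata file.
missing_card_keys() {
  for key in model_display_name model_version licenses \
             repository_link authors model_description; do
    grep -q "^$key:" "$1" || printf '%s\n' "$key"
  done
}
```

For example, missing_card_keys model_card_docs/model_card_metadata.yaml prints nothing when all six keys are present.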

Work Directory:
- Use --work-dir to specify the location of the model configuration file
- If not provided, checks current directory first, then prompts for work directory
- The command will look for model_data directory within the work directory structure

Examples:
- vcp model submit --work-dir /path/to/model/repo
- vcp model submit --work-dir /path/to/repo --skip-git
- vcp model submit --work-dir /path/to/repo --verbose

vcp model submit [OPTIONS]

Options

-c, --config <config>: Path to config file

-v, --verbose: Show detailed debug information

--work-dir <work_dir>: Path to the model repository work directory where model configuration is located

--skip-git: Skip submission operations

--skip-packaging: Submit metadata only without packaging the model

validate-model-metadata

Validates model metadata files against requirements in vcp-model-hub.

By default, looks for files in <work-dir>/model_card_docs/:
- model_card_metadata.yaml
- model_card_details.md

Examples:
• vcp model validate-metadata                         # Validate files in current directory
• vcp model validate-metadata --work-dir ./my-model   # Validate files in specific directory
• vcp model validate-metadata --verbose               # Show detailed validation info

vcp model validate-model-metadata [OPTIONS]

Options

--work-dir <work_dir>: Path to the model repository work directory (defaults to current directory)

-c, --config <config>: Path to config file

-v, --verbose: Show detailed debug information

version

Show version information and optionally check for updates.

vcp version [OPTIONS]

Options

--check: Check for available updates on PyPI
