Command Line Interface
vcp
VCP CLI - A command-line interface (CLI) to the Chan Zuckerberg Initiative’s Virtual Cell Platform (VCP)
vcp [OPTIONS] COMMAND [ARGS]...
benchmarks
View and run benchmarks available on the Virtual Cell Platform
vcp benchmarks [OPTIONS] COMMAND [ARGS]...
get
Fetch and display benchmark results from the API.
Use filters to select by model, dataset, or task. Choose to show model, baseline, or both metrics.
vcp benchmarks get [OPTIONS]
Options
- -b, --benchmark-key <benchmark_key>
Filter by benchmark key (substring match with ‘*’ wildcards, e.g. ‘scvi*v1-tsv2*liver-label*pred’)
- -m, --model-filter <model_filter>
Filter by model key (substring match with ‘*’ wildcards, e.g. ‘scvi*v1’)
- -d, --dataset-filter <dataset_filter>
Filter by dataset key (substring match with ‘*’ wildcards, e.g. ‘tsv2*liver’)
- -t, --task-filter <task_filter>
Filter by task key (substring match with ‘*’ wildcards, e.g. ‘label*pred’)
- -f, --format <format>
Output format
- Options:
table | json
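The filter options above describe a substring match with `*` wildcards. A minimal sketch of how such a filter could behave, assuming the pattern may match anywhere inside the key and that matching is case-sensitive (neither detail is confirmed by this reference). The helper `matches_filter` and the benchmark keys below are hypothetical and exist only for illustration:

```python
from fnmatch import fnmatchcase


def matches_filter(key: str, pattern: str) -> bool:
    """Assumed semantics: the pattern may match anywhere inside the key."""
    # Wrapping the pattern in '*' turns a whole-string glob into a substring glob.
    return fnmatchcase(key, f"*{pattern}*")


# Hypothetical benchmark keys, invented for this illustration.
keys = [
    "scvi-v1-tsv2-liver-label-pred",
    "scvi-v2-tsv2-blood-clustering",
    "uce-v1-tsv2-liver-label-pred",
]

# Keep only the keys that the model filter 'scvi*v1' would select.
selected = [k for k in keys if matches_filter(k, "scvi*v1")]
print(selected)
```

The actual matching is done by the VCP CLI; this sketch only helps reason about which patterns are likely to select a given key.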
list
List available model, dataset and task benchmark combinations. You can filter results by dataset, model, or task using glob patterns.
vcp benchmarks list [OPTIONS]
Options
- -b, --benchmark-key <benchmark_key>
Filter by benchmark key
- -m, --model-filter <model_filter>
Filter by model key (substring match with ‘*’ wildcards, e.g. ‘scvi*v1’).
- -d, --dataset-filter <dataset_filter>
Filter by dataset key (substring match with ‘*’ wildcards, e.g. ‘tsv2*liver’).
- -t, --task-filter <task_filter>
Filter by task key (substring match with ‘*’ wildcards, e.g. ‘label*pred’).
- -f, --format <format>
Output format
- Options:
table | json
run
Run a benchmark task on a model and dataset.
Use a VCP model (--model-key) or a precomputed cell representation (--cell-representation). Datasets can be VCP datasets (--dataset-key) or user datasets (--user-dataset).
vcp benchmarks run [OPTIONS]
Options
- -b, --benchmark-key <benchmark_key>
Shortcut for specifying model, dataset, and task together. Format: MODEL-DATASET-TASK (e.g., scvi_homo_sapiens-tsv2_blood-cell_type_annotation).
- -m, --model-key <model_key>
Model key from the registry (e.g., scvi_homo_sapiens).
- -d, --dataset-key <dataset_key>
Dataset key from czbenchmarks datasets (e.g., tsv2_blood). Can be used multiple times.
- -u, --user-dataset <user_dataset>
Path to a user-provided .h5ad file. Provide as a JSON string with keys: 'dataset_class', 'organism', and 'path'. Example: '{"dataset_class": "czbenchmarks.datasets.SingleCellLabeledDataset", "organism": "HUMAN", "path": "~/mydata.h5ad"}'. Can be used multiple times.
- -t, --task-key <task_key>
Benchmark task to run (choose from available tasks).
- -c, --cell-representation <cell_representation>
Path to precomputed cell embeddings (.npy file) or AnnData reference (e.g., ‘@X’, ‘@obsm:X_pca’). Can be used multiple times.
- -l, --baseline-args <baseline_args>
JSON string with parameters for the baseline computation.
- -r, --random-seed <random_seed>
Set a random seed for reproducibility.
- -n, --no-cache
Disable caching. Forces all steps to run from scratch.
- --labels <labels>
The .obs column with cell type labels. Supports both column name (‘cell_type’) and reference (‘@obs:cell_type’) formats (default: ‘cell_type’).
- --use-rep <use_rep>
[Clustering Task] Representation to use for clustering (default: ‘X’).
- --n-iterations <n_iterations>
[Clustering Task] Number of Leiden algorithm iterations (default: 2).
- --flavor <flavor>
[Clustering Task] Flavor of Leiden algorithm (default: ‘igraph’).
- Options:
leidenalg | igraph
- --key-added <key_added>
[Clustering Task] Key for storing cluster assignments (default: ‘leiden’).
- --n-folds <n_folds>
[Label Prediction Task] Number of cross-validation folds (default: 5).
- --min-class-size <min_class_size>
[Label Prediction Task] Minimum samples per class for inclusion (default: 10).
- --batch-column, --batch-labels <batch_column>
[Batch Integration Task] The .obs column with batch information (default: ‘batch’).
- --cross-species-organisms <cross_species_organisms>
[Cross-Species] Organism name (e.g., ‘homo_sapiens:ENSG’). Repeat for each dataset in order.
cache
Manage vcp-cli cache and upload state.
The cache is used for embeddings, benchmark results, uploads (to allow resuming after an interruption), and new version checks.
vcp cache [OPTIONS] COMMAND [ARGS]...
clean-uploads
Clean specific upload states from cache.
vcp cache clean-uploads [OPTIONS]
Options
- -m, --model <model>
Clean uploads for specific model
- -v, --version <version>
Clean uploads for specific version
- -f, --force
Skip confirmation prompt
clear
Clear cache data.
vcp cache clear [OPTIONS]
Options
- -t, --type <type>
Type of cache to clear
- Options:
all | uploads | benchmarks | version
- -f, --force
Skip confirmation prompt
history
Show upload history from cache.
vcp cache history [OPTIONS]
Options
- -l, --limit <limit>
Number of recent uploads to show
- -v, --verbose
Show detailed information
info
Show cache information and statistics.
vcp cache info [OPTIONS]
config
Print the current configuration.
vcp config [OPTIONS]
Options
- -c, --config <config>
Path to config file
- -f, --format <format>
Output format
- Options:
yaml | json
- --show-secrets, --hide-secrets
Show sensitive information
- -i, --init
Initialize a configuration if one does not exist
data
Data-related commands
vcp data [OPTIONS] COMMAND [ARGS]...
Data usage workflow
1 List searchable fields: vcp data search --help
2 Survey datasets based on searchable field: vcp data summary $FIELD
3 Search for datasets: vcp data search '$FIELD:$VALUE'
3.a … Restrict search: vcp data search '$FIELD:$VALUE_1 AND $FIELD:$VALUE_2'
3.b … Expand search: vcp data search '($FIELD:$VALUE_1 OR $FIELD:$VALUE_2) AND ($FIELD:$VALUE_3)'
4 Describe a single dataset: vcp data describe $DATASET_ID
5 Download:
5.a … a single dataset: vcp data download $DATASET_ID
5.b … multiple datasets matching search: vcp data search $QUERY --download
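The restrict and expand steps of the workflow combine FIELD:VALUE clauses with AND and OR. A minimal sketch of composing such query strings programmatically follows. The helpers `term`, `all_of`, and `any_of` are hypothetical, and the tissue values are placeholders rather than confirmed entries in the catalog:

```python
def term(field: str, value: str) -> str:
    """One FIELD:VALUE clause."""
    return f"{field}:{value}"


def all_of(*clauses: str) -> str:
    """Restrict a search: every clause must match."""
    return " AND ".join(clauses)


def any_of(*clauses: str) -> str:
    """Expand a search: any clause may match. Parenthesized as in the workflow above."""
    return "(" + " OR ".join(clauses) + ")"


# Placeholder values, chosen only to show the shape of the query.
restricted = all_of(term("assay", "Slide-seqV2"), term("tissue", "brain"))
expanded = all_of(
    any_of(term("tissue", "brain"), term("tissue", "liver")),
    any_of(term("assay", "Slide-seqV2")),
)

# Each query is passed to the CLI as a single TERM argument.
search_argv = ["vcp", "data", "search", expanded]
print(restricted)
print(expanded)
```

Keeping the query as one list element means the spaces and parentheses need no shell quoting when the list is handed to `subprocess.run`.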
credentials
Get the credentials for a specific dataset by id. If you do not know the id, first use the search command to find the id.
vcp data credentials [OPTIONS] DATASET_ID
Arguments
- DATASET_ID
Required argument
describe
Describe a dataset with comprehensive metadata in tabular format.
Displays:
- Basic Information (name, version, license, owner, DOI)
- Biological Metadata (assays, organisms, tissues, diseases, etc.)
- Distribution/Assets (download locations and formats)
vcp data describe [OPTIONS] DATASET_ID
Options
- --full
Show the complete DatasetRecord as pretty-printed JSON and exit.
- --raw
Show the raw returned record.
Arguments
- DATASET_ID
Required argument
download
Download dataset(s) by ID or search query. At least one of --id or --query is required.
vcp data download [OPTIONS]
Options
- --id <dataset_id>
Dataset ID to download a single dataset
- -q, --query <query>
Search query to download multiple datasets
- -o, --outdir <outdir>
Directory to write the files.
- -e, --exact
Use exact match when passing --query
Examples:
Download a SINGLE dataset by its exact ID
vcp data download --id $DATASET_ID
Download MULTIPLE datasets matching a search query
vcp data download --query $QUERY
… equivalent to vcp data search $QUERY --download
metadata-list
List available metadata fields for searching datasets.
vcp data metadata-list [OPTIONS]
preview
Generate a Neuroglancer preview URL for a dataset with zarr files.
DATASET_ID: The ID of the dataset to preview
vcp data preview [OPTIONS] DATASET_ID
Options
- --open
Automatically open the preview URL in your browser
Arguments
- DATASET_ID
Required argument
search
Search for authorized datasets by TERM.
TERM can be a single word or a phrase in single quotes (e.g., microscopy or ‘brain tissue’)
vcp data search [OPTIONS] TERM
Options
- --download
Download every dataset returned
- --full
Show full details for each dataset as a small table
- -o, --outdir <outdir>
Directory for downloaded files (used with --download).
- --exact
Match term exactly (no partial matches)
Arguments
- TERM
Required argument
Examples:
Use single quotes (') around TERM that contains spaces
vcp data search 'caudate lobe of liver'
Search by domain/topic
vcp data search domain:transcriptomics
vcp data search domain:microscopy
Search all datasets on a biological field (see below for details)
vcp data search 'assay:Slide-seqV2'
Search for ontology terms; escape colons with a backslash (\:)
vcp data search 'assay_ontology_term_id:EFO\:0030062'
Search for exact match to search TERM
vcp data search 'caudate lobe of liver' --exact
Download all datasets matching search TERM:
vcp data search 'tissue:hard palate' --download
… equivalent to: vcp data download --query 'tissue:hard palate'
Combine --exact and --download:
vcp data search 'bed nucleus of stria terminalis' --exact --download
… equivalent to: vcp data download --query 'bed nucleus of stria terminalis' --exact
HINT: use vcp data metadata-list to see all searchable fields and their explanations.
HINT: use vcp data summary $FIELD to get a count of field values.
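Ontology term IDs contain a colon of their own, which the examples say must be escaped with a backslash so it is not read as the FIELD:VALUE separator. A minimal sketch of doing that escaping in a script, with hypothetical helper names:

```python
def escape_colons(value: str) -> str:
    """Prefix each colon in a value with a backslash."""
    return value.replace(":", "\\:")


def field_query(field: str, value: str) -> str:
    """Build FIELD:VALUE, escaping colons that belong to the value itself."""
    return f"{field}:{escape_colons(value)}"


# The ontology term ID is the one used in the examples above.
query = field_query("assay_ontology_term_id", "EFO:0030062")
argv = ["vcp", "data", "search", query]
print(query)
```

Only the value is escaped; the colon that separates the field name from the value is left as-is.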
summary
Summarize counts of matched datasets against a specified FIELD.
vcp data summary [OPTIONS] {assay|assay_ontology_term_id|tissue|tissue_ontology_term_id|organism|organism_ontology_term_id|disease|disease_ontology_term_id|tissue_type|cell_type|development_stage|development_stage_ontology_term_id}
Options
- -q, --query <query>
Search query to filter datasets.
Arguments
- FIELD
Required argument
Examples:
Summarize by a Cross Modality field
vcp data summary assay
vcp data summary assay_ontology_term_id
Filter Summary by an additional search query
vcp data summary assay --query brain
login
Login to the Virtual Cell Platform
vcp login [OPTIONS]
Options
- -c, --config <config>
Path to configuration file
- -f, --force
Force login even if valid tokens exist
- -v, --verbose
Enable verbose output
- -u, --username <username>
Username for direct login
logout
Logout from the Virtual Cell Platform
vcp logout [OPTIONS]
Options
- -c, --config <config>
Path to configuration file
- -v, --verbose
Enable verbose output
model
Manage models in the Virtual Cell Platform.
vcp model [OPTIONS] COMMAND [ARGS]...
assist
Check model submission status and get guidance for VCP model commands.
This command helps you understand where you are in the model submission process and what the next steps should be.
vcp model assist [OPTIONS]
Options
- --work-dir <work_dir>
Path to the model repository work directory (defaults to current directory)
- --step <step>
Get detailed guidance for a specific workflow step
- Options:
init | status | metadata | weights | package | stage | submit
- -c, --config <config>
Path to config file
download
Download a specific version of a model using presigned S3 URLs.
vcp model download [OPTIONS]
Options
- --model <model>
Required. Name of the model to download
- --version <version>
Required. Version of the model to download
- --output <output>
Required. Directory to save the downloaded model
- -c, --config <config>
Path to config file
- --timeout <timeout>
Timeout in seconds for download (default: 1800)
- -v, --verbose
Show detailed debug information
- --max-workers <max_workers>
Maximum number of concurrent downloads (default: 4)
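The three required options and the two documented defaults can be captured in a small wrapper. This is a sketch only: the function name is hypothetical, and the model name and version in the usage line are placeholders rather than real entries in the Model Hub.

```python
def build_model_download_command(model: str, version: str, output: str,
                                 timeout: int = 1800, max_workers: int = 4,
                                 verbose: bool = False) -> list[str]:
    """Assemble argv for 'vcp model download' with the documented defaults."""
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    argv = ["vcp", "model", "download",
            "--model", model,
            "--version", version,
            "--output", output,
            "--timeout", str(timeout),
            "--max-workers", str(max_workers)]
    if verbose:
        argv.append("--verbose")
    return argv


# 'example-model' and 'v1' are placeholders for illustration.
cmd = build_model_download_command("example-model", "v1", "./models", max_workers=8)
print(cmd)
```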
init
Initialize a new model in the VCP Model Hub API.
Runs in interactive mode by default when no parameters are provided. Use --work-dir to specify where to initialize the model repository.
vcp model init [OPTIONS]
Options
- --model-name <model_name>
Name of the model to initialize.
- --model-version <model_version>
Version of the model to initialize.
- --license-type <license_type>
License type for the model (e.g., MIT, Apache-2.0).
- --work-dir <work_dir>
Path to the model repository work directory where the model will be initialized.
- --data-file <data_file>
JSON file containing template data to prefill answers.
- --interactive
Run in interactive mode to prompt for required parameters (default when no parameters provided).
- --workflow-help
Show workflow guidance and next steps after initialization.
- -v, --verbose
Enable verbose output for debugging and detailed information.
- --debug
Enable debug output (includes sensitive information - use with caution).
- --debug-file <debug_file>
Write debug output to file (automatically masks sensitive data for safe sharing).
- --iteration
Indicate this is an iteration on a previously submitted model.
- --skip-git
Skip git operations (clone, pull, branch creation) for faster initialization.
list
List available models and their versions from the VCP Model Hub API.
vcp model list [OPTIONS]
Options
- --format <format>
Output format (table or json)
- Options:
table | json
- -c, --config <config>
Path to config file
- -v, --verbose
Show detailed debug information
stage
Stage model files for upload to the VCP Model Hub.
vcp model stage [OPTIONS]
Options
- --model <model>
Name of the model
- --version <version>
Version of the model
- --data-path <data_path>
Directory path containing model files
- --work-dir <work_dir>
Path to the model repository work directory where model configuration is located
- -c, --config <config>
Path to config file
- -v, --verbose
Show detailed debug information
- --max-retries <max_retries>
Maximum number of retry attempts per file (default: 3)
- --no-resume
Start fresh without resuming from previous upload attempt
- --clean-state
Clean previous upload state and start fresh
- --batch-upload
[DEPRECATED] Batch upload is now the default behavior
- --interactive
Run in interactive mode to prompt for required parameters.
status
Query and display the status of all model submissions from the VCP Model Hub.
This command shows the current status of all models that have been initialized or submitted to the VCP Model Hub.
vcp model status [OPTIONS]
Options
- --format <format>
Output format (table or json)
- Options:
table | json
- -c, --config <config>
Path to config file
- -v, --verbose
Show detailed debug information
- --work-dir <work_dir>
Path to the model repository work directory where model configuration is located
submit
Submit model for review with comprehensive validation.
vcp model submit [OPTIONS]
Options
- -c, --config <config>
Path to config file
- -v, --verbose
Show detailed debug information
- --work-dir <work_dir>
Path to the model repository work directory where model configuration is located
- --skip-git
Skip submission operations
validate-model-metadata
Validates model metadata files against requirements in vcp-model-hub.
vcp model validate-model-metadata [OPTIONS]
Options
- --yaml-metadata-file <yaml_metadata_file>
Path to the YAML metadata file (default: model_card_metadata.yaml)
- --markdown-details-file <markdown_details_file>
Path to the Markdown details file (default: model_card_details.md)
- -c, --config <config>
Path to config file
- -v, --verbose
Show detailed debug information
version
Show version information and optionally check for updates.
vcp version [OPTIONS]
Options
- --check
Check for available updates on PyPI