Model
The VCP CLI model command allows you to interact with models previously submitted to the Virtual Cells Platform.
On this page, we provide instructions on how to list and download models using the VCP CLI tool. We also describe how to run downloaded models locally using MLflow.
Why Download Models?
You can use downloaded models to run inference on multiple datasets, inspect model behavior under different parameters, and obtain raw inference outputs for downstream analyses. This enables you to run models on your own compute and test models beyond pre-defined tasks and datasets provided through the cz-benchmarks package.
Getting Started
Prerequisites
To use the VCP CLI model commands, ensure you have the following prerequisites installed:
Python version >= 3.10, <= 3.13
The VCP CLI tool. See Installation for instructions.
If you plan on running downloaded models locally, we recommend installing uv to manage your virtual environment. You will also need to install MLflow to run model inference. In addition, you may need to review the requirements for the model(s) of interest, including hardware and software dependencies.
See Run Downloaded Models for more details.
Get Help Using the CLI
The --help option provides additional documentation and tips. You can add it to the end of any of the available commands for more information.
For example, to learn what model commands are available for this tool, run:
vcp model --help
The VCP CLI has two core model commands:

| Command | Description |
|---|---|
| `list` | List all download-enabled models with their versions and variants. |
| `download` | Download a specific model version and variant to your local filesystem. |
Note
You do not need to be logged in to list or download models.
List Models
List all available download-enabled models, displaying their names, versions, and available variants.
Basic Usage
vcp model list
Options
| Option | Description | Default |
|---|---|---|
| `--format` | Output format: `table` or `json` | `table` |
Examples
List models in table format:
vcp model list
Output models as JSON:
vcp model list --format json
Output
The table output displays:
Model Name: The identifier for the model
Version: Available versions (e.g., v1, v2, 2024-01-15)
Variants: Available variants for multi-variant models (e.g., organism-specific versions like homo_sapiens, mus_musculus)
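If you want to use the listing in a script, you can capture the JSON output and parse it in Python. The sketch below is illustrative: it assumes the vcp executable is on your PATH and makes no assumptions about the field names in the JSON output.

# Capture the JSON listing from the CLI and parse it in Python
import json
import subprocess

result = subprocess.run(
    ["vcp", "model", "list", "--format", "json"],
    capture_output=True, text=True, check=True,
)
models = json.loads(result.stdout)
print(json.dumps(models, indent=2))  # pretty-print whatever structure the CLI returns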
Download Models
Download a specific model version and variant to your local filesystem.
Basic Usage
vcp model download --model <MODEL_NAME> --version <MODEL_VERSION>
Required Options
| Option | Description |
|---|---|
| `--model` | Name of the model to download |
| `--version` | Version of the model to download |
Optional Options
| Option | Description | Default |
|---|---|---|
| `--output` | Directory to save the downloaded model | Current working directory |
| `--variant` | Variant name (e.g., `homo_sapiens`) | Auto-selected if only one variant is available |
Examples
Download a single-variant model:
vcp model download --model my-model --version v1 --output ./models
Download with specific variant:
vcp model download --model my-model --version v1 --variant homo_sapiens --output ./models
Understanding Variants
Some models are available in multiple variants (e.g., organism-specific versions).
If a model has only one variant, it will be selected automatically. You’ll see a message about the auto-selected variant. For example:
✓ Auto-selected variant: homo_sapiens
If multiple variants are available and `--variant` is not specified, you'll see a helpful panel listing the available variants. Example output:

╭─── Variant Selection Required ─────╮
│ Multiple variants available for    │
│ my-model v1                        │
│                                    │
│ Available variants:                │
│ • homo_sapiens                     │
│ • mus_musculus                     │
│                                    │
│ Please specify a variant:          │
│ vcp model download my-model v1     │
│ --variant <variant_name>           │
╰────────────────────────────────────╯
Output Structure
Downloaded models are saved to a directory with the following naming pattern:
Single variant: `{model}-{version}/`
Multi-variant: `{model}-{version}-{variant}/`
Example: For a model called my-model, version v1, and variant homo_sapiens, the output directory structure will be:
./models/
└── my-model-v1-homo_sapiens/
├── model.tar.gz
└── metadata.yaml
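To confirm that a download completed successfully, you can check for the files shown above. A minimal sketch, assuming the example layout (the exact contents of the folder may vary by model):

# Check that the downloaded model folder contains the expected files
from pathlib import Path

model_dir = Path("./models/my-model-v1-homo_sapiens")  # adjust to your --output location
for name in ["model.tar.gz", "metadata.yaml"]:
    print(name, "found" if (model_dir / name).exists() else "missing")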
Run Downloaded Models
To run downloaded models on your local system, you will need to:
review system requirements for model(s) of interest, including hardware and software dependencies
install MLflow
prepare an input JSON file
Below, we provide details for each step, including commands to run model inference. In addition, we demonstrate how to run inference using a TranscriptFormer model variant.
Review Model Requirements
Find hardware and software requirements for each model within the corresponding model card or GitHub repository to ensure the availability of compute resources. For example, some models may require GPUs with specific capabilities (e.g., CUDA version) or a minimum amount of RAM. In addition, you may need to install additional software packages or libraries depending on the model.
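As a quick sanity check, you can confirm whether an NVIDIA GPU driver is visible on your machine before installing anything. This is only an illustrative check and does not replace the requirements listed on the model card:

# Illustrative environment check: print NVIDIA driver/CUDA information if nvidia-smi is available
import shutil
import subprocess

if shutil.which("nvidia-smi"):
    print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)
else:
    print("nvidia-smi not found; no NVIDIA GPU driver detected on this machine")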
Install MLflow
To use a downloaded model, create a virtual environment and install MLflow.
Tip
We recommend using uv to manage your virtual environment. This will ensure reproducibility while quickly managing complex dependencies.
Note that you can simply activate your virtual environment using source vcp-cli/bin/activate if you installed the VCP CLI using pip install 'vcp-cli[all]'.
# create virtual environment and install MLflow
uv venv
uv pip install mlflow
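To confirm that MLflow is available in the environment you just created, you can run a quick import check (for example, with uv run python):

# Optional sanity check: confirm MLflow is importable and print its version
import mlflow

print(mlflow.__version__)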
Generate Input JSON File
Each model run requires an input JSON file that specifies necessary parameters and a path to your dataset.
There are two options to create the input JSON file:
Option 1: After downloading a model, you will find a serving_input_example.json file within the model folder that includes the parameters necessary to run the model. Edit the serving_input_example.json file directly to specify the dataset path under data and, if needed, adjust the default parameters. (A programmatic version of this approach is sketched after the example below.)

Option 2: Create the input JSON file programmatically. You can start with the example below, making sure to edit the data and params fields to specify the dataset path and the parameters relevant to your model, respectively.
# create the input JSON file and write it to disk
import json

path_to_dataset = "/path/to/your/dataset"  # replace with the directory containing your .h5ad file

input_json = {
    "dataframe_split": {
        "columns": [
            "input_uri"
        ],
        "data": [
            [
                f"{path_to_dataset}/filename.h5ad"
            ]
        ]
    },
    "params": {
        "batch_size": 32,
        "precision": "16-mixed",
        "gene_col_name": "ensembl_id"
    }
}

with open("input.json", "w") as f:
    json.dump(input_json, f)
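For Option 1, you can also update the provided serving_input_example.json programmatically instead of editing it by hand. The sketch below assumes the file sits in the download folder and follows the dataframe_split layout shown in the examples that follow; the exact location and contents may vary by model (for example, the file may be packaged inside the model archive).

# Sketch: point the model's example input at your dataset and save it as input.json
import json
from pathlib import Path

model_dir = Path("./models/my-model-v1")  # adjust to your download location
example = json.loads((model_dir / "serving_input_example.json").read_text())
example["dataframe_split"]["data"] = [["/path/to/your/dataset/filename.h5ad"]]  # replace with your dataset path

with open("input.json", "w") as f:
    json.dump(example, f)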
The examples below show input JSON files specifying parameters for running TranscriptFormer and scVI. Note the difference under the params field.
TranscriptFormer Input JSON File
{
    "dataframe_split": {
        "columns": [
            "input_uri"
        ],
        "data": [
            [
                "{path_to_dataset}/filename.h5ad"
            ]
        ]
    },
    "params": {
        "batch_size": 32,
        "precision": "16-mixed",
        "gene_col_name": "ensembl_id"
    }
}
scVI Input JSON File
{
    "dataframe_split": {
        "columns": [
            "input_uri"
        ],
        "data": [
            [
                "{path_to_dataset}/filename.h5ad"
            ]
        ]
    },
    "params": {
        "organism": "human",
        "return_dist": true
    }
}
Run Model Inference
Run model inference using the following command:
# Specify model download directory, output filename (JSON), and environment manager
mlflow models predict \
--model-uri <path_to_model_download_folder> \
--content-type json \
--input-path input.json \
--output-path <test_output.json> \
--env-manager <virtualenv-manager>
You can check the MLflow documentation for more details about the mlflow models predict command. Note that the --env-manager flag supports virtualenv, conda, local, and uv; we recommend using uv.
Example workflow
If you are using uv/pip as recommended, the workflow to run model inference would include the following steps:
Step 1: Download model
vcp model download --model my-model --version v1 --output ./models
Step 2: Create input JSON file (named here ‘input.json’)
import json

path_to_dataset = "/path/to/your/dataset"  # replace with the directory containing your .h5ad file

input_json = {
    "dataframe_split": {
        "columns": [
            "input_uri"
        ],
        "data": [
            [
                f"{path_to_dataset}/filename.h5ad"
            ]
        ]
    },
    "params": {
        "batch_size": 32,
        "precision": "16-mixed",
        "gene_col_name": "ensembl_id"
    }
}

with open("input.json", "w") as f:
    json.dump(input_json, f)
Step 3: Run model inference
mlflow models predict \
--model-uri ./models/my-model-v1 \
--content-type json \
--input-path input.json \
--output-path <test_output.json> \
--env-manager uv
Note
If you encounter issues while running inference, please refer to the appropriate model card for contact information.
Example: Running Inference with TranscriptFormer
In this example, we walk through the steps to run model inference with TranscriptFormer using uv and pip. We will download the tf_sapiens model variant to generate cell embeddings from human data. For more complex examples of what can be done with TranscriptFormer, see the quickstart and GitHub repository.
Step 1: Download model
TranscriptFormer has multiple variants depending on the type of data used for training. For this example, we will run inference on human data using the tf_sapiens variant.
# Download tf-sapiens model variant
vcp model download --model transcriptformer --version v0.6.0 --variant tf_sapiens --output <path_to_model_download_folder>
# Change your working directory to the downloaded `transcriptformer-v0.6.0-tf_sapiens` folder,
# since the inference command in Step 4 uses `--model-uri .`
cd <path_to_model_download_folder>/transcriptformer-v0.6.0-tf_sapiens
Step 2: Download dataset
In this example, we used human lung data from the Tabula Sapiens dataset. Click here to download the dataset. We saved the dataset in the transcriptformer-v0.6.0-tf_sapiens folder and named it TS_lung.h5ad.
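Before creating the input file, it can help to confirm that the dataset contains the gene identifier column the model expects, since gene_col_name is set to ensembl_id below. A small sketch, assuming the anndata package is installed in your environment:

# Optional check: inspect the AnnData file and confirm an `ensembl_id` column exists in .var
import anndata as ad

adata = ad.read_h5ad("./TS_lung.h5ad")
print(adata)  # summary of cells, genes, and annotations
print("ensembl_id in var:", "ensembl_id" in adata.var.columns)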
Step 3: Create input JSON file
The following input.json file contains the necessary default parameters for running the TranscriptFormer model: batch_size, precision, and gene_col_name. Note that we included the path to the dataset under the data field.
import json

input_json = {
    "dataframe_split": {
        "columns": [
            "input_uri"
        ],
        "data": [
            [
                "./TS_lung.h5ad"
            ]
        ]
    },
    "params": {
        "batch_size": 32,
        "precision": "16-mixed",
        "gene_col_name": "ensembl_id"
    }
}

with open("input.json", "w") as f:
    json.dump(input_json, f)
Step 4: Run inference
Use the following command to run inference and obtain cell embeddings:
mlflow models predict \
--model-uri . \
--content-type json \
--input-path input.json \
--output-path tf-sapiens-lung_output.json \
--env-manager uv
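Once inference finishes, you can load the output file and inspect its structure. The exact schema of the output JSON depends on the model and on how MLflow serializes predictions, so the sketch below only reports what it finds:

# Inspect the inference output without assuming a particular schema
import json

with open("tf-sapiens-lung_output.json") as f:
    output = json.load(f)

if isinstance(output, dict):
    print("top-level keys:", list(output.keys()))
else:
    print("top-level type:", type(output).__name__)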
For more information about the capabilities of TranscriptFormer, see the GitHub documentation here.