Quick Start Guide

Welcome to cz-benchmarks! This guide will help you get started with installation, setup, and running your first benchmark in just a few steps.

Requirements

Before you begin, ensure you have the following installed:

  • ๐Ÿ Python 3.10+: Ensure you have Python 3.10 or later installed.

  • ๐Ÿ’ป Hardware: Intel/AMD64 architecture CPU with NVIDIA GPU, running Linux with NVIDIA drivers.
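
To confirm your environment meets these requirements, you can check the Python version and the NVIDIA driver from a terminal (nvidia-smi ships with the NVIDIA drivers):

python3 --version
nvidia-smi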

Installation

You can install the library from source, as described below.

Install from Source (For Development)

If you plan to contribute or debug the library, install it from source:

  1. Clone the repository:

    git clone https://github.com/chanzuckerberg/cz-benchmarks.git
    cd cz-benchmarks
    
  2. Install the package:

    pip install .
    
  3. For development, install in editable mode with development dependencies:

    pip install -e ".[dev]"
    
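After installation, a quick sanity check is to import the package and run one of the CLI commands covered in the next section; both should complete without errors:

python -c "import czbenchmarks"
czbenchmarks list --help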

Command-Line Interface (CLI)

Use the cz-benchmarks CLI to list supported datasets and tasks.

  • List all available datasets:

    czbenchmarks list datasets
    
  • List all available tasks:

    czbenchmarks list tasks
    

For a full list of options, run:

czbenchmarks list --help

Running Benchmarks

The cz-benchmarks package is designed to be used programmatically within your Python workflow. The code example below demonstrates how to run a benchmark task on a model’s output: it generates a “dummy” cell embedding standing in for real model output and computes a benchmark result using the clustering task.

Note that if you are interested in running benchmarks using a CLI instead of code, you can use the Virtual Cell Platform CLI. The VCP CLI supports:

  • Running a benchmark on any Virtual Cell Platform model that has been benchmarked using a cz-benchmarks dataset, allowing you to reproduce a VCP-published result.

  • Running a benchmark on any Virtual Cell Platform model using your own benchmarking dataset.

  • Running a benchmark on the output of your own model, using either a cz-benchmarks dataset or your own benchmarking dataset.

import numpy as np
from czbenchmarks.datasets import load_dataset
from czbenchmarks.tasks import ClusteringTask, ClusteringTaskInput

# 1. Load a benchmark dataset
# This dataset has pre-defined labels we can use for evaluation.
dataset = load_dataset("tsv2_bladder")

# 2. Generate or load your model's cell embedding
# For this example, we'll generate a dummy embedding.
# In a real scenario, this would be the output of your ML model.
n_obs = dataset.adata.n_obs
n_features = 128
my_model_embedding = np.random.rand(n_obs, n_features)

# 3. Instantiate the desired evaluation task
# We'll use the ClusteringTask to see how well the embedding
# separates known cell types.
clustering_task = ClusteringTask()

# 4. Prepare the input for the task
# The task needs the ground-truth labels from the dataset.
task_input = ClusteringTaskInput(
    obs=dataset.adata.obs,
    input_labels=dataset.labels
)

# 5. Run the task and get the results
# The task evaluates your embedding against the ground-truth labels.
results = clustering_task.run(
    cell_representation=my_model_embedding,
    task_input=task_input
)

# 6. Print the results
# The output will contain metrics like Adjusted Rand Index (ARI)
# and Normalized Mutual Information (NMI).
print(results)
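
The dummy embedding above should score poorly, since it is random noise. Any array of shape (n_obs, n_features) can be evaluated the same way; as a rough, illustrative baseline (not part of the cz-benchmarks API), you could substitute a PCA-style embedding computed with scikit-learn's TruncatedSVD, assuming scikit-learn is installed:

from sklearn.decomposition import TruncatedSVD

# Illustrative baseline: a 50-component SVD of the raw expression matrix.
# TruncatedSVD accepts both dense and sparse matrices, so it works
# directly on dataset.adata.X.
baseline_embedding = TruncatedSVD(n_components=50).fit_transform(dataset.adata.X)

# Evaluate the baseline with the same task and task input as above.
baseline_results = clustering_task.run(
    cell_representation=baseline_embedding,
    task_input=task_input,
)
print(baseline_results)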

Next Steps

Explore the rest of the documentation to deepen your understanding.

Happy benchmarking! 🚀