Quick Start Guide
Welcome to cz-benchmarks! This guide will help you get started with installation, setup, and running your first benchmark in just a few steps.
Requirements
Before you begin, ensure you have the following installed:
Python 3.10+: Ensure you have Python 3.10 or later installed.
Docker: Required for container-based execution.
Hardware: Intel/AMD64 architecture CPU with NVIDIA GPU, running Linux with NVIDIA drivers.
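The requirements above can be roughly pre-checked from Python before installing. The following is a minimal sketch using only the standard library. The helper name check_requirements is hypothetical and not part of cz-benchmarks, and looking for nvidia-smi on the PATH is only a heuristic for "NVIDIA drivers are installed", not a definitive test.

```python
import platform
import shutil
import sys


def check_requirements(min_python=(3, 10)):
    """Return a dict mapping each requirement name to True/False.

    Hypothetical helper for illustration; not part of cz-benchmarks.
    """
    return {
        # Python 3.10 or later
        "python": sys.version_info >= min_python,
        # Docker CLI is on the PATH (does not prove the daemon is running)
        "docker": shutil.which("docker") is not None,
        # Linux on an Intel/AMD64 CPU
        "linux": platform.system() == "Linux",
        "x86_64": platform.machine() in ("x86_64", "AMD64"),
        # Heuristic: nvidia-smi usually ships with the NVIDIA driver
        "nvidia_driver": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A "missing" entry is a hint to investigate, not proof that the benchmarks cannot run.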
Installation
You can install the library using one of the following methods:
Option 1: Install from PyPI (Recommended)
The easiest way to install the library is via PyPI:
pip install cz-benchmarks
Option 2: Install from Source (For Development)
If you plan to contribute or debug the library, install it from source:
Clone the repository:
git clone https://github.com/chanzuckerberg/cz-benchmarks.git
cd cz-benchmarks
Install the package:
pip install .
For development, install in editable mode with development dependencies:
pip install -e ".[dev]"
Running Benchmarks
You can run benchmarks using the CLI or programmatically in Python.
Using the CLI
The CLI simplifies running benchmarks. Below are common commands:
List Available Benchmark Assets
czbenchmarks list models
czbenchmarks list datasets
czbenchmarks list tasks
Run a Benchmark
czbenchmarks run \
--models SCVI \
--datasets tsv2_bladder \
--tasks clustering \
--label-key cell_type \
--output-file results.json
CLI Run Options
Below are the key options available for running benchmarks via the CLI:
--models: Specifies the model to use (e.g., SCVI).
--datasets: Specifies the dataset to benchmark (e.g., tsv2_bladder).
--tasks: Defines the evaluation task(s) to execute (e.g., clustering).
--label-key: The metadata key to use as labels for the task (e.g., cell_type).
--output-file: File path to save the benchmark results (e.g., results.json).
Tip: Combine these options to customize your benchmark runs.
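When scripting several runs, it can help to assemble the option list in one place. The sketch below builds the same command shown earlier as an argument list suitable for subprocess.run. The helper name build_run_command is hypothetical, and it uses only the five flags documented above. It prints the command rather than executing it.

```python
import shlex


def build_run_command(model, dataset, task, label_key, output_file):
    """Return the `czbenchmarks run` invocation as an argument list.

    Hypothetical helper for illustration; uses only the flags listed above.
    """
    return [
        "czbenchmarks", "run",
        "--models", model,
        "--datasets", dataset,
        "--tasks", task,
        "--label-key", label_key,
        "--output-file", output_file,
    ]


if __name__ == "__main__":
    cmd = build_run_command(
        "SCVI", "tsv2_bladder", "clustering", "cell_type", "results.json"
    )
    # Show the shell-quoted command; pass `cmd` to subprocess.run(cmd, check=True)
    # to execute it once the CLI is installed.
    print(shlex.join(cmd))
```

Passing a list instead of a single string avoids shell-quoting problems if a dataset name or output path contains spaces.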
Output: Results will be saved to results.json.
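Once a run has finished, the saved file can be inspected from Python. This is a minimal sketch that assumes only that results.json contains valid JSON. The structure of the results is not described in this guide, so the code pretty-prints whatever it finds rather than relying on specific keys. The helper name load_results is hypothetical.

```python
import json
from pathlib import Path


def load_results(path="results.json"):
    """Parse the results file and return whatever JSON value it holds.

    Hypothetical helper; no particular schema is assumed.
    """
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    path = Path("results.json")
    if path.exists():
        # Pretty-print the parsed content for a quick look
        print(json.dumps(load_results(path), indent=2))
    else:
        print(f"{path} not found; run a benchmark first")
```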
Get Help
Use the --help flag to explore available commands and options:
czbenchmarks --help
czbenchmarks <command> --help
Using the Python API
The library can also be used programmatically. Here's an example:
from czbenchmarks.datasets.utils import load_dataset
from czbenchmarks.runner import run_inference
from czbenchmarks.tasks import ClusteringTask
# Load a dataset
dataset = load_dataset("tsv2_bladder")
# Run inference using the SCVI model
dataset = run_inference("SCVI", dataset)
# Perform clustering on the dataset
clustering = ClusteringTask(label_key="cell_type")
results = clustering.run(dataset)
# Print the clustering results
print(results)
Next Steps
Explore the following resources to deepen your understanding:
How-to Guides: Practical guides for using and extending the library.
Setup Guides: Setup Guides
Developer Docs: Internal structure and extension points.
GitHub Repository: cz-benchmarks for troubleshooting and support.
Happy benchmarking!