# Quick Start Guide Welcome to **cz-benchmarks**! This guide will help you get started with installation, setup, and running your first benchmark in just a few steps. ## Requirements Before you begin, ensure you have the following installed: - 🐍 **[Python 3.10+](https://www.python.org/downloads/)**: 3.10+**: Ensure you have Python 3.10 or later installed. - 🐳 **[Docker](https://docs.docker.com/get-started/get-docker/)**: Required for container-based execution. - 💻 **Hardware**: Intel/AMD64 architecture CPU with NVIDIA GPU, running Linux with [NVIDIA drivers](https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/index.html). ## Installation You can install the library using one of the following methods: ### Option 1: Install from [PyPI](https://pypi.org/project/cz-benchmarks/) (Recommended) The easiest way to install the library is via PyPI: ```bash pip install cz-benchmarks ``` ### Option 2: Install from Source (For Development) If you plan to contribute or debug the library, install it from source: 1. Clone the repository: ```bash git clone https://github.com/chanzuckerberg/cz-benchmarks.git cd cz-benchmarks ``` 2. Install the package: ```bash pip install . ``` 3. For development, install in editable mode with development dependencies: ```bash pip install -e ".[dev]" ``` ## Running Benchmarks You can run benchmarks using the CLI or programmatically in Python. ### 💻 Using the CLI The CLI simplifies running benchmarks. Below are common commands: #### 🔍 List Available Benchmark Assets ```bash czbenchmarks list models czbenchmarks list datasets czbenchmarks list tasks ``` #### 🏃 Run a Benchmark ```bash czbenchmarks run \ --models SCVI \ --datasets tsv2_bladder \ --tasks clustering \ --label-key cell_type \ --output-file results.json ``` #### 🔧 CLI Run Options Below are the key options available for running benchmarks via the CLI: - **`--models`**: Specifies the model to use (e.g., `SCVI`). - **`--datasets`**: Specifies the dataset to benchmark (e.g., `tsv2_bladder`). - **`--tasks`**: Defines the evaluation task(s) to execute (e.g., `clustering`). - **`--label-key`**: The metadata key to use as labels for the task (e.g., `cell_type`). - **`--output-file`**: File path to save the benchmark results (e.g., `results.json`). > 💡 **Tip**: Combine these options to customize your benchmark runs effectively. > 📁 **Output**: Results will be saved to `results.json`. #### 📖 Get Help Use the `--help` flag to explore available commands and options: ```bash czbenchmarks --help czbenchmarks --help ``` ### 🐍 Using the Python API The library can also be used programmatically. Here's an example: ```python from czbenchmarks.datasets.utils import load_dataset from czbenchmarks.runner import run_inference from czbenchmarks.tasks import ClusteringTask # Load a dataset dataset = load_dataset("tsv2_bladder") # Run inference using the SCVI model dataset = run_inference("SCVI", dataset) # Perform clustering on the dataset clustering = ClusteringTask(label_key="cell_type") results = clustering.run(dataset) # Print the clustering results print(results) ``` ## Next Steps Explore the following resources to deepen your understanding: - **How-to Guides**: [Practical guides](./how_to_guides/index.rst) for using and extending the library. - **Setup Guides**: [Setup Guides](./how_to_guides/setup_guides.md) - **Developer Docs**: [Internal structure and extension points](./developer_guides/index.rst). - **GitHub Repository**: [cz-benchmarks](https://github.com/chanzuckerberg/cz-benchmarks) for troubleshooting and support. Happy benchmarking! 🚀