{ "cells": [ { "cell_type": "markdown", "id": "a6b9d9f4", "metadata": {}, "source": [ "## Using `cz-benchmarks`\n", "\n", "You may duplicate this notebook and replace the simulated model execution cell with your own model code.\n", "\n", "This notebook guides you through loading single-cell datasets, running your model, and evaluating results using standardized tasks and metrics.\n", "\n", "All you need to do is swap in your model’s output; no extra setup is required.\n", "Use the provided examples as templates for your workflow." ] }, { "cell_type": "code", "execution_count": null, "id": "977eebcb", "metadata": {}, "outputs": [], "source": [ "# Install czbenchmarks into the environment of the selected Jupyter kernel.\n", "# `%pip` targets the running kernel, whereas `!pip` may install into a different Python environment.\n", "%pip install czbenchmarks" ] }, { "cell_type": "markdown", "id": "07679138", "metadata": {}, "source": [ "### 1. Datasets\n", "\n", "Datasets are wrapped for consistent loading and compatibility:\n", "\n", "- `SingleCellLabeledDataset`: Gene expression data with cell labels (supports clustering, embedding, and label prediction).\n", "- `SingleCellPerturbationDataset`: Perturbation datasets with control and perturbed cells." ] }, { "cell_type": "code", "execution_count": 1, "id": "1f6678a2", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from czbenchmarks.datasets import load_dataset\n", "from czbenchmarks.datasets.single_cell_labeled import SingleCellLabeledDataset" ] }, { "cell_type": "markdown", "id": "497e612a", "metadata": {}, "source": [ "#### List Available Datasets\n", "\n", "The cell below lists every dataset available in the czbenchmarks library, along with metadata such as its organism and source URL." ] }, { "cell_type": "code", "execution_count": 2, "id": "acefbc6a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead><tr><th></th><th>Dataset</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>chicken_spermatogenesis</th><td>{'organism': 'gallus_gallus', 'url': 's3://cz-...</td></tr>\n",
"<tr><th>chimpanzee_spermatogenesis</th><td>{'organism': 'pan_troglodytes', 'url': 's3://c...</td></tr>\n",
"<tr><th>gorilla_spermatogenesis</th><td>{'organism': 'gorilla_gorilla', 'url': 's3://c...</td></tr>\n",
"<tr><th>human_spermatogenesis</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>marmoset_spermatogenesis</th><td>{'organism': 'callithrix_jacchus', 'url': 's3:...</td></tr>\n",
"<tr><th>mouse_spermatogenesis</th><td>{'organism': 'mus_musculus', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>opossum_spermatogenesis</th><td>{'organism': 'monodelphis_domestica', 'url': '...</td></tr>\n",
"<tr><th>platypus_spermatogenesis</th><td>{'organism': 'ornithorhynchus_anatinus', 'url'...</td></tr>\n",
"<tr><th>replogle_k562_essential_perturbpredict</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>rhesus_macaque_spermatogenesis</th><td>{'organism': 'macaca_mulatta', 'url': 's3://cz...</td></tr>\n",
"<tr><th>tsv2_bladder</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_blood</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_bone_marrow</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_ear</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_eye</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_fat</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_heart</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_large_intestine</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_liver</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_lung</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_lymph_node</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_mammary</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_muscle</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_ovary</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_prostate</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_salivary_gland</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_skin</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_small_intestine</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_spleen</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_stomach</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_testis</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_thymus</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_tongue</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_trachea</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_uterus</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"<tr><th>tsv2_vasculature</th><td>{'organism': 'homo_sapiens', 'url': 's3://cz-b...</td></tr>\n",
"</tbody>\n",
"</table>" ], "text/plain": [ " Dataset\n", "chicken_spermatogenesis {'organism': 'gallus_gallus', 'url': 's3://cz-...\n", "chimpanzee_spermatogenesis {'organism': 'pan_troglodytes', 'url': 's3://c...\n", "gorilla_spermatogenesis {'organism': 'gorilla_gorilla', 'url': 's3://c...\n", "human_spermatogenesis {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "marmoset_spermatogenesis {'organism': 'callithrix_jacchus', 'url': 's3:...\n", "mouse_spermatogenesis {'organism': 'mus_musculus', 'url': 's3://cz-b...\n", "opossum_spermatogenesis {'organism': 'monodelphis_domestica', 'url': '...\n", "platypus_spermatogenesis {'organism': 'ornithorhynchus_anatinus', 'url'...\n", "replogle_k562_essential_perturbpredict {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "rhesus_macaque_spermatogenesis {'organism': 'macaca_mulatta', 'url': 's3://cz...\n", "tsv2_bladder {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_blood {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_bone_marrow {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_ear {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_eye {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_fat {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_heart {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_large_intestine {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_liver {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_lung {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_lymph_node {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_mammary {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_muscle {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_ovary {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_prostate {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_salivary_gland {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_skin {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", 
"tsv2_small_intestine {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_spleen {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_stomach {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_testis {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_thymus {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_tongue {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_trachea {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_uterus {'organism': 'homo_sapiens', 'url': 's3://cz-b...\n", "tsv2_vasculature {'organism': 'homo_sapiens', 'url': 's3://cz-b..." ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from czbenchmarks.datasets.utils import list_available_datasets\n", "import pandas as pd\n", "\n", "# List all available datasets in czbenchmarks\n", "available_datasets = list_available_datasets()\n", "\n", "# Display available datasets as a table\n", "df_datasets = pd.DataFrame({\"Dataset\": available_datasets})\n", "df_datasets" ] }, { "cell_type": "markdown", "id": "5f9bcfda", "metadata": {}, "source": [ "#### Load a Dataset\n", "\n", "Load the pre-configured `tsv2_prostate` dataset, which appears in the list above. The library downloads the file on first use, caches it locally, and loads it as a `SingleCellLabeledDataset` object, so later runs reuse the cached copy without extra setup.\n", "\n", "**The loaded dataset provides:**\n", "- `dataset.adata`: AnnData object with gene expression data.\n", "- `dataset.labels`: pandas Series of cell type labels." 
] }, { "cell_type": "code", "execution_count": 8, "id": "20bb6d6e", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "INFO:czbenchmarks.file_utils:File already exists in cache: /Users/sgupta/.cz-benchmarks/datasets/homo_sapiens_10df7690-6d10-4029-a47e-0f071bb2df83_Prostate_v2_curated.h5ad\n", "INFO:czbenchmarks.datasets.single_cell:Loading dataset from /Users/sgupta/.cz-benchmarks/datasets/homo_sapiens_10df7690-6d10-4029-a47e-0f071bb2df83_Prostate_v2_curated.h5ad in memory mode.\n" ] }, { "data": { "text/plain": [ "AnnData object with n_obs × n_vars = 2044 × 21808\n", " obs: 'donor_id', 'tissue_in_publication', 'anatomical_position', 'method', 'cdna_plate', 'library_plate', 'notes', 'cdna_well', 'assay_ontology_term_id', 'sample_id', 'replicate', '10X_run', 'ambient_removal', 'donor_method', 'donor_assay', 'donor_tissue', 'donor_tissue_assay', 'cell_type_ontology_term_id', 'compartment', 'broad_cell_class', 'free_annotation', 'manually_annotated', 'published_2022', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_ercc', 'pct_counts_ercc', '_scvi_batch', '_scvi_labels', 'scvi_leiden_donorassay_full', 'ethnicity_original', 'sample_number', 'organism_ontology_term_id', 'suspension_type', 'tissue_type', 'tissue_ontology_term_id', 'disease_ontology_term_id', 'is_primary_data', 'sex_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'development_stage_ontology_term_id', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid', 'dataset_id'\n", " var: 'ensembl_id', 'genome', 'mt', 'ercc', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'mean', 'std', 'feature_is_filtered', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type', 'feature_id'\n", " uns: '_scvi_manager_uuid', '_scvi_uuid', '_training_mode', 'assay_ontology_term_id_colors', 'citation', 
'compartment_colors', 'donor_id_colors', 'leiden', 'method_colors', 'neighbors', 'pca', 'schema_reference', 'schema_version', 'sex_ontology_term_id_colors', 'tissue_in_publication_colors', 'title', 'umap'\n", " obsm: 'X_pca', 'X_scvi', 'X_umap', 'X_umap_scvi_full_donorassay', 'X_uncorrected_alltissues_umap', 'X_uncorrected_umap'\n", " varm: 'PCs'\n", " layers: 'X_original', 'decontXcounts', 'scale_data'\n", " obsp: 'connectivities', 'distances'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# load_dataset returns a SingleCellLabeledDataset that wraps a validated AnnData object (dataset.adata).\n", "dataset: SingleCellLabeledDataset = load_dataset(\"tsv2_prostate\")\n", "dataset.adata" ] }, { "cell_type": "markdown", "id": "78a14fb9", "metadata": {}, "source": [ "### 2. Model\n", "\n", "Tasks expect a `CellRepresentation`, which is a `numpy.ndarray` with cells as rows and embedding features as columns.\n", "\n", "This example uses random numbers to simulate what a real model would produce. In your own work, replace them with your model's actual output, such as the embeddings generated by your neural network or other method.\n", "\n", "---\n", "\n", "> **Tip**: You can copy this notebook and swap out the code below for your own model's import, inference, or training steps. 
Just make sure the final output is a NumPy array of shape `(n_cells, n_features)`.\n" ] }, { "cell_type": "code", "execution_count": 6, "id": "28b579ad", "metadata": {}, "outputs": [], "source": [ "# Simulated 10-dimensional embedding for each cell (one row per cell)\n", "# Replace this with your model's actual code to generate output embeddings for tasks like clustering, embedding, or label prediction.\n", "from czbenchmarks.tasks.types import CellRepresentation\n", "\n", "model_output: CellRepresentation = np.random.rand(dataset.adata.shape[0], 10)" ] }, { "cell_type": "markdown", "id": "fdc0061f", "metadata": {}, "source": [ "\n", "### 3. Task\n", "\n", "Each task defines an evaluation workflow with `run()` and `compute_baseline()` methods.\n", "\n", "| Task Name | Class | Purpose |\n", "|-------------------|-------------------------------|-------------------------------------------------|\n", "| Clustering | `ClusteringTask` | Evaluate cell group separation |\n", "| Embedding Quality | `EmbeddingTask` | Assess embedding structure |\n", "| Label Prediction | `MetadataLabelPredictionTask` | Predict labels from embeddings |\n", "| Batch Integration | `BatchIntegrationTask` | Evaluate batch integration |\n", "| Cross-Species | `CrossSpeciesIntegrationTask` | Integrate data across species |\n", "\n", "#### Task Metrics\n", "\n", "Metrics are managed by `MetricRegistry` and returned as `MetricResult` objects.\n", "\n", "- `MetricType`: Enum of metric names (e.g., `ADJUSTED_RAND_INDEX`, `SILHOUETTE_SCORE`)\n", "- `MetricResult`: Stores metric type, value, and parameters\n", "\n", "All tasks compute and return metrics automatically." ] }, { "cell_type": "markdown", "id": "477d74ef", "metadata": {}, "source": [ "#### Example: Run a Clustering Task\n", "\n", "Evaluate the embedding by measuring clustering performance using Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI). The task compares Leiden clusters from the embedding to true labels. 
Higher scores indicate better clustering.\n", "\n", "Compare `clustering_results` to `clustering_baseline_results` to assess model performance against the PCA baseline. Because `model_output` here is random, expect its scores to sit near zero, well below the baseline." ] }, { "cell_type": "code", "execution_count": 7, "id": "77fb4ec3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--- Clustering Model Results ---\n", "{\n", " \"metric_type\": \"adjusted_rand_index\",\n", " \"value\": -0.00019227160583039173,\n", " \"params\": {}\n", "}\n", "{\n", " \"metric_type\": \"normalized_mutual_info\",\n", " \"value\": 0.022823018925207977,\n", " \"params\": {}\n", "}\n", "\n", "--- Clustering Baseline Results ---\n", "{\n", " \"metric_type\": \"adjusted_rand_index\",\n", " \"value\": 0.6421494136697635,\n", " \"params\": {}\n", "}\n", "{\n", " \"metric_type\": \"normalized_mutual_info\",\n", " \"value\": 0.8331383925676068,\n", " \"params\": {}\n", "}\n" ] } ], "source": [ "from czbenchmarks.tasks import ClusteringTask\n", "from czbenchmarks.tasks.clustering import ClusteringTaskInput\n", "\n", "# Compare Leiden clusters computed from the embedding against the ground-truth labels (ARI and NMI).\n", "\n", "# 1. Initialize the task\n", "clustering_task = ClusteringTask()\n", "\n", "# 2. Define the inputs for the task\n", "clustering_task_input = ClusteringTaskInput(\n", " obs=dataset.adata.obs, # The full observation metadata\n", " input_labels=dataset.labels, # The ground-truth labels for comparison\n", ")\n", "\n", "# 3. Run the task on your model's output\n", "clustering_results = clustering_task.run(\n", " cell_representation=model_output,\n", " task_input=clustering_task_input,\n", ")\n", "\n", "# 4. 
Compute and run the baseline for comparison\n", "expression_data = dataset.adata.X\n", "clustering_baseline = clustering_task.compute_baseline(expression_data)\n", "clustering_baseline_results = clustering_task.run(\n", " cell_representation=clustering_baseline,\n", " task_input=clustering_task_input,\n", ")\n", "\n", "print(\"--- Clustering Model Results ---\")\n", "for result in clustering_results:\n", " print(result.model_dump_json(indent=2))\n", "\n", "print(\"\\n--- Clustering Baseline Results ---\")\n", "for result in clustering_baseline_results:\n", " print(result.model_dump_json(indent=2))" ] } ], "metadata": { "kernelspec": { "display_name": ".venv_notebooks", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.11" } }, "nbformat": 4, "nbformat_minor": 5 }