czbenchmarks.tasks.single_cell

Classes

CrossSpeciesIntegrationOutput

Output for cross-species integration task.

CrossSpeciesIntegrationTask

Task for evaluating cross-species integration quality.

CrossSpeciesIntegrationTaskInput

Pydantic model for CrossSpeciesIntegrationTask inputs.

CrossSpeciesLabelPredictionTaskInput

Pydantic model for CrossSpeciesLabelPredictionTask inputs.

CrossSpeciesLabelPredictionOutput

Output for cross-species label prediction task.

CrossSpeciesLabelPredictionTask

Task for cross-species label prediction evaluation.

PerturbationExpressionPredictionOutput

Output for perturbation task.

PerturbationExpressionPredictionTask

Task for evaluating perturbation expression prediction quality.

PerturbationExpressionPredictionTaskInput

Pydantic model for Perturbation task inputs.

Package Contents

class czbenchmarks.tasks.single_cell.CrossSpeciesIntegrationOutput(/, **data: Any)[source]

Bases: czbenchmarks.tasks.task.TaskOutput

Output for cross-species integration task.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

cell_representation: czbenchmarks.tasks.types.CellRepresentation
labels: czbenchmarks.types.ListLike
species: czbenchmarks.types.ListLike
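
Example (an illustrative sketch; assumes the CellRepresentation alias accepts a numpy array and ListLike accepts plain Python lists — neither alias is expanded on this page):

import numpy as np

from czbenchmarks.tasks.single_cell import CrossSpeciesIntegrationOutput

# Hypothetical joint embedding of six cells (three per species) in 2-D.
embedding = np.array(
    [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4],
     [0.1, 0.3], [0.4, 0.1], [0.3, 0.3]]
)

# Keyword construction runs pydantic validation, as documented above.
output = CrossSpeciesIntegrationOutput(
    cell_representation=embedding,
    labels=["B", "T", "NK", "B", "T", "NK"],  # per-cell biological labels
    species=["human"] * 3 + ["mouse"] * 3,    # per-cell species assignment
)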
class czbenchmarks.tasks.single_cell.CrossSpeciesIntegrationTask(*, random_seed: int = RANDOM_SEED)[source]

Bases: czbenchmarks.tasks.task.Task

Task for evaluating cross-species integration quality.

This task computes metrics to assess how well different species’ data are integrated in the embedding space while preserving biological signals. It operates on multiple datasets from different species.

Parameters:

random_seed (int) – Random seed for reproducibility

display_name = 'Cross-species Integration'
description = 'Evaluate cross-species integration quality using various integration metrics.'
input_model
requires_multiple_datasets = True
abstract compute_baseline(**kwargs)[source]

Set a baseline embedding for cross-species integration.

This method is not implemented for cross-species integration tasks as standard preprocessing workflows are not directly applicable across different species.

Raises:

NotImplementedError – Always raised as baseline is not implemented
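
Example (an illustrative sketch; assumes the class is directly instantiable even though compute_baseline is marked abstract, since the documented behavior of that method is simply to raise):

from czbenchmarks.tasks.single_cell import CrossSpeciesIntegrationTask

task = CrossSpeciesIntegrationTask(random_seed=42)

print(task.display_name)                # 'Cross-species Integration'
print(task.requires_multiple_datasets)  # True

# compute_baseline always raises for this task (see above).
try:
    task.compute_baseline()
except NotImplementedError:
    print("No baseline is defined for cross-species integration.")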

class czbenchmarks.tasks.single_cell.CrossSpeciesIntegrationTaskInput(/, **data: Any)[source]

Bases: czbenchmarks.tasks.task.TaskInput

Pydantic model for CrossSpeciesIntegrationTask inputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

labels: List[czbenchmarks.types.ListLike]
organism_list: List[czbenchmarks.datasets.types.Organism]
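
Example (an illustrative sketch; assumes Organism exposes per-species members such as HUMAN and MOUSE — check czbenchmarks.datasets.types for the actual values):

from pydantic import ValidationError

from czbenchmarks.datasets.types import Organism
from czbenchmarks.tasks.single_cell import CrossSpeciesIntegrationTaskInput

# One label list per dataset and one organism per dataset, in matching order.
task_input = CrossSpeciesIntegrationTaskInput(
    labels=[["B", "T"], ["B", "T"]],
    organism_list=[Organism.HUMAN, Organism.MOUSE],
)

# Invalid input raises pydantic's ValidationError, as documented above.
try:
    CrossSpeciesIntegrationTaskInput(labels=[["B", "T"]], organism_list="human")
except ValidationError as err:
    print(err)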
class czbenchmarks.tasks.single_cell.CrossSpeciesLabelPredictionTaskInput(/, **data: Any)[source]

Bases: czbenchmarks.tasks.task.TaskInput

Pydantic model for CrossSpeciesLabelPredictionTask inputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

labels: List[czbenchmarks.types.ListLike]
organisms: List[czbenchmarks.datasets.types.Organism]
sample_ids: List[czbenchmarks.types.ListLike] | None = None
aggregation_method: Literal['none', 'mean', 'median'] = 'mean'
n_folds: int = 5
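
Example (an illustrative sketch; the Organism members are assumptions, as above):

from czbenchmarks.datasets.types import Organism
from czbenchmarks.tasks.single_cell import CrossSpeciesLabelPredictionTaskInput

task_input = CrossSpeciesLabelPredictionTaskInput(
    labels=[["B", "T", "NK"], ["B", "T", "NK"]],
    organisms=[Organism.HUMAN, Organism.MOUSE],
    sample_ids=[["d1", "d1", "d2"], ["d3", "d3", "d4"]],  # optional donor ids
    aggregation_method="median",  # overrides the 'mean' default
    n_folds=3,                    # overrides the default of 5
)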
class czbenchmarks.tasks.single_cell.CrossSpeciesLabelPredictionOutput(/, **data: Any)[source]

Bases: czbenchmarks.tasks.task.TaskOutput

Output for cross-species label prediction task.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

results: List[Dict[str, Any]]
class czbenchmarks.tasks.single_cell.CrossSpeciesLabelPredictionTask(*, random_seed: int = RANDOM_SEED)[source]

Bases: czbenchmarks.tasks.task.Task

Task for cross-species label prediction evaluation.

This task evaluates cross-species transfer by training classifiers on one species and testing on another species. It computes accuracy, F1, precision, recall, and AUROC for multiple classifiers (Logistic Regression, KNN, Random Forest).

The task can optionally aggregate cell-level embeddings to sample/donor level before running classification.

Parameters:

random_seed (int) – Random seed for reproducibility

display_name = 'cross-species label prediction'
requires_multiple_datasets = True
abstract compute_baseline(**kwargs)[source]

Set a baseline for cross-species label prediction.

This method is not implemented for cross-species prediction tasks as standard preprocessing workflows need to be applied per species.

Raises:

NotImplementedError – Always raised as baseline is not implemented
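
The train-on-one-species, test-on-the-other protocol this task evaluates can be sketched independently of czbenchmarks. The snippet below illustrates the setup with scikit-learn on synthetic embeddings; it is not the task's internal implementation:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(42)

# Synthetic stand-ins for per-cell model embeddings of two species.
human_emb = rng.normal(size=(200, 16))
human_labels = rng.choice(["B", "T"], size=200)
mouse_emb = rng.normal(size=(150, 16))
mouse_labels = rng.choice(["B", "T"], size=150)

# Train on human, evaluate on mouse: one cross-species transfer direction.
clf = LogisticRegression(max_iter=1000).fit(human_emb, human_labels)
pred = clf.predict(mouse_emb)
print("accuracy:", accuracy_score(mouse_labels, pred))
print("macro F1:", f1_score(mouse_labels, pred, average="macro"))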

class czbenchmarks.tasks.single_cell.PerturbationExpressionPredictionOutput(/, **data: Any)[source]

Bases: czbenchmarks.tasks.task.TaskOutput

Output for perturbation task.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

pred_mean_change_dict: Dict[str, numpy.ndarray]
true_mean_change_dict: Dict[str, numpy.ndarray]
class czbenchmarks.tasks.single_cell.PerturbationExpressionPredictionTask(*, random_seed: int = RANDOM_SEED)[source]

Bases: czbenchmarks.tasks.task.Task

Perturbation Expression Prediction Task.

This task evaluates perturbation-induced expression predictions against their ground truth values. This is done by calculating metrics derived from predicted and ground truth log fold change values for each condition. Currently, Spearman rank correlation is supported.

The following arguments are supplied by the task input class (PerturbationExpressionPredictionTaskInput) when running the task and are described here for documentation purposes:

  • predictions_adata (ad.AnnData):

    The anndata containing model predictions

  • dataset_adata (ad.AnnData):

    The anndata object from SingleCellPerturbationDataset.

  • pred_effect_operation (Literal[“difference”, “ratio”]):

    How to compute predicted effect between treated and control mean predictions over genes.

    • “ratio” uses \(\log\left(\frac{\text{mean}(\text{treated}) + \varepsilon}{\text{mean}(\text{control}) + \varepsilon}\right)\) when means are positive.

    • “difference” uses \(\text{mean}(\text{treated}) - \text{mean}(\text{control})\) and is generally safe across scales (probabilities, z-scores, raw expression).

Default is “ratio”. Both modes are illustrated numerically after this class entry.

  • gene_index (Optional[pd.Index]):

    The index of the genes in the predictions AnnData.

  • cell_index (Optional[pd.Index]):

    The index of the cells in the predictions AnnData.

Parameters:

random_seed (int) – Random seed for reproducibility.

Returns:

Dictionaries of mean predicted and ground truth changes in gene expression values for each condition.

Return type:

PerturbationExpressionPredictionOutput

display_name = 'Perturbation Expression Prediction'
description = 'Evaluate the quality of predicted changes in expression levels for genes that are...
input_model
condition_key = None
abstract compute_baseline(**kwargs)[source]

Set a baseline embedding for perturbation expression prediction.

This method is not implemented for perturbation expression prediction tasks.

Raises:

NotImplementedError – Always raised as baseline is not implemented
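
The two pred_effect_operation modes described above reduce to simple numpy expressions. A numeric illustration of the documented formulas (the epsilon value here is a hypothetical choice; the task's actual constant is not documented on this page):

import numpy as np

eps = 1e-8  # hypothetical epsilon, for illustration only

# Synthetic per-gene mean expression for treated and control cells.
mean_treated = np.array([2.0, 0.5, 1.0])
mean_control = np.array([1.0, 1.0, 1.0])

# "ratio": log((mean(treated) + eps) / (mean(control) + eps)), for positive means.
ratio_effect = np.log((mean_treated + eps) / (mean_control + eps))

# "difference": mean(treated) - mean(control), safe across scales.
diff_effect = mean_treated - mean_control

print(ratio_effect)  # approx [ 0.693 -0.693  0.   ]
print(diff_effect)   # [ 1.  -0.5  0. ]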

class czbenchmarks.tasks.single_cell.PerturbationExpressionPredictionTaskInput(/, **data: Any)[source]

Bases: czbenchmarks.tasks.task.TaskInput

Pydantic model for Perturbation task inputs.

Contains the input parameters for the PerturbationExpressionPredictionTask. The row and column ordering of the model predictions can optionally be provided as cell_index and gene_index, respectively, so the task can align a model matrix that is a subset of, or re-ordered relative to, the dataset adata.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

adata: anndata.AnnData
pred_effect_operation: Literal['difference', 'ratio'] = 'ratio'
gene_index: pandas.Index | None = None
cell_index: pandas.Index | None = None
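
Example (an illustrative sketch; whether the adata field carries the dataset or the predictions AnnData is ambiguous in the docstrings above, so only keyword construction is shown):

import anndata as ad
import numpy as np
import pandas as pd

from czbenchmarks.tasks.single_cell import PerturbationExpressionPredictionTaskInput

# Hypothetical AnnData of 4 cells x 3 genes.
genes = pd.Index(["GENE_A", "GENE_B", "GENE_C"])
cells = pd.Index(["c0", "c1", "c2", "c3"])
adata = ad.AnnData(
    X=np.random.default_rng(0).random((4, 3)),
    obs=pd.DataFrame(index=cells),
    var=pd.DataFrame(index=genes),
)

task_input = PerturbationExpressionPredictionTaskInput(
    adata=adata,
    pred_effect_operation="difference",  # or the default, 'ratio'
    gene_index=genes,  # optional: column order of the model's prediction matrix
    cell_index=cells,  # optional: row order of the model's prediction matrix
)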