czbenchmarks.tasks.single_cell
Submodules
Classes
- CrossSpeciesIntegrationTask: Task for evaluating cross-species integration quality.
- PerturbationTask: Task for evaluating perturbation prediction quality.
Package Contents
- class czbenchmarks.tasks.single_cell.CrossSpeciesIntegrationTask(label_key: str)[source]
Bases: czbenchmarks.tasks.base.BaseTask
Task for evaluating cross-species integration quality.
This task computes metrics to assess how well different species’ data are integrated in the embedding space while preserving biological signals. It operates on multiple datasets from different species.
- Parameters:
label_key – Key to access ground truth cell type labels in metadata
- label_key
- property required_inputs: Set[czbenchmarks.datasets.DataType]
Required input data types.
- Returns:
Set of required input DataTypes (metadata with labels)
- property required_outputs: Set[czbenchmarks.datasets.DataType]
Required output data types.
- Returns:
Set of output DataTypes that a model must produce for this task to run (embedding coordinates)
- property requires_multiple_datasets: bool
Whether this task requires multiple datasets.
- Returns:
True, because this task compares data across species
- abstract set_baseline(data: List[czbenchmarks.datasets.SingleCellDataset], **kwargs)[source]
Set a baseline embedding for cross-species integration.
This method is not implemented for cross-species integration tasks as standard preprocessing workflows are not directly applicable across different species.
- Parameters:
data – List of SingleCellDataset objects from different species
**kwargs – Additional arguments passed to run_standard_scrna_workflow
- Raises:
NotImplementedError – Always raised as baseline is not implemented
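The contract described above can be illustrated with a small stand-in class. This is only a sketch: `DataType` and `CrossSpeciesSketch` below are hypothetical mirrors written for illustration, not the library's own classes, and the enum member names `METADATA` and `EMBEDDING` are assumptions based on the descriptions "metadata with labels" and "embedding coordinates". It shows how a caller might check the declared requirements and guard against the `NotImplementedError` that `set_baseline` is documented to raise.

```python
from enum import Enum, auto
from typing import List, Set


class DataType(Enum):
    """Hypothetical stand-in for czbenchmarks.datasets.DataType."""

    METADATA = auto()   # assumed name for "metadata with labels"
    EMBEDDING = auto()  # assumed name for "embedding coordinates"


class CrossSpeciesSketch:
    """Illustrative mirror of the documented interface, not the library class."""

    def __init__(self, label_key: str):
        # Key used to look up ground truth cell type labels in metadata.
        self.label_key = label_key

    @property
    def required_inputs(self) -> Set[DataType]:
        return {DataType.METADATA}

    @property
    def required_outputs(self) -> Set[DataType]:
        return {DataType.EMBEDDING}

    @property
    def requires_multiple_datasets(self) -> bool:
        # The task compares data across species, so one dataset is not enough.
        return True

    def set_baseline(self, data: List[object], **kwargs):
        # Mirrors the documented behavior: a baseline is never available.
        raise NotImplementedError(
            "Baseline is not implemented for cross-species integration"
        )


task = CrossSpeciesSketch(label_key="cell_type")

# Callers that loop over many tasks can treat the baseline as optional.
try:
    task.set_baseline([])
    baseline_available = True
except NotImplementedError:
    baseline_available = False
```

Catching `NotImplementedError` at the call site lets a benchmarking loop skip the baseline step for this task while still running it for tasks that support one.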
- class czbenchmarks.tasks.single_cell.PerturbationTask[source]
Bases: czbenchmarks.tasks.base.BaseTask
Task for evaluating perturbation prediction quality.
This task computes metrics to assess how well a model predicts gene expression changes in response to perturbations. It compares predicted and ground truth perturbation effects using mean squared error (MSE) and correlation metrics.
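As a rough illustration of the two kinds of metric named above, the following standalone sketch computes MSE and a correlation coefficient between a predicted and a ground truth effect vector. It does not use the library's metric code. The helper names are hypothetical, the example values are made up, and the choice of Pearson correlation is an assumption, since the description only says "correlation metrics".

```python
import math
from typing import Sequence


def mean_squared_error(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Average squared difference between predicted and true values."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred)


def pearson_correlation(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Pearson correlation coefficient (assumed here; other variants exist)."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    return cov / math.sqrt(var_p * var_t)


# Toy per-gene expression changes for one perturbation (illustrative values).
predicted = [0.5, -1.0, 2.0, 0.0]
ground_truth = [1.0, -1.0, 1.5, 0.5]

mse = mean_squared_error(predicted, ground_truth)
corr = pearson_correlation(predicted, ground_truth)
print(f"MSE={mse:.4f}, Pearson r={corr:.4f}")
```

The two metrics are complementary: MSE penalizes errors in magnitude, while correlation rewards getting the direction and relative ordering of gene-level changes right even when the scale is off.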
- property required_inputs: Set[czbenchmarks.datasets.DataType]
Required input data types.
- Returns:
Set of required input DataTypes (ground truth perturbation effects)
- property required_outputs: Set[czbenchmarks.datasets.DataType]
Required output data types.
- Returns:
Set of output DataTypes that a model must produce for this task to run (predicted perturbation effects)
- set_baseline(data: czbenchmarks.datasets.PerturbationSingleCellDataset, gene_pert: str, baseline_type: Literal['median', 'mean'] = 'median', **kwargs)[source]
Set a baseline embedding for perturbation prediction.
Creates baseline predictions by applying a simple statistical method (median or mean, as selected by baseline_type) to the control data, and evaluates these predictions against ground truth.
- Parameters:
data – PerturbationSingleCellDataset containing control and perturbed data
gene_pert – The perturbation gene to evaluate
baseline_type – The statistical method to use for baseline prediction (median or mean)
**kwargs – Additional arguments passed to the evaluation
- Returns:
List of MetricResult objects containing baseline performance metrics for different statistical methods (median, mean)
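The idea behind a median or mean baseline can be sketched without the library. The example below is an assumption about the general technique, not the implementation of `set_baseline`: it collapses a small, made-up matrix of control cells (rows are cells, columns are genes) into one per-gene profile, using the statistic chosen by a `baseline_type` argument. The function name `baseline_profile` and the data are hypothetical.

```python
from statistics import mean, median
from typing import Callable, Dict, List, Sequence

# Toy control expression matrix: 3 cells (rows) x 3 genes (columns).
control_cells: List[List[float]] = [
    [1.0, 0.0, 4.0],
    [3.0, 2.0, 4.0],
    [8.0, 1.0, 1.0],
]

# Map each supported baseline_type to the statistic it applies per gene.
REDUCERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "median": median,
    "mean": mean,
}


def baseline_profile(
    cells: List[List[float]], baseline_type: str = "median"
) -> List[float]:
    """Collapse control cells into one per-gene baseline prediction."""
    if baseline_type not in REDUCERS:
        raise ValueError(f"Unsupported baseline_type: {baseline_type!r}")
    reduce = REDUCERS[baseline_type]
    # zip(*cells) transposes the matrix so each tuple holds one gene's values.
    return [reduce(gene_values) for gene_values in zip(*cells)]


median_profile = baseline_profile(control_cells, "median")
mean_profile = baseline_profile(control_cells, "mean")
print(median_profile, mean_profile)
```

The median is less sensitive to a few highly expressing cells (gene 0 above), which is one plausible reason for it being the default; a model that cannot beat either profile has learned little beyond the control distribution.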