czbenchmarks.tasks.single_cell.cross_species_label_prediction

Attributes

logger

Classes

CrossSpeciesLabelPredictionTaskInput

Base class for task inputs.

CrossSpeciesLabelPredictionOutput

Base class for task outputs.

CrossSpeciesLabelPredictionTask

Task for cross-species label prediction evaluation.

Module Contents

czbenchmarks.tasks.single_cell.cross_species_label_prediction.logger
class czbenchmarks.tasks.single_cell.cross_species_label_prediction.CrossSpeciesLabelPredictionTaskInput(/, **data: Any)

Bases: czbenchmarks.tasks.task.TaskInput

Base class for task inputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

labels: List[czbenchmarks.types.ListLike]
organisms: List[czbenchmarks.datasets.types.Organism]
sample_ids: List[czbenchmarks.types.ListLike] | None = None
aggregation_method: Literal['none', 'mean', 'median'] = 'mean'
n_folds: int = 5
class czbenchmarks.tasks.single_cell.cross_species_label_prediction.CrossSpeciesLabelPredictionOutput(/, **data: Any)

Bases: czbenchmarks.tasks.task.TaskOutput

Base class for task outputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

results: List[Dict[str, Any]]
class czbenchmarks.tasks.single_cell.cross_species_label_prediction.CrossSpeciesLabelPredictionTask(*, random_seed: int = RANDOM_SEED)

Bases: czbenchmarks.tasks.task.Task

Task for cross-species label prediction evaluation.

This task evaluates cross-species transfer by training classifiers on one species and testing them on another. It computes accuracy, F1, precision, recall, and AUROC for multiple classifiers (Logistic Regression, KNN, Random Forest).
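The train-on-one-species, test-on-another idea can be illustrated with a minimal standard-library sketch. It is not the czbenchmarks implementation: a nearest-centroid classifier stands in for the Logistic Regression, KNN, and Random Forest classifiers named above, only accuracy is computed, and the helper names (fit_centroids, predict, accuracy) and the toy embeddings are hypothetical.

```python
from collections import defaultdict
from math import dist


def fit_centroids(embeddings, labels):
    """Return {label: centroid} computed from the training embeddings."""
    grouped = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, embeddings):
    """Assign each embedding the label of its nearest centroid."""
    return [
        min(centroids, key=lambda label: dist(centroids[label], vector))
        for vector in embeddings
    ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy embeddings in a shared space (made-up values, for illustration only).
human_X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
human_y = ["T cell", "T cell", "B cell", "B cell"]
mouse_X = [[0.1, 0.2], [1.1, 1.0], [0.8, 0.9]]
mouse_y = ["T cell", "B cell", "B cell"]

centroids = fit_centroids(human_X, human_y)  # train on one species
mouse_pred = predict(centroids, mouse_X)     # test on the other species
transfer_accuracy = accuracy(mouse_y, mouse_pred)
print(mouse_pred, transfer_accuracy)
```

Swapping the two species gives the reverse transfer direction; whether the task evaluates both directions is not stated here, so treat that as an open detail.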

The task can optionally aggregate cell-level embeddings to sample/donor level before running classification.
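As a rough illustration of sample-level aggregation, the sketch below collapses cell-level embeddings to one vector per sample using the 'none', 'mean', and 'median' options listed for aggregation_method. The function aggregate_by_sample is hypothetical, it assumes all cells from a sample share one label, and the library's own aggregation logic may differ.

```python
from collections import defaultdict
from statistics import mean, median


def aggregate_by_sample(embeddings, labels, sample_ids, method="mean"):
    """Collapse cell-level rows to one row per sample.

    Assumes every cell from a given sample carries the same label.
    """
    if method == "none":
        # No aggregation: classification runs on individual cells.
        return list(embeddings), list(labels)
    reducer = {"mean": mean, "median": median}[method]

    cells = defaultdict(list)
    sample_label = {}
    for vector, label, sample in zip(embeddings, labels, sample_ids):
        cells[sample].append(vector)
        sample_label[sample] = label

    # Reduce each embedding dimension across the cells of a sample.
    aggregated = [
        [reducer(column) for column in zip(*vectors)]
        for vectors in cells.values()
    ]
    return aggregated, [sample_label[sample] for sample in cells]


# Toy data (made-up values): two cells from donor d1, one from donor d2.
cell_X = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
cell_y = ["healthy", "healthy", "disease"]
donors = ["d1", "d1", "d2"]

sample_X, sample_y = aggregate_by_sample(cell_X, cell_y, donors, method="mean")
print(sample_X, sample_y)
```

Aggregating first reduces the number of training rows to the number of samples, which matters when choosing a fold count for cross-validation.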

Parameters:

random_seed (int) – Random seed for reproducibility

display_name = 'cross-species label prediction'
requires_multiple_datasets = True
abstract compute_baseline(**kwargs)

Set a baseline for cross-species label prediction.

This method is not implemented for cross-species prediction tasks because standard preprocessing workflows need to be applied per species.

Raises:

NotImplementedError – Always raised, because no baseline is implemented for this task
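Because compute_baseline always raises, code that loops over several tasks may want to guard the call. The sketch below shows one way to do that; StandInTask is a hypothetical stand-in that only mimics the documented behavior so the example runs without czbenchmarks installed, and baseline_or_none is likewise an illustrative helper, not part of the library.

```python
class StandInTask:
    """Hypothetical stand-in mimicking the documented compute_baseline behavior."""

    def compute_baseline(self, **kwargs):
        raise NotImplementedError(
            "Baseline not implemented for cross-species label prediction"
        )


def baseline_or_none(task, **kwargs):
    """Return the task's baseline, or None when the task does not provide one."""
    try:
        return task.compute_baseline(**kwargs)
    except NotImplementedError:
        return None


baseline = baseline_or_none(StandInTask())
print(baseline)
```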