czbenchmarks.tasks.single_cell.cross_species_label_prediction

Attributes

logger

Classes

CrossSpeciesLabelPredictionTaskInput

Pydantic model for CrossSpeciesLabelPredictionTask inputs.

CrossSpeciesLabelPredictionOutput

Pydantic model for CrossSpeciesLabelPredictionTask outputs.

CrossSpeciesLabelPredictionTask

Task for cross-species label prediction evaluation.

Module Contents

czbenchmarks.tasks.single_cell.cross_species_label_prediction.logger
class czbenchmarks.tasks.single_cell.cross_species_label_prediction.CrossSpeciesLabelPredictionTaskInput(/, **data: Any)[source]

Bases: czbenchmarks.tasks.task.TaskInput

Pydantic model for CrossSpeciesLabelPredictionTask inputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

labels: Annotated[List[czbenchmarks.types.ListLike], Field(description='List of ground truth labels for each species dataset (e.g., cell types).')]
organisms: Annotated[List[czbenchmarks.datasets.types.Organism], Field(description='List of organisms corresponding to each dataset for cross-species evaluation.')]
sample_ids: Annotated[List[czbenchmarks.types.ListLike] | None, Field(description='Optional list of sample/donor IDs for aggregation, one per dataset.')] = None
aggregation_method: Annotated[Literal['none', 'mean', 'median'], Field(description="Method to aggregate cells with the same sample_id ('none', 'mean', or 'median').")] = 'mean'
n_folds: Annotated[int, Field(description='Number of cross-validation folds for intra-species evaluation.')] = 5
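
Example of constructing the input model (a minimal sketch based on the fields above; the Organism enum members and the label/sample values shown here are illustrative assumptions, not taken from this page):

    from czbenchmarks.datasets.types import Organism
    from czbenchmarks.tasks.single_cell.cross_species_label_prediction import (
        CrossSpeciesLabelPredictionTaskInput,
    )

    # Ground-truth cell-type labels for each species dataset, in the same order
    # as the cell embeddings supplied to the task.
    human_labels = ["T cell", "B cell", "T cell"]
    mouse_labels = ["T cell", "B cell", "B cell"]

    task_input = CrossSpeciesLabelPredictionTaskInput(
        labels=[human_labels, mouse_labels],
        organisms=[Organism.HUMAN, Organism.MOUSE],  # assumed enum members
        sample_ids=[["donor_1", "donor_1", "donor_2"], ["m_1", "m_1", "m_2"]],
        aggregation_method="mean",  # aggregate cells per sample before classification
        n_folds=5,
    )
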
class czbenchmarks.tasks.single_cell.cross_species_label_prediction.CrossSpeciesLabelPredictionOutput(/, **data: Any)[source]

Bases: czbenchmarks.tasks.task.TaskOutput

Pydantic model for CrossSpeciesLabelPredictionTask outputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

results: List[Dict[str, Any]]
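
Example of consuming the output (a sketch; the keys inside each result dictionary are illustrative assumptions, since they are not documented here):

    output = CrossSpeciesLabelPredictionOutput(
        results=[
            # Illustrative entries only; the real keys and values are produced by the task.
            {"classifier": "lr", "metric": "accuracy", "value": 0.91},
            {"classifier": "knn", "metric": "macro_f1", "value": 0.87},
        ]
    )
    for row in output.results:
        print(row["classifier"], row["metric"], row["value"])
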
class czbenchmarks.tasks.single_cell.cross_species_label_prediction.CrossSpeciesLabelPredictionTask(*, random_seed: int = RANDOM_SEED)[source]

Bases: czbenchmarks.tasks.task.Task

Task for cross-species label prediction evaluation.

This task evaluates cross-species transfer by training classifiers on data from one species and testing them on another. It computes accuracy, F1, precision, recall, and AUROC for multiple classifiers (Logistic Regression, KNN, Random Forest).

The task can optionally aggregate cell-level embeddings to sample/donor level before running classification.
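
The sketch below illustrates the idea with scikit-learn: optionally mean-aggregate cell embeddings to sample level, train a classifier on one species, and evaluate it on the other. It is a conceptual illustration only, not the task's internal implementation, and the toy data and helper function are assumptions.

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, f1_score

    def aggregate_by_sample(embeddings, labels, sample_ids):
        """Mean-aggregate cell embeddings (and take the majority label) per sample."""
        df = pd.DataFrame(embeddings)
        df["sample_id"] = sample_ids
        df["label"] = labels
        grouped = df.groupby("sample_id")
        X = grouped[list(range(embeddings.shape[1]))].mean().to_numpy()
        y = grouped["label"].agg(lambda s: s.mode().iloc[0]).to_numpy()
        return X, y

    # Toy embeddings drawn from a shared cross-species embedding space (illustrative).
    rng = np.random.default_rng(0)
    X_human = rng.normal(size=(200, 16))
    y_human = rng.choice(["T cell", "B cell"], size=200)
    X_mouse = rng.normal(size=(150, 16))
    y_mouse = rng.choice(["T cell", "B cell"], size=150)

    # Optional sample/donor-level aggregation before classification.
    sample_ids_human = rng.choice([f"h_donor_{i}" for i in range(20)], size=200)
    sample_ids_mouse = rng.choice([f"m_donor_{i}" for i in range(15)], size=150)
    X_train, y_train = aggregate_by_sample(X_human, y_human, sample_ids_human)
    X_test, y_test = aggregate_by_sample(X_mouse, y_mouse, sample_ids_mouse)

    # Train on one species and evaluate the transfer on the other.
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    pred = clf.predict(X_test)
    print("accuracy:", accuracy_score(y_test, pred))
    print("macro F1:", f1_score(y_test, pred, average="macro"))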

display_name = 'cross-species label prediction'
description = 'Evaluate cross-species label prediction performance using multiple classifiers.'
input_model
baseline_model
requires_multiple_datasets = True
abstract compute_baseline(expression_data: czbenchmarks.tasks.types.CellRepresentation, baseline_input: czbenchmarks.tasks.task.NoBaselineInput = None)[source]

Set a baseline for cross-species label prediction.

Not implemented, because standard preprocessing must be applied separately for each species.