czbenchmarks.tasks.label_prediction

Attributes

logger

Classes

MetadataLabelPredictionTaskInput

Pydantic model for MetadataLabelPredictionTask inputs.

MetadataLabelPredictionOutput

Output for label prediction task.

LabelPredictionBaselineInput

This baseline uses the raw gene expression matrix as features.

MetadataLabelPredictionTask

Task for predicting labels from embeddings using cross-validation.

Module Contents

czbenchmarks.tasks.label_prediction.logger
class czbenchmarks.tasks.label_prediction.MetadataLabelPredictionTaskInput(/, **data: Any)

Bases: czbenchmarks.tasks.task.TaskInput

Pydantic model for MetadataLabelPredictionTask inputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

labels: Annotated[czbenchmarks.types.ListLike, Field(description='Ground truth labels for prediction (e.g. `obs.cell_type` from an AnnData object)')]
n_folds: Annotated[int, Field(description='Number of folds for stratified cross-validation.')] = 5
min_class_size: Annotated[int, Field(description='Minimum number of samples required for a class to be included in evaluation.')] = 10
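As a rough illustration of how `min_class_size` might interact with `labels`, the sketch below drops every class that has fewer than `min_class_size` samples before evaluation. The helper name `filter_small_classes` and the example labels are hypothetical and not part of czbenchmarks; the library's actual filtering logic may differ.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def filter_small_classes(
    labels: Sequence[Hashable], min_class_size: int = 10
) -> Tuple[List[int], List[Hashable]]:
    """Return the indices and labels of samples whose class is large enough.

    Hypothetical helper for illustration only; not a czbenchmarks function.
    """
    counts = Counter(labels)
    # Keep a sample only if its class has at least min_class_size members.
    kept = [i for i, lab in enumerate(labels) if counts[lab] >= min_class_size]
    return kept, [labels[i] for i in kept]


# Made-up labels: two classes meet the threshold, one does not.
labels = ["T cell"] * 12 + ["B cell"] * 10 + ["NK cell"] * 3
kept_idx, kept_labels = filter_small_classes(labels, min_class_size=10)
print(len(kept_idx), sorted(set(kept_labels)))
```

The returned indices can be used to subset the embedding rows so that features and labels stay aligned.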
class czbenchmarks.tasks.label_prediction.MetadataLabelPredictionOutput(/, **data: Any)

Bases: czbenchmarks.tasks.task.TaskOutput

Output for label prediction task.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

results: List[Dict[str, Any]]
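Because `results` is typed only as a list of dictionaries, its keys are not specified here. The sketch below assumes, purely for illustration, that each record carries `classifier`, `fold`, and `accuracy` keys, and shows one way such per-fold records could be averaged per classifier. The key names, the values, and the helper `mean_metric_by_classifier` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List

# Made-up records; the keys "classifier", "fold", and "accuracy" are assumptions,
# not the documented schema of MetadataLabelPredictionOutput.results.
results: List[Dict[str, Any]] = [
    {"classifier": "lr", "fold": 0, "accuracy": 0.5},
    {"classifier": "lr", "fold": 1, "accuracy": 1.0},
    {"classifier": "knn", "fold": 0, "accuracy": 0.25},
    {"classifier": "knn", "fold": 1, "accuracy": 0.75},
]


def mean_metric_by_classifier(
    records: List[Dict[str, Any]], metric: str
) -> Dict[str, float]:
    """Average one metric across folds, grouped by classifier name."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for record in records:
        grouped[record["classifier"]].append(record[metric])
    return {name: mean(values) for name, values in grouped.items()}


summary = mean_metric_by_classifier(results, "accuracy")
print(summary)
```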
class czbenchmarks.tasks.label_prediction.LabelPredictionBaselineInput(/, **data: Any)

Bases: czbenchmarks.tasks.task.BaselineInput

This baseline uses the raw gene expression matrix as features. It has no configurable parameters.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

class czbenchmarks.tasks.label_prediction.MetadataLabelPredictionTask(*, random_seed: int = RANDOM_SEED)

Bases: czbenchmarks.tasks.task.Task

Task for predicting labels from embeddings using cross-validation.

Evaluates multiple classifiers (Logistic Regression, KNN) using k-fold cross-validation. Reports standard classification metrics.
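To make the cross-validation scheme concrete, here is a minimal pure-Python sketch of stratified k-fold splitting, in which every fold receives a similar share of each class. It is not the task's implementation: the function `stratified_folds`, its `seed` default, and the toy labels are assumptions for illustration, and in practice a library routine such as scikit-learn's `StratifiedKFold` would typically be used.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_folds(
    labels: Sequence[Hashable], n_folds: int = 5, seed: int = 42
) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into n_folds (train, test) pairs, preserving class ratios.

    Hypothetical helper; the seed default is arbitrary, not czbenchmarks' RANDOM_SEED.
    """
    rng = random.Random(seed)
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)

    test_sets: List[List[int]] = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a similar share.
        for pos, idx in enumerate(indices):
            test_sets[pos % n_folds].append(idx)

    all_idx = set(range(len(labels)))
    return [(sorted(all_idx - set(test)), sorted(test)) for test in test_sets]


# Toy labels: 10 samples of class "a" and 5 of class "b".
labels = ["a"] * 10 + ["b"] * 5
folds = stratified_folds(labels, n_folds=5)
for train_idx, test_idx in folds:
    print(len(train_idx), len(test_idx))
```

Each classifier would then be fit on the training indices of a fold and scored on its test indices, with metrics collected per fold.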

display_name = 'Label Prediction'
description = 'Predict labels from embeddings using cross-validated classifiers and standard metrics.'
input_model
baseline_model
compute_baseline(expression_data: czbenchmarks.tasks.types.CellRepresentation, baseline_input: LabelPredictionBaselineInput = None) -> czbenchmarks.tasks.types.CellRepresentation

Set a baseline cell representation using raw gene expression.

This baseline uses the raw gene expression matrix directly as features.