czbenchmarks.tasks.task

Attributes

logger

TASK_REGISTRY

Classes

BaselineInput

Base class for baseline inputs.

PCABaselineInput

Input for the standard PCA baseline workflow.

NoBaselineInput

A model to signify that no baseline is available for a task.

TaskInput

Base class for task inputs.

TaskOutput

Base class for task outputs.

TaskParameter

Schema for a single, discoverable parameter, including help text and list support.

TaskInfo

Schema for all discoverable information about a single benchmark task.

TaskRegistry

Production-grade registry for Task subclasses with comprehensive introspection, validation, and CLI support.

Task

Abstract base class for all benchmark tasks.

Module Contents

czbenchmarks.tasks.task.logger
class czbenchmarks.tasks.task.BaselineInput(/, **data: Any)[source]

Bases: pydantic.BaseModel

Base class for baseline inputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model_config

Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.

class czbenchmarks.tasks.task.PCABaselineInput(/, **data: Any)[source]

Bases: BaselineInput

Input for the standard PCA baseline workflow.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

n_top_genes: int = None
n_pcs: int = None
obsm_key: str = None
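
A minimal construction sketch (the field values below are illustrative, not the model's defaults, and the interpretation of each field in the comments is an assumption):

    from czbenchmarks.tasks.task import PCABaselineInput

    # Illustrative values; the model defines its own defaults for these fields.
    baseline_input = PCABaselineInput(
        n_top_genes=2000,  # assumed: number of highly variable genes kept before PCA
        n_pcs=50,          # assumed: number of principal components to compute
        obsm_key="X_pca",  # assumed: key under which the embedding is stored
    )
    print(baseline_input.model_dump())
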
class czbenchmarks.tasks.task.NoBaselineInput(/, **data: Any)[source]

Bases: BaselineInput

A model to signify that no baseline is available for a task.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

class czbenchmarks.tasks.task.TaskInput(/, **data: Any)[source]

Bases: pydantic.BaseModel

Base class for task inputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model_config

Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.

class czbenchmarks.tasks.task.TaskOutput(/, **data: Any)[source]

Bases: pydantic.BaseModel

Base class for task outputs.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model_config

Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.

class czbenchmarks.tasks.task.TaskParameter(/, **data: Any)[source]

Bases: pydantic.BaseModel

Schema for a single, discoverable parameter, including help text and list support.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

name: str
type: Any
stringified_type: str
default: Any = None
required: bool
help_text: str
is_multiple: bool
model_config

Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.

class czbenchmarks.tasks.task.TaskInfo(/, **data: Any)[source]

Bases: pydantic.BaseModel

Schema for all discoverable information about a single benchmark task.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

name: str
display_name: str
description: str
task_params: Dict[str, TaskParameter]
baseline_params: Dict[str, TaskParameter]
requires_multiple_datasets: bool
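
As a sketch of how these schemas might be consumed, the snippet below introspects a task through the module-level TASK_REGISTRY documented later in this section; the task key "clustering" is hypothetical:

    from czbenchmarks.tasks.task import TASK_REGISTRY

    info = TASK_REGISTRY.get_task_info("clustering")  # hypothetical task key
    print(f"{info.display_name}: {info.description}")
    for name, param in info.task_params.items():
        status = "required" if param.required else f"default={param.default!r}"
        print(f"  {name} ({param.stringified_type}, {status}): {param.help_text}")
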
class czbenchmarks.tasks.task.TaskRegistry[source]

Production-grade registry for Task subclasses with comprehensive introspection, validation, and CLI support.

This registry provides:
  • Automatic task discovery and registration
  • Rich parameter introspection for both Pydantic and function-based tasks
  • Multi-dataset task validation
  • CLI-friendly help text generation
  • Unified validation interface for external programs

register_task(task_class: type[Task]) → None[source]

Register a Task class and cache its metadata for efficient access.

Parameters:

task_class – The Task subclass to register

list_tasks() → List[str][source]

Return a sorted list of all available task keys.

Returns:

List of task keys that can be used to get task info or classes

get_task_info(task_name: str) → TaskInfo[source]

Get all introspected information for a given task.

Parameters:

task_name – The task key (lowercase display name with underscores)

Returns:

TaskInfo object containing all task metadata

Raises:

ValueError – If the task is not found

get_task_class(task_name: str) → Type[Task][source]

Get the Task class for a given task name.

Parameters:

task_name – The task key (lowercase display name with underscores)

Returns:

The Task class

Raises:

ValueError – If the task is not found

get_task_help(task_name: str) → str[source]

Generate a human-readable summary string of a task’s parameters.

Perfect for CLI help text generation.

Parameters:

task_name – The task key to generate help for

Returns:

Formatted help text string with task description and all parameters
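
A sketch of wiring this into CLI help output, assuming the module-level TASK_REGISTRY (documented below) is the shared registry instance:

    from czbenchmarks.tasks.task import TASK_REGISTRY

    # Print formatted help text for every registered task.
    for task_name in TASK_REGISTRY.list_tasks():
        print(TASK_REGISTRY.get_task_help(task_name))
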

validate_task_inputs(task_name: str, params: Dict[str, Any]) → TaskInput | Dict[source]

Validate and build task input parameters.

Returns a Pydantic instance if the task uses Pydantic models, otherwise a dict. Performs comprehensive validation including multi-dataset consistency checks.

Parameters:
  • task_name – The task key

  • params – Dictionary of parameter values

Returns:

Validated TaskInput instance or dict

Raises:

ValueError – If validation fails
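
A hedged sketch of validating CLI-style parameters; the task key and parameter name are hypothetical:

    from czbenchmarks.tasks.task import TASK_REGISTRY

    try:
        task_input = TASK_REGISTRY.validate_task_inputs(
            "clustering", {"n_iterations": 10}  # hypothetical key and parameter
        )
    except ValueError as exc:
        # Raised when required parameters are missing or values are invalid.
        print(f"Invalid task inputs: {exc}")
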

validate_baseline_inputs(task_name: str, params: Dict[str, Any]) → BaselineInput | Dict[source]

Validate and build baseline input parameters.

Returns a Pydantic instance if the task uses Pydantic models, otherwise a dict. Performs comprehensive validation including multi-dataset consistency checks.

Parameters:
  • task_name – The task key

  • params – Dictionary of parameter values

Returns:

Validated BaselineInput instance or dict

Raises:

ValueError – If validation fails

validate_and_build_inputs(task_name: str, model_type: str, params: Dict[str, Any]) → pydantic.BaseModel | Dict[source]

Unified method to validate and build either task or baseline inputs.

This is a convenience method that routes to the appropriate validation method. Useful for external programs that want a single interface.

Parameters:
  • task_name – The task key

  • model_type – Either “input_model” for task inputs or “baseline_model” for baseline inputs

  • params – Dictionary of parameter values

Returns:

Validated model instance or dict

Raises:

ValueError – If validation fails or model_type is invalid
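
A sketch of the unified interface routing on model_type; the task key and parameter names are hypothetical:

    from czbenchmarks.tasks.task import TASK_REGISTRY

    # Route to task-input validation ...
    task_input = TASK_REGISTRY.validate_and_build_inputs(
        "clustering", "input_model", {"n_iterations": 10}
    )
    # ... or to baseline-input validation for the same task.
    baseline_input = TASK_REGISTRY.validate_and_build_inputs(
        "clustering", "baseline_model", {"n_pcs": 50}
    )
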

czbenchmarks.tasks.task.TASK_REGISTRY
class czbenchmarks.tasks.task.Task(*, random_seed: int = RANDOM_SEED)[source]

Bases: abc.ABC

Abstract base class for all benchmark tasks.

Defines the interface that all tasks must implement. Tasks are responsible for:
1. Declaring their required input/output data types
2. Running task-specific computations
3. Computing evaluation metrics

Tasks should store any intermediate results as instance variables to be used in metric computation.

input_model: Type[TaskInput]
baseline_model: Type[BaselineInput]
random_seed = 42
requires_multiple_datasets = False
classmethod __init_subclass__(**kwargs)[source]

Automatically register task subclasses when they are defined.

compute_baseline(expression_data: czbenchmarks.tasks.types.CellRepresentation, baseline_input: PCABaselineInput = None) → czbenchmarks.tasks.types.CellRepresentation[source]

Set a baseline embedding using PCA on gene expression data.
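
A sketch of producing a PCA baseline for a concrete task; the task key is hypothetical, and it is assumed here that a NumPy array is an acceptable CellRepresentation:

    import numpy as np
    from czbenchmarks.tasks.task import TASK_REGISTRY, PCABaselineInput

    TaskClass = TASK_REGISTRY.get_task_class("clustering")  # hypothetical key
    task = TaskClass(random_seed=42)

    # Synthetic count matrix standing in for real gene expression data.
    expression_data = np.random.default_rng(0).poisson(1.0, size=(200, 500)).astype(float)
    baseline_embedding = task.compute_baseline(
        expression_data,
        baseline_input=PCABaselineInput(n_top_genes=100, n_pcs=20),
    )
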

run(cell_representation: czbenchmarks.tasks.types.CellRepresentation | List[czbenchmarks.tasks.types.CellRepresentation], task_input: TaskInput) → List[czbenchmarks.metrics.types.MetricResult][source]

Run the task on input data and compute metrics.

Parameters:
  • cell_representation – gene expression data or embedding to use for the task

  • task_input – Pydantic model with inputs for the task

Returns:

For a single embedding: a one-element list containing a single metric result for the task. For multiple embeddings: a list of metric results for each task, one per dataset.

Return type:

List[czbenchmarks.metrics.types.MetricResult]

Raises:

ValueError – If input does not match multiple embedding requirement
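
Putting the pieces together, a hedged end-to-end sketch: look the task up in the registry, validate its parameters, and run it on an embedding. The task key, parameter name, and the use of a NumPy array as a CellRepresentation are assumptions:

    import numpy as np
    from czbenchmarks.tasks.task import TASK_REGISTRY

    task_name = "clustering"  # hypothetical key; discover real keys via TASK_REGISTRY.list_tasks()
    TaskClass = TASK_REGISTRY.get_task_class(task_name)
    task = TaskClass(random_seed=42)

    # Validate parameters into the task's input model (or a plain dict for
    # function-based tasks), then run the task on a stand-in cell embedding.
    task_input = TASK_REGISTRY.validate_task_inputs(task_name, {"n_iterations": 10})
    embedding = np.random.default_rng(0).normal(size=(200, 50))

    for metric in task.run(embedding, task_input):
        print(metric)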