czbenchmarks.datasets.validators

Submodules

Classes

DatasetValidator

Abstract base class for dataset validators. Not used internally by the library; provided as a convenience for validating user datasets.

SingleCellLabeledValidator

Base validator for single-cell labeled datasets.

Package Contents

class czbenchmarks.datasets.validators.DatasetValidator[source]

Bases: abc.ABC

Abstract base class for dataset validators. Not used internally by the library; provided as a convenience for validating user datasets.

Defines the interface for validating datasets against dataset requirements. Validators ensure that datasets meet dataset-specific requirements such as:

- Compatible data types
- Organism compatibility
- Feature name formats

Each validator must:

1. Define a dataset_type class variable
2. Implement the abstract methods/properties _validate_dataset, inputs, and outputs

dataset_type: ClassVar[Type[czbenchmarks.datasets.Dataset]]
classmethod __init_subclass__() → None[source]

Validate that subclasses define required class variables.

Raises:

TypeError – If required class variables are missing
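The subclass check described above can be illustrated with a minimal, self-contained sketch. It does not import czbenchmarks: SketchValidator, FakeDataset, and the error message are hypothetical stand-ins, and the library's actual implementation may differ.

```python
from abc import ABC, abstractmethod
from typing import ClassVar, Type


class FakeDataset:
    """Hypothetical stand-in for czbenchmarks.datasets.Dataset."""


class SketchValidator(ABC):
    """Hypothetical base class mirroring the documented contract."""

    # Annotation only: no attribute exists until a subclass assigns one.
    dataset_type: ClassVar[Type[FakeDataset]]

    def __init_subclass__(cls) -> None:
        super().__init_subclass__()
        # Runs once per subclass, at class-definition time.
        if not hasattr(cls, "dataset_type"):
            raise TypeError(f"{cls.__name__} must define dataset_type")

    @abstractmethod
    def _validate_dataset(self, dataset: FakeDataset) -> None:
        """Dataset-specific checks go here."""


class GoodValidator(SketchValidator):
    dataset_type = FakeDataset

    def _validate_dataset(self, dataset: FakeDataset) -> None:
        pass


def define_bad_validator() -> str:
    """Try to define a subclass without dataset_type and report the result."""
    try:
        class BadValidator(SketchValidator):
            def _validate_dataset(self, dataset: FakeDataset) -> None:
                pass
    except TypeError as exc:
        return str(exc)
    return "no error"
```

Because the check lives in __init_subclass__, a missing class variable is reported when the subclass is defined rather than when it is first instantiated.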

validate_dataset(dataset: czbenchmarks.datasets.Dataset)[source]

Validate that a dataset meets all requirements.

Checks:

1. The dataset's type matches dataset_type
2. The dataset passes dataset-specific validation

Parameters:

dataset – Dataset to validate

Raises:

ValueError – If validation fails
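As a rough illustration of the two checks, the sketch below runs a type check and then delegates to a dataset-specific hook. All class names here (CellDataset, OtherDataset, FlowValidator, NonEmptyValidator) are hypothetical, and the use of isinstance for the type check is an assumption; this page does not say how the library compares types.

```python
from abc import ABC, abstractmethod


class CellDataset:
    """Hypothetical dataset class with a single illustrative attribute."""

    def __init__(self, n_cells: int) -> None:
        self.n_cells = n_cells


class OtherDataset:
    """A second hypothetical dataset type, used to trigger a mismatch."""


class FlowValidator(ABC):
    """Hypothetical validator base showing the two-step flow."""

    dataset_type = CellDataset

    def validate_dataset(self, dataset) -> None:
        # Check 1: the dataset must be an instance of the declared type
        # (assumption: isinstance rather than an exact type comparison).
        if not isinstance(dataset, self.dataset_type):
            raise ValueError(
                f"Expected {self.dataset_type.__name__}, "
                f"got {type(dataset).__name__}"
            )
        # Check 2: delegate to the dataset-specific hook.
        self._validate_dataset(dataset)

    @abstractmethod
    def _validate_dataset(self, dataset) -> None:
        """Raise ValueError if the dataset breaks a specific requirement."""


class NonEmptyValidator(FlowValidator):
    def _validate_dataset(self, dataset) -> None:
        if dataset.n_cells == 0:
            raise ValueError("Dataset has no cells")


def check(dataset) -> str:
    """Run validation and turn the outcome into a short status string."""
    try:
        NonEmptyValidator().validate_dataset(dataset)
    except ValueError as exc:
        return f"invalid: {exc}"
    return "valid"
```

Callers that want to keep processing other datasets after a failure can wrap validate_dataset in a try/except ValueError block, as check does here.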

class czbenchmarks.datasets.validators.SingleCellLabeledValidator[source]

Bases: czbenchmarks.datasets.validators.dataset_validator.DatasetValidator

Base validator for single-cell labeled datasets.

Provides validation logic for single-cell labeled datasets, including:

- Checking if the dataset organism is supported
- Validating presence of required observation and variable keys in AnnData

dataset_type: ClassVar[type]
available_organisms: ClassVar[List[czbenchmarks.datasets.Organism]]
required_obs_keys: ClassVar[List[str]]
required_var_keys: ClassVar[List[str]]
classmethod __init_subclass__() → None[source]

Ensure required class variables are defined in subclasses.

Subclasses must define:

- available_organisms
- required_obs_keys
- required_var_keys

Raises:

TypeError – If any required class variable is missing in the subclass
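The following standalone sketch shows one way the three required class variables could drive the organism and key checks. It is not the library's code: FakeOrganism, SketchLabeledValidator, find_problems, and the key names cell_type and feature_name are hypothetical. Plain lists of column names stand in for the columns of an AnnData object's obs and var tables, and problems are returned as a list rather than raised.

```python
from enum import Enum
from typing import ClassVar, List


class FakeOrganism(Enum):
    """Hypothetical stand-in for czbenchmarks.datasets.Organism."""

    HUMAN = "human"
    MOUSE = "mouse"


class SketchLabeledValidator:
    """Hypothetical base class requiring three class variables."""

    available_organisms: ClassVar[List[FakeOrganism]]
    required_obs_keys: ClassVar[List[str]]
    required_var_keys: ClassVar[List[str]]

    _REQUIRED = ("available_organisms", "required_obs_keys", "required_var_keys")

    def __init_subclass__(cls) -> None:
        super().__init_subclass__()
        # Collect every missing class variable so the error names them all.
        missing = [name for name in cls._REQUIRED if not hasattr(cls, name)]
        if missing:
            raise TypeError(f"{cls.__name__} is missing: {', '.join(missing)}")

    def find_problems(
        self,
        organism: FakeOrganism,
        obs_columns: List[str],
        var_columns: List[str],
    ) -> List[str]:
        """Return a description of each unmet requirement (empty if none)."""
        problems = []
        if organism not in self.available_organisms:
            problems.append(f"unsupported organism: {organism.name}")
        for key in self.required_obs_keys:
            if key not in obs_columns:
                problems.append(f"missing obs key: {key}")
        for key in self.required_var_keys:
            if key not in var_columns:
                problems.append(f"missing var key: {key}")
        return problems


class HumanCellTypeValidator(SketchLabeledValidator):
    available_organisms = [FakeOrganism.HUMAN]
    required_obs_keys = ["cell_type"]      # hypothetical key name
    required_var_keys = ["feature_name"]   # hypothetical key name
```

With these definitions, HumanCellTypeValidator().find_problems(FakeOrganism.HUMAN, ["cell_type"], ["feature_name"]) returns an empty list, while a mouse dataset with no obs columns yields one entry per unmet requirement.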