Design Overview

cz-benchmarks is designed with modularity and reproducibility in mind. Its core components include:

  • Datasets:

    Manage input data (AnnData objects, metadata) and ensure data integrity through type checking with custom DataType definitions. Support for images is planned for the future. See Datasets for more details.

  • Models:

    Models are packaged in Docker containers and follow the BaseModelImplementation interface. Each model is checked for correctness using dedicated validator classes. For more information, see Models.

  • Tasks:

    Define evaluation operations such as clustering, embedding evaluation, label prediction, and perturbation assessment. Tasks extend the BaseTask class and serve as blueprints for benchmarking. See Tasks for more details.

  • Metrics:

    A central MetricRegistry handles the registration and computation of metrics, enabling consistent and reusable evaluation criteria. See Metrics for more details.

  • Runner:

    Orchestrates the workflow: runs models in containers, serializes data automatically, and connects datasets, models, and tasks.

  • Configuration Management:

    Uses Hydra and OmegaConf to dynamically compose configurations for datasets, models, and tasks.
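The layered composition mentioned under Configuration Management can be illustrated without the real libraries. The following is a minimal sketch, not cz-benchmarks code: it uses plain dictionaries and a hypothetical `merge_configs` helper to imitate how defaults are combined with overrides. Every key and value (`dataset`, `model`, `task`, `example_dataset`, and so on) is an assumption made for illustration and does not reflect the project's actual configuration schema.

```python
from copy import deepcopy
from typing import Any, Dict

Config = Dict[str, Any]


def merge_configs(base: Config, override: Config) -> Config:
    """Recursively merge `override` into a copy of `base` (hypothetical helper)."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_configs(merged[key], value)
        else:
            # Scalars, lists, and new keys replace whatever was there.
            merged[key] = deepcopy(value)
    return merged


# Illustrative defaults; these keys are assumptions, not the project's schema.
defaults: Config = {
    "dataset": {"name": "example_dataset", "organism": "human"},
    "model": {"name": "example_model", "batch_size": 32},
    "task": {"name": "clustering"},
}

# An experiment overrides only the values it needs to change.
experiment: Config = {"model": {"batch_size": 64}, "task": {"name": "embedding"}}

config = merge_configs(defaults, experiment)
print(config["model"])  # {'name': 'example_model', 'batch_size': 64}
```

The defaults stay untouched, so the same base can be reused across experiments. In a real setup, Hydra and OmegaConf take over this merging and add features such as interpolation and command-line overrides.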

Key Design Concepts

  • Declarative Configuration: Hydra and OmegaConf centralize the configuration of datasets, models, and tasks in one place.

  • Loose Coupling: Components communicate through well-defined interfaces. This minimizes dependencies and makes testing easier.

  • Validation and Type Safety: Custom type definitions for datasets, together with model validators, ensure that input data and model outputs meet expected standards.
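The loose-coupling and validation ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: `Validator`, `TypeCheckingValidator`, `check_outputs`, and `EXPECTED_TYPES` are hypothetical names, and the `DataType` members shown here are invented stand-ins for the real enum.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Dict, List


class DataType(Enum):
    """Stand-in enum; the member names are assumptions."""
    EMBEDDING = "embedding"
    METADATA = "metadata"


# Hypothetical mapping from each data type to the Python type it must have.
EXPECTED_TYPES = {DataType.EMBEDDING: list, DataType.METADATA: dict}


class Validator(ABC):
    """Interface that callers depend on instead of a concrete class."""

    @abstractmethod
    def validate(self, outputs: Dict[DataType, Any]) -> None:
        """Raise an exception if `outputs` is not acceptable."""


class TypeCheckingValidator(Validator):
    """Checks that required outputs are present and correctly typed."""

    def __init__(self, required: List[DataType]) -> None:
        self.required = required

    def validate(self, outputs: Dict[DataType, Any]) -> None:
        for data_type in self.required:
            if data_type not in outputs:
                raise ValueError(f"missing output: {data_type.name}")
            expected = EXPECTED_TYPES[data_type]
            if not isinstance(outputs[data_type], expected):
                raise TypeError(
                    f"{data_type.name} must be {expected.__name__}, "
                    f"got {type(outputs[data_type]).__name__}"
                )


def check_outputs(validator: Validator, outputs: Dict[DataType, Any]) -> bool:
    """Knows only the Validator interface, so any validator can be swapped in."""
    try:
        validator.validate(outputs)
    except (TypeError, ValueError):
        return False
    return True


validator = TypeCheckingValidator(required=[DataType.EMBEDDING])
print(check_outputs(validator, {DataType.EMBEDDING: [0.1, 0.2]}))  # True
print(check_outputs(validator, {DataType.EMBEDDING: "not a list"}))  # False
```

Because `check_outputs` accepts anything that implements `Validator`, a test can pass in a trivial fake validator without touching datasets or containers.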

Class Diagrams

classDiagram
  ABC <|-- BaseDataset
  BaseDataset <|-- SingleCellDataset
  Enum <|-- DataType
  Enum <|-- Organism
  SingleCellDataset <|-- PerturbationSingleCellDataset
    
classDiagram
  ABC <|-- BaseModelImplementation
  ABC <|-- BaseModelValidator
  BaseModelValidator <|-- BaseModelImplementation
  BaseModelValidator <|-- BaseSingleCellValidator
    
classDiagram
  BaseTask <|-- BatchIntegrationTask
  BaseTask <|-- ClusteringTask
  BaseTask <|-- CrossSpeciesIntegrationTask
  BaseTask <|-- EmbeddingTask
  BaseTask <|-- MetadataLabelPredictionTask
  BaseTask <|-- PerturbationTask
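
The task hierarchy in the diagram above suggests a blueprint pattern: a base class fixes the order of steps and each subclass fills them in. The sketch below illustrates that pattern only. The class names `BaseTask` and `ClusteringTask` come from the diagram, but the method names (`_run_task`, `_compute_metrics`, `run`), their signatures, and the toy thresholding logic are assumptions and should not be read as the library's API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class BaseTask(ABC):
    """Blueprint: subclasses supply the two steps, `run` fixes their order."""

    @abstractmethod
    def _run_task(self, embedding: List[float]) -> List[int]:
        """Produce a prediction from the model output."""

    @abstractmethod
    def _compute_metrics(
        self, predicted: List[int], labels: List[int]
    ) -> Dict[str, float]:
        """Score the prediction against reference labels."""

    def run(self, embedding: List[float], labels: List[int]) -> Dict[str, float]:
        predicted = self._run_task(embedding)
        return self._compute_metrics(predicted, labels)


class ClusteringTask(BaseTask):
    """Toy stand-in: splits 1-D values at a threshold instead of clustering."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold

    def _run_task(self, embedding: List[float]) -> List[int]:
        return [int(value > self.threshold) for value in embedding]

    def _compute_metrics(
        self, predicted: List[int], labels: List[int]
    ) -> Dict[str, float]:
        matches = sum(p == t for p, t in zip(predicted, labels))
        return {"accuracy": matches / len(labels)}


task = ClusteringTask(threshold=0.5)
result = task.run(embedding=[0.1, 0.9, 0.8, 0.2], labels=[0, 1, 1, 1])
print(result)  # {'accuracy': 0.75}
```

Adding a new task in this style means writing one subclass with two methods; the calling code keeps using `run` and does not change.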
    
classDiagram
  BaseModel <|-- MetricInfo
  BaseModel <|-- MetricResult
  Enum <|-- MetricType
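
To show how the three metric classes in the last diagram could work with a central registry, here is a minimal sketch. It makes several assumptions: dataclasses stand in for the `BaseModel` parent shown in the diagram, the `MetricType` members and all field names are invented, and the `register` and `compute` methods are hypothetical rather than the actual `MetricRegistry` API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Sequence

MetricFunc = Callable[[Sequence[int], Sequence[int]], float]


class MetricType(Enum):
    """Stand-in enum; the member names are assumptions."""
    ACCURACY = "accuracy"
    ERROR_RATE = "error_rate"


@dataclass(frozen=True)
class MetricInfo:
    """What the registry stores for each metric (assumed fields)."""
    func: MetricFunc
    description: str


@dataclass(frozen=True)
class MetricResult:
    """What a computation returns (assumed fields)."""
    metric_type: MetricType
    value: float


class MetricRegistry:
    """Single place to register metric functions and compute them by type."""

    def __init__(self) -> None:
        self._metrics: Dict[MetricType, MetricInfo] = {}

    def register(
        self, metric_type: MetricType, func: MetricFunc, description: str = ""
    ) -> None:
        if metric_type in self._metrics:
            raise ValueError(f"{metric_type.name} is already registered")
        self._metrics[metric_type] = MetricInfo(func, description)

    def compute(
        self, metric_type: MetricType, y_true: Sequence[int], y_pred: Sequence[int]
    ) -> MetricResult:
        if metric_type not in self._metrics:
            raise KeyError(f"{metric_type.name} is not registered")
        value = self._metrics[metric_type].func(y_true, y_pred)
        return MetricResult(metric_type, value)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positions where prediction and reference agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


registry = MetricRegistry()
registry.register(MetricType.ACCURACY, accuracy, "fraction of exact matches")
result = registry.compute(MetricType.ACCURACY, [1, 0, 1, 1], [1, 0, 0, 1])
print(result.value)  # 0.75
```

Keying the registry by an enum rather than by free-form strings means a misspelled metric name fails immediately at lookup instead of silently producing no result.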