czbenchmarks.tasks.base
=======================

.. py:module:: czbenchmarks.tasks.base


Classes
-------

.. autoapisummary::

   czbenchmarks.tasks.base.BaseTask


Module Contents
---------------

.. py:class:: BaseTask

   Bases: :py:obj:`abc.ABC`

   Abstract base class for all benchmark tasks.

   Defines the interface that all tasks must implement. Tasks are responsible for:

   1. Declaring their required input/output data types
   2. Running task-specific computations
   3. Computing evaluation metrics

   Tasks should store any intermediate results as instance variables to be used in
   metric computation.

   .. py:property:: display_name
      :type: str
      :abstractmethod:

      A pretty name to use when displaying task results.

   .. py:property:: required_inputs
      :type: Set[czbenchmarks.datasets.DataType]
      :abstractmethod:

      Input data types this task requires.

      :returns: Set of DataType enums that must be present in the input data

   .. py:property:: required_outputs
      :type: Set[czbenchmarks.datasets.DataType]
      :abstractmethod:

      Output data types this task requires from models.

      :returns: Set of DataType enums that must be present in the model output data

   .. py:property:: requires_multiple_datasets
      :type: bool

      Whether this task requires multiple datasets.

   .. py:method:: validate(data: czbenchmarks.datasets.BaseDataset)

   .. py:method:: set_baseline(data: czbenchmarks.datasets.BaseDataset, **kwargs)

      Set a baseline embedding using PCA on gene expression data.

      This method performs standard preprocessing on the raw gene expression data and
      uses PCA for dimensionality reduction. It then sets the PCA embedding as the
      BASELINE model output in the dataset, which can be used for comparison with
      other model embeddings.

      :param data: BaseDataset containing AnnData with gene expression data
      :param \*\*kwargs: Additional arguments passed to run_standard_scrna_workflow

   .. py:method:: run(data: Union[czbenchmarks.datasets.BaseDataset, List[czbenchmarks.datasets.BaseDataset]], model_types: Optional[List[czbenchmarks.models.types.ModelType]] = None) -> Union[Dict[czbenchmarks.models.types.ModelType, List[czbenchmarks.metrics.types.MetricResult]], List[Dict[czbenchmarks.models.types.ModelType, List[czbenchmarks.metrics.types.MetricResult]]]]

      Run the task on input data and compute metrics.

      :param data: Single dataset or list of datasets to evaluate. Must contain the
                   required input and output data types.
      :returns: For a single dataset: dictionary mapping model types to lists of
                metric results. For multiple datasets: a list of such dictionaries,
                one per dataset.
      :raises ValueError: If data is of an invalid type or is missing required fields
      :raises ValueError: If the task requires multiple datasets but a single dataset
                          was provided
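
The following is a minimal sketch of a concrete task subclass. Only the members
documented on this page (``display_name``, ``required_inputs``, ``required_outputs``)
are filled in against the real interface; the ``DataType`` member names and the
``_run_task``/``_compute_metrics`` hooks are illustrative assumptions rather than
part of what this page documents.

.. code-block:: python

   from typing import List, Set

   from czbenchmarks.datasets import BaseDataset, DataType
   from czbenchmarks.metrics.types import MetricResult
   from czbenchmarks.tasks.base import BaseTask


   class MyEmbeddingTask(BaseTask):
       """Illustrative task that evaluates a model embedding (sketch only)."""

       @property
       def display_name(self) -> str:
           # Pretty name used when displaying task results
           return "My Embedding Task"

       @property
       def required_inputs(self) -> Set[DataType]:
           # Assumed DataType member; check czbenchmarks.datasets.DataType
           return {DataType.METADATA}

       @property
       def required_outputs(self) -> Set[DataType]:
           # Assumed DataType member; check czbenchmarks.datasets.DataType
           return {DataType.EMBEDDING}

       # The two hooks below are hypothetical names for the task-specific
       # computation and metric steps described above; this page documents only
       # the public interface (validate, set_baseline, run).
       def _run_task(self, data: BaseDataset, model_type) -> None:
           # Store intermediate results as instance variables for metric computation
           self._embedding = ...  # retrieve the model embedding from ``data``

       def _compute_metrics(self) -> List[MetricResult]:
           # Turn stored intermediate results into MetricResult objects
           return []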
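
Once a concrete task exists, the public interface documented above can be used as in
the hedged sketch below. The dataset placeholder and ``MyEmbeddingTask`` (from the
previous sketch) are assumptions for illustration; only ``set_baseline``, ``run``,
and ``display_name`` come from this page.

.. code-block:: python

   from czbenchmarks.datasets import BaseDataset

   dataset: BaseDataset = ...   # a loaded dataset with model outputs attached
   task = MyEmbeddingTask()     # illustrative subclass from the sketch above

   # Optionally add a PCA baseline computed from the raw gene expression; extra
   # keyword arguments are forwarded to run_standard_scrna_workflow.
   task.set_baseline(dataset)

   # A single dataset returns Dict[ModelType, List[MetricResult]];
   # a list of datasets returns one such dictionary per dataset.
   results = task.run(dataset)
   for model_type, metrics in results.items():
       for metric in metrics:
           print(task.display_name, model_type, metric)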