czbenchmarks.tasks.single_cell.perturbation_expression_prediction
=================================================================

.. py:module:: czbenchmarks.tasks.single_cell.perturbation_expression_prediction


Attributes
----------

.. autoapisummary::

   czbenchmarks.tasks.single_cell.perturbation_expression_prediction.logger


Classes
-------

.. autoapisummary::

   czbenchmarks.tasks.single_cell.perturbation_expression_prediction.PerturbationExpressionPredictionTaskInput
   czbenchmarks.tasks.single_cell.perturbation_expression_prediction.PerturbationExpressionPredictionOutput
   czbenchmarks.tasks.single_cell.perturbation_expression_prediction.PerturbationExpressionPredictionTask


Functions
---------

.. autoapisummary::

   czbenchmarks.tasks.single_cell.perturbation_expression_prediction.build_task_input_from_predictions


Module Contents
---------------

.. py:data:: logger

.. py:class:: PerturbationExpressionPredictionTaskInput(/, **data: Any)

   Bases: :py:obj:`czbenchmarks.tasks.task.TaskInput`


   Pydantic model for perturbation task inputs.

   Holds the input parameters for the PerturbationExpressionPredictionTask.
   The row and column ordering of the model predictions can optionally be provided as
   cell_index and gene_index, respectively, so the task can align a model matrix that
   is a subset of, or re-ordered relative to, the dataset adata.
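
   The alignment idea described above can be pictured with plain pandas. This is
   only a sketch under stated assumptions, not the task's actual implementation:
   the toy cell and gene identifiers and the ``pred`` matrix are hypothetical and
   stand in for real dataset and model values.

   .. code-block:: python

      import numpy as np
      import pandas as pd

      # Hypothetical dataset ordering (stands in for adata.obs_names / adata.var_names).
      dataset_cells = pd.Index(["c1", "c2", "c3", "c4"])
      dataset_genes = pd.Index(["g1", "g2", "g3"])

      # A model matrix covering a re-ordered subset of the cells and genes.
      cell_index = pd.Index(["c3", "c1"])
      gene_index = pd.Index(["g2", "g3", "g1"])
      pred = np.array([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])

      # Label the matrix with its own ordering, then put rows and columns in
      # dataset order, keeping only the cells and genes the model covered.
      pred_df = pd.DataFrame(pred, index=cell_index, columns=gene_index)
      rows = dataset_cells[dataset_cells.isin(cell_index)]
      cols = dataset_genes[dataset_genes.isin(gene_index)]
      aligned = pred_df.loc[rows, cols]

   Carrying the labels alongside the matrix means the caller does not have to
   reorder the underlying array by hand.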

   Create a new model by parsing and validating input data from keyword arguments.

   Raises :py:class:`pydantic_core.ValidationError` if the input data cannot be
   validated to form a valid model.

   ``self`` is explicitly positional-only to allow ``self`` as a field name.


   .. py:attribute:: adata
      :type:  anndata.AnnData


   .. py:attribute:: pred_effect_operation
      :type:  Literal['difference', 'ratio']
      :value: ('ratio',)


   .. py:attribute:: gene_index
      :type:  Optional[pandas.Index]
      :value: None


   .. py:attribute:: cell_index
      :type:  Optional[pandas.Index]
      :value: None


.. py:function:: build_task_input_from_predictions(predictions_adata: anndata.AnnData, dataset_adata: anndata.AnnData, pred_effect_operation: Literal['difference', 'ratio'] = 'ratio') -> PerturbationExpressionPredictionTaskInput

   Create a task input from a predictions AnnData and the dataset AnnData.

   This preserves the predictions' obs/var order so the task can align matrices
   without forcing the caller to reorder arrays.

   :param predictions_adata: The anndata containing model predictions.
   :type predictions_adata: ad.AnnData
   :param dataset_adata: The anndata object from SingleCellPerturbationDataset.
   :type dataset_adata: ad.AnnData
   :param pred_effect_operation: How to compute the predicted effect between treated
                                 and control mean predictions over genes. "difference"
                                 uses mean(treated) - mean(control) and is generally safe across scales
                                 (probabilities, z-scores, raw expression). "ratio" uses
                                 log((mean(treated) + eps) / (mean(control) + eps)), which is appropriate
                                 when the means are positive. Default is "ratio".
   :type pred_effect_operation: Literal["difference", "ratio"]
   :param gene_index: The index of the genes in the predictions AnnData.
   :type gene_index: Optional[pd.Index]
   :param cell_index: The index of the cells in the predictions AnnData.
   :type cell_index: Optional[pd.Index]
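
   The two ``pred_effect_operation`` modes can be sketched with NumPy as follows.
   This is an illustration of the formulas quoted above, not the library's code:
   the helper name ``predicted_effect`` and the value of ``eps`` are assumptions
   chosen for the example.

   .. code-block:: python

      import numpy as np

      def predicted_effect(treated, control, operation="ratio", eps=1e-8):
          """Per-gene effect between treated and control mean predictions.

          Hypothetical helper; ``eps`` is an assumed small constant.
          """
          treated_mean = np.asarray(treated, dtype=float).mean(axis=0)
          control_mean = np.asarray(control, dtype=float).mean(axis=0)
          if operation == "difference":
              return treated_mean - control_mean
          if operation == "ratio":
              # Log ratio of the means; only meaningful for positive means.
              return np.log((treated_mean + eps) / (control_mean + eps))
          raise ValueError(f"Unknown operation: {operation!r}")

      # Toy matrices with shape (cells, genes).
      treated = np.array([[2.0, 4.0], [6.0, 4.0]])
      control = np.array([[1.0, 4.0], [3.0, 4.0]])

      diff = predicted_effect(treated, control, "difference")
      ratio = predicted_effect(treated, control, "ratio")

   In this toy case the first gene's mean doubles under treatment, so its
   difference is 2 and its log ratio is close to ``log(2)``, while the second gene
   is unchanged under both operations.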


.. py:class:: PerturbationExpressionPredictionOutput(/, **data: Any)

   Bases: :py:obj:`czbenchmarks.tasks.task.TaskOutput`


   Output for perturbation task.

   Create a new model by parsing and validating input data from keyword arguments.

   Raises :py:class:`pydantic_core.ValidationError` if the input data cannot be
   validated to form a valid model.

   ``self`` is explicitly positional-only to allow ``self`` as a field name.


   .. py:attribute:: pred_mean_change_dict
      :type:  Dict[str, numpy.ndarray]


   .. py:attribute:: true_mean_change_dict
      :type:  Dict[str, numpy.ndarray]


.. py:class:: PerturbationExpressionPredictionTask(*, random_seed: int = RANDOM_SEED)

   Bases: :py:obj:`czbenchmarks.tasks.task.Task`


   Abstract base class for all benchmark tasks.

   Defines the interface that all tasks must implement. Tasks are responsible for:

   1. Declaring their required input/output data types
   2. Running task-specific computations
   3. Computing evaluation metrics

   Tasks should store any intermediate results as instance variables
   to be used in metric computation.

   :param random_seed: Random seed for reproducibility
   :type random_seed: int

   **Perturbation Expression Prediction Task.**

   This task evaluates perturbation-induced expression predictions against
   their ground truth values. This is done by calculating metrics derived
   from predicted and ground truth log fold change values for each condition.
   Currently, Spearman rank correlation is supported.
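
   A per-condition Spearman correlation of this kind might look like the sketch
   below. It uses ``scipy.stats.spearmanr`` on toy data; the condition names, the
   numbers, and the variable names are hypothetical, and the task's own metric
   code may differ.

   .. code-block:: python

      import numpy as np
      from scipy.stats import spearmanr

      # Toy per-condition effect vectors (one value per gene), shaped like
      # dictionaries that map a condition name to an array.
      pred_mean_change = {
          "geneA_KO": np.array([0.1, 0.5, -0.3, 0.9]),
          "geneB_KO": np.array([0.4, 0.2, 0.1, -0.6]),
      }
      true_mean_change = {
          "geneA_KO": np.array([0.2, 0.7, -0.1, 1.5]),
          "geneB_KO": np.array([-0.5, 0.1, 0.3, 0.8]),
      }

      # Rank-correlate predicted and ground truth changes for each condition.
      spearman_by_condition = {}
      for condition, pred in pred_mean_change.items():
          rho, _ = spearmanr(pred, true_mean_change[condition])
          spearman_by_condition[condition] = float(rho)

   Because Spearman correlation depends only on ranks, the first toy condition
   scores 1 (identical gene ordering) and the second scores -1 (reversed
   ordering), regardless of the magnitudes involved.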

   The following arguments are supplied through the task input class
   (PerturbationExpressionPredictionTaskInput) when running the task. They are
   described here for documentation purposes:

   - predictions_adata (ad.AnnData):
       The anndata containing model predictions
   - dataset_adata (ad.AnnData):
       The anndata object from SingleCellPerturbationDataset.
   - pred_effect_operation (Literal["difference", "ratio"]):
       How to compute predicted effect between treated and control mean predictions
       over genes.

       * "ratio" uses :math:`\log\left(\frac{\text{mean}(\text{treated}) + \varepsilon}{\text{mean}(\text{control}) + \varepsilon}\right)` when means are positive.

       * "difference" uses :math:`\text{mean}(\text{treated}) - \text{mean}(\text{control})` and is generally safe across scales (probabilities, z-scores, raw expression).

       Default is "ratio".
   - gene_index (Optional[pd.Index]):
       The index of the genes in the predictions AnnData.
   - cell_index (Optional[pd.Index]):
       The index of the cells in the predictions AnnData.


   :returns: Dictionaries of mean predicted and
             ground truth changes in gene expression values for each condition.
   :rtype: PerturbationExpressionPredictionOutput


   .. py:attribute:: display_name
      :value: 'Perturbation Expression Prediction'


   .. py:attribute:: description
      :value: 'Evaluate the quality of predicted changes in expression levels for genes that are...


   .. py:attribute:: input_model


   .. py:attribute:: condition_key
      :value: None


   .. py:method:: compute_baseline(**kwargs)
      :abstractmethod:


      Set a baseline embedding for perturbation expression prediction.

      This method is not implemented for perturbation expression prediction
      tasks.

      :raises NotImplementedError: Always raised, because a baseline is not implemented for this task.