czbenchmarks.datasets.single_cell_labeled
Attributes
Classes
- SingleCellLabeledDataset: Single cell dataset containing gene expression data and a label column.
Module Contents
- czbenchmarks.datasets.single_cell_labeled.logger
- class czbenchmarks.datasets.single_cell_labeled.SingleCellLabeledDataset(path: pathlib.Path, organism: czbenchmarks.datasets.types.Organism, label_column_key: str = 'cell_type', task_inputs_dir: pathlib.Path | None = None)
Bases: czbenchmarks.datasets.single_cell.SingleCellDataset
Single cell dataset containing gene expression data and a label column.
This class extends SingleCellDataset to include a label column that contains the expected prediction values for each cell. The labels are extracted from the specified column in adata.obs and stored as a pd.Series in the labels attribute.
- labels
Extracted labels for each cell.
- Type:
pd.Series
Initialize a SingleCellLabeledDataset instance.
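- Parameters:
path (pathlib.Path)
organism (czbenchmarks.datasets.types.Organism)
label_column_key (str, default 'cell_type'): Key of the column in adata.obs from which the labels are extracted.
task_inputs_dir (pathlib.Path | None, default None)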
- labels: pandas.Series
- load_data() -> None
Load the dataset and extract labels.
This method loads the dataset using the parent class’s load_data method and extracts the labels from the specified column in adata.obs.
- Populates:
labels (pd.Series): Extracted labels for each cell.
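The load-then-extract pattern described for load_data can be illustrated with a minimal, library-free sketch. ToyDataset and ToyLabeledDataset are hypothetical stand-ins, not the czbenchmarks classes: a plain dict of columns plays the role of adata.obs, and a list plays the role of the pd.Series stored in labels.

```python
class ToyDataset:
    """Hypothetical stand-in for a base dataset; not the czbenchmarks class."""

    def __init__(self, rows):
        self._rows = rows  # raw per-cell records, one dict per cell
        self.obs = None    # per-cell metadata, populated by load_data()

    def load_data(self):
        # Pivot the per-cell records into a mapping of column -> values.
        columns = {}
        for row in self._rows:
            for key, value in row.items():
                columns.setdefault(key, []).append(value)
        self.obs = columns


class ToyLabeledDataset(ToyDataset):
    """Adds a labels attribute taken from one metadata column."""

    def __init__(self, rows, label_column_key="cell_type"):
        super().__init__(rows)
        self.label_column_key = label_column_key
        self.labels = None

    def load_data(self):
        # Let the parent load the data first, then extract the labels.
        super().load_data()
        if self.label_column_key not in self.obs:
            raise KeyError(f"label column {self.label_column_key!r} not found")
        self.labels = list(self.obs[self.label_column_key])


dataset = ToyLabeledDataset(
    [
        {"cell_type": "T cell", "batch": "a"},
        {"cell_type": "B cell", "batch": "b"},
    ]
)
dataset.load_data()
print(dataset.labels)
```

The missing-column check is an assumption of this sketch; the source does not say how the real class reacts when label_column_key is absent from adata.obs.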
- store_task_inputs() -> pathlib.Path
Store task-specific inputs, such as cell type annotations.
This method stores the extracted labels in a JSON file. The filename is dynamically generated based on the label_column_key.
- Returns:
Path to the directory storing the task input files.
- Return type:
Path
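As a rough illustration of the behavior described for store_task_inputs, the sketch below writes labels to a JSON file named after the label column and returns the containing directory. store_labels is a hypothetical helper; the "<label_column_key>.json" naming scheme and the flat-list JSON layout are assumptions, since the source only says the filename is generated from label_column_key.

```python
import json
import tempfile
from pathlib import Path


def store_labels(labels, label_column_key, task_inputs_dir):
    """Write labels to <task_inputs_dir>/<label_column_key>.json.

    Hypothetical helper; the file naming and JSON layout are assumed.
    Returns the directory that holds the task input files.
    """
    task_inputs_dir = Path(task_inputs_dir)
    task_inputs_dir.mkdir(parents=True, exist_ok=True)
    out_file = task_inputs_dir / f"{label_column_key}.json"
    out_file.write_text(json.dumps(list(labels)))
    return task_inputs_dir


with tempfile.TemporaryDirectory() as tmp:
    out_dir = store_labels(
        ["T cell", "B cell"], "cell_type", Path(tmp) / "task_inputs"
    )
    # Read the file back to confirm the labels round-trip through JSON.
    stored = json.loads((out_dir / "cell_type.json").read_text())
    out_dir_name = out_dir.name
    print(stored)
```

Returning the directory rather than the file path mirrors the documented return value, which is the path to the directory storing the task input files.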