czbenchmarks.tasks.clustering
Attributes
- logger
Classes
- ClusteringTaskInput: Base class for task inputs.
- ClusteringOutput: Output for clustering task.
- ClusteringTask: Task for evaluating clustering performance against ground truth labels.
Module Contents
- czbenchmarks.tasks.clustering.logger
- class czbenchmarks.tasks.clustering.ClusteringTaskInput(/, **data: Any)
Bases: czbenchmarks.tasks.task.TaskInput
Base class for task inputs.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- obs: Annotated[pandas.DataFrame, Field(description='Cell metadata DataFrame (e.g. the `obs` from an AnnData object).')]
- input_labels: Annotated[czbenchmarks.types.ListLike, Field(description='Ground truth labels for metric calculation (e.g. `obs.cell_type` from an AnnData object).')]
- use_rep: Annotated[str, Field(description="Data representation to use for clustering (e.g. the 'X' or obsm['X_pca'] from an AnnData object).")] = 'X'
- n_iterations: Annotated[int, Field(description='Number of iterations for the Leiden algorithm.')] = 2
- flavor: Annotated[Literal['leidenalg', 'igraph'], Field(description='Algorithm for Leiden community detection.')] = 'igraph'
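The fields and defaults listed above can be pictured with a small standard-library sketch. `ClusteringInputSketch` is a hypothetical stand-in, not the library class: the real `ClusteringTaskInput` is a pydantic model and validates its input through pydantic, whereas this sketch only mirrors the documented field names, the defaults (`'X'`, `2`, `'igraph'`), and the restriction of `flavor` to two values. A list of dicts stands in for the `obs` DataFrame.

```python
from dataclasses import dataclass
from typing import Any, Literal, Sequence, get_args

# The two documented choices for the Leiden implementation.
Flavor = Literal["leidenalg", "igraph"]


@dataclass
class ClusteringInputSketch:
    """Hypothetical stand-in mirroring the documented fields (not the library class)."""

    obs: Any                      # cell metadata (a pandas.DataFrame in the real model)
    input_labels: Sequence[Any]   # ground truth labels
    use_rep: str = "X"            # documented default
    n_iterations: int = 2         # documented default
    flavor: Flavor = "igraph"     # documented default

    def __post_init__(self) -> None:
        # Dataclasses do not enforce Literal types, so check the value by hand.
        if self.flavor not in get_args(Flavor):
            raise ValueError(f"flavor must be one of {get_args(Flavor)}")


# Only the two required fields are supplied; the rest fall back to defaults.
task_input = ClusteringInputSketch(
    obs=[{"cell_type": "T cell"}, {"cell_type": "B cell"}],
    input_labels=["T cell", "B cell"],
)
```

With the real class, an invalid `flavor` would surface as a `pydantic_core.ValidationError` rather than the `ValueError` used in this sketch.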
- class czbenchmarks.tasks.clustering.ClusteringOutput(/, **data: Any)
Bases: czbenchmarks.tasks.task.TaskOutput
Output for clustering task.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
`self` is explicitly positional-only to allow `self` as a field name.
- class czbenchmarks.tasks.clustering.ClusteringTask(*, random_seed: int = RANDOM_SEED)
Bases: czbenchmarks.tasks.task.Task
Task for evaluating clustering performance against ground truth labels.
This task clusters the embeddings and scores the resulting cluster assignments against the ground truth labels using two metrics: the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI).
- display_name = 'Clustering'
- description = 'Evaluate clustering performance against ground truth labels using ARI and NMI metrics.'
- input_model
- baseline_model
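To make the two metrics named in the task description concrete, here is a dependency-free sketch of ARI and NMI computed from a pair of label lists. The function names `adjusted_rand_index` and `normalized_mutual_info` are hypothetical, and this is not the library's implementation (the page does not say how the task computes its scores). Normalizing NMI by the arithmetic mean of the two entropies is an assumption made for the sketch.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_true, labels_pred):
    """ARI: pair-counting agreement between two partitions, corrected for chance."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: treat as trivially identical
    cells = Counter(zip(labels_true, labels_pred))  # contingency table
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_info(labels_true, labels_pred):
    """NMI: mutual information divided by the mean of the two label entropies."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)
    cells = Counter(zip(labels_true, labels_pred))
    mutual_info = sum(
        (c / n) * log(c * n / (true_counts[t] * pred_counts[p]))
        for (t, p), c in cells.items()
    )
    h_true = -sum((c / n) * log(c / n) for c in true_counts.values())
    h_pred = -sum((c / n) * log(c / n) for c in pred_counts.values())
    mean_entropy = (h_true + h_pred) / 2
    if mean_entropy == 0:
        return 1.0  # both partitions consist of a single cluster
    return mutual_info / mean_entropy


truth = ["T", "T", "B", "B"]
relabeled = [1, 1, 0, 0]   # same partition, different cluster ids
crossed = [0, 1, 0, 1]     # splits every true group in half

ari_same = adjusted_rand_index(truth, relabeled)
nmi_same = normalized_mutual_info(truth, relabeled)
ari_crossed = adjusted_rand_index(truth, crossed)
nmi_crossed = normalized_mutual_info(truth, crossed)
```

Both scores ignore the cluster ids themselves, so the relabeled partition scores at (or within rounding of) 1.0 on both metrics, while the crossed partition scores around -0.5 on ARI and 0 on NMI.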