cellxgene_census.experimental.pp.highly_variable_genes
- cellxgene_census.experimental.pp.highly_variable_genes(query: ExperimentAxisQuery, n_top_genes: int = 1000, layer: str = 'raw', flavor: Literal['seurat_v3'] = 'seurat_v3', span: float = 0.3, batch_key: str | Sequence[str] | None = None, max_loess_jitter: float = 1e-06, batch_key_func: Callable[[...], Any] | None = None) → DataFrame
Identify and annotate highly variable genes contained in the query results. The API is modelled on the ScanPy scanpy.pp.highly_variable_genes API, and the returned results mimic ScanPy results. The only flavor available is the Seurat V3 method, which assumes count data in the X layer.
See https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.highly_variable_genes.html#scanpy.pp.highly_variable_genes for more information on this method.
- Parameters:
query – A tiledbsoma.ExperimentAxisQuery, specifying the obs/var selection over which genes are annotated.
n_top_genes – Number of genes to rank.
layer – X layer used, e.g., "raw".
flavor – Method used to annotate genes. Must be "seurat_v3".
span – If flavor="seurat_v3", the fraction of obs/cells used to estimate the LOESS variance model fit.
batch_key – If specified, gene selection will be done by batch and combined. Specify the obs column name, or list of column names, identifying the batches. If not specified, all gene selection is done as a single batch. If multiple batch keys are specified, and no batch_key_func is specified, the batch key will be generated by converting values to string and concatenating them.
max_loess_jitter – The maximum jitter to add to data in case of LOESS failure (can occur when the dataset has low entry counts).
batch_key_func – Optional function to create a user-defined batch key. The function will be called once per row in the obs dataframe and will receive a single argument: a pandas.Series containing the values specified in the batch_key argument.
- Returns:
A pandas.DataFrame containing annotations for all var values specified by the query argument. Annotations are identical to those produced by scanpy.pp.highly_variable_genes().
- Raises:
ValueError – if the flavor parameter is not "seurat_v3".
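As a rough illustration of how the returned frame might be consumed, the sketch below builds a small stand-in DataFrame by hand instead of querying the Census. The column names (means, variances, variances_norm, highly_variable_rank, highly_variable) are the ones ScanPy reports for flavor="seurat_v3". The index name var_id, the identifiers, and every number are invented for illustration and are assumptions, not output of this function.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the result; all values are made up for illustration.
hvg = pd.DataFrame(
    {
        "means": [0.12, 1.50, 0.03, 2.40],
        "variances": [0.20, 9.80, 0.03, 3.10],
        "variances_norm": [1.1, 4.2, 0.9, 1.6],
        "highly_variable_rank": [np.nan, 0.0, np.nan, 1.0],
        "highly_variable": [False, True, False, True],
    },
    index=pd.Index([10, 11, 12, 13], name="var_id"),  # hypothetical identifiers
)

# Keep only the rows flagged as highly variable, best-ranked first.
selected = hvg[hvg["highly_variable"]].sort_values("highly_variable_rank")
selected_ids = selected.index.to_list()
print(selected_ids)
```

The same boolean mask could then be used to restrict a var selection to the flagged genes before further analysis.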
Examples
Fetch a pandas.DataFrame containing var annotations for the query selection, using "dataset_id" as batch_key:
>>> hvg = highly_variable_genes(query, batch_key="dataset_id")
Fetch highly variable genes, using the concatenation of "dataset_id" and "donor_id" as batch_key:
>>> hvg = highly_variable_genes(query, batch_key=["dataset_id", "donor_id"])
Fetch highly variable genes, with a user-defined batch_key_func:
>>> hvg = highly_variable_genes(
...     query,
...     batch_key="donor_id",
...     batch_key_func=lambda s: "batch0" if s.donor_id == "99" else "batch1",
... )
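To make the batch-key rules described under Parameters more concrete, here is a pandas-only sketch of the two behaviours: joining several obs columns into one string key, and applying a per-row function that receives a pandas.Series. This is not the library's internal code. The helper name make_batch_keys, the concatenation without a separator, and the toy obs table are all assumptions made for illustration.

```python
from typing import Any, Callable, Optional, Sequence, Union

import pandas as pd


def make_batch_keys(
    obs: pd.DataFrame,
    batch_key: Union[str, Sequence[str]],
    batch_key_func: Optional[Callable[[pd.Series], Any]] = None,
) -> pd.Series:
    """Hypothetical helper: derive one batch label per obs row."""
    cols = [batch_key] if isinstance(batch_key, str) else list(batch_key)
    if batch_key_func is not None:
        # Called once per row with a Series holding only the batch_key columns.
        return obs[cols].apply(batch_key_func, axis=1)
    # Default: stringify each value and concatenate them
    # (joining without a separator is an assumption of this sketch).
    return obs[cols].astype(str).agg("".join, axis=1)


# Toy obs table; the values are invented.
obs = pd.DataFrame(
    {"dataset_id": ["d1", "d1", "d2"], "donor_id": ["99", "7", "99"]}
)

joined = make_batch_keys(obs, ["dataset_id", "donor_id"])
custom = make_batch_keys(
    obs, "donor_id", lambda s: "batch0" if s.donor_id == "99" else "batch1"
)
print(joined.tolist())
print(custom.tolist())
```

Because a user-supplied function runs once per obs row, a cheap function is likely preferable when the query selects many cells.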
Lifecycle
experimental