ExpressionMatrix2 Python API

Software for analysis, visualization, and clustering of gene expression data from single-cell RNA sequencing.

This page contains reference information for the Python API. See here for the complete documentation.

CellGraphVertexInfo

class ExpressionMatrix2.CellGraphVertexInfo

This class is used to give the Python code access to information about the vertices of a cell similarity graph. An object of this type contains information for a single vertex.

__init__(self: ExpressionMatrix2.CellGraphVertexInfo) → None

This constructor creates an uninitialized CellGraphVertexInfo object.

cellId

This data member contains the cell id of the cell corresponding to the vertex.

x(self: ExpressionMatrix2.CellGraphVertexInfo) → float

Returns the x coordinate of the vertex in the two-dimensional layout of the graph the vertex belongs to.

y(self: ExpressionMatrix2.CellGraphVertexInfo) → float

Returns the y coordinate of the vertex in the two-dimensional layout of the graph the vertex belongs to.

ExpressionMatrix

class ExpressionMatrix2.ExpressionMatrix

This is the top level class in the ExpressionMatrix2 module. Most high level functionality is provided by this class. Binary data files for an instance of this class are stored in a single directory on disk. They are accessed as memory mapped files.

__init__(self: ExpressionMatrix2.ExpressionMatrix, directoryName: str, allowReadOnly: bool=False) → None

Construct a new expression matrix or access an existing one. The argument is the name of the directory that contains, or will contain, binary data for the ExpressionMatrix object. If the directory does not exist, it will be created, and initialized to a new empty ExpressionMatrix without any genes and cells. If the directory already exists, it must be a directory containing a previously created ExpressionMatrix. In this case, the constructor will access the existing ExpressionMatrix.

addCell(self: ExpressionMatrix2.ExpressionMatrix, metaData: List[Tuple[str, str]], expressionCounts: List[Tuple[str, float]]) → int

Adds a cell to the system. The cell expression counts and meta data are given in Python lists. metadata is a list of tuples with meta data (name, value) pairs. expressionCounts is a list of tuples (geneName, count). See here for an example. Returns the cell id of the cell that was just added. Cell ids begin at zero and increment by one each time a cell is added.

addCellFromJson(self: ExpressionMatrix2.ExpressionMatrix, jsonString: str) → int

Adds a cell to the system. The cell expression counts and meta data are given in a JSON string. See here for information on the expected format of the JSON string. Returns the cell id of the cell that was just added. Cell ids begin at zero and increment by one each time a cell is added.

addCellMetaData(self: ExpressionMatrix2.ExpressionMatrix, cellMetaDataFileName: str) → None

Add cell meta data reading it from a csv file. Each line of the file contains cell meta data values for one cell. Cell meta data names are in the first line. The first column of the file contain cell names. All cell names must already be present.

addCells(self: ExpressionMatrix2.ExpressionMatrix, expressionCountsFileName: str, expressionCountsFileSeparators: str=', ', cellMetaDataFileName: str, cellMetaDataFileSeparators: str=', ', additionalCellMetaData: List[Tuple[str, str]]=[]) → None

Adds cells to the system, reading expression counts and cell meta data from files. See here for more information. .

addCellsFromBioHub1(self: ExpressionMatrix2.ExpressionMatrix, expressionCountsFileName: str, initialMetaDataCount: int, finalMetaDataCount: int, plateMetaDataFileName: str) → None

Add to the system cells created by the BioHub pipeline (July 2017 version, for Illumina data). See here for more information.

addCellsFromBioHub2(self: ExpressionMatrix2.ExpressionMatrix, plateFileName: str, totalExpressionCountThreshold: float) → None

Add to the system cells created by the BioHub pipeline (September 2017 version, for 10X data). See here for more information. This functon is only available in a build that includes HDF5 support.

addCellsFromBioHub3(self: ExpressionMatrix2.ExpressionMatrix, expressionCountsFileName: str, expressionCountsFileSeparators: str=',', plateMetaData: List[Tuple[str, str]]) → None

Add to the system cells created by the BioHub pipeline (November 2017 version, for Illumina data). See here for more information.

addCellsFromHdf5(self: ExpressionMatrix2.ExpressionMatrix, fileName: str, cellNamePrefix: str, cellMetaData: List[Tuple[str, str]], totalExpressionCountThreshold: float) → None

Adds cells to the system, reading expression counts from a file in HDF5 format as created by the 10X Genomics pipeline. See here for more information. This functon is only available in a build that includes HDF5 support.

addGene(self: ExpressionMatrix2.ExpressionMatrix, geneName: str) → bool

Adds a gene with the specified name. Returns True if the gene was added and False if the gene already existed.

analyzeLsh(self: ExpressionMatrix2.ExpressionMatrix, arg0: str, arg1: str, arg2: int, arg3: int, arg4: float) → None

Only intended to be used for testing. See the source code in the ExpressionMatrix2/src directory for more information.

analyzeLshSignatures(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', lshCount: int=1024, seed: int=231) → None

Only intended to be used for testing. See the source code in the ExpressionMatrix2/src directory for more information.

analyzeSimilarPairs(self: ExpressionMatrix2.ExpressionMatrix, arg0: str, arg1: float) → None

Only intended to be used for testing. See the source code in the ExpressionMatrix2/src directory for more information.

cellCount(self: ExpressionMatrix2.ExpressionMatrix) → int

Returns the total number of cells.

cellIdFromString(self: ExpressionMatrix2.ExpressionMatrix, cellString: str) → int

Returns the cell id corresponding to a given name, or invalidCellId if no cell with the given name exists.

compareSimilarPairs(self: ExpressionMatrix2.ExpressionMatrix, arg0: str, arg1: str) → None

Only intended to be used for testing. See the source code in the ExpressionMatrix2/src directory for more information.

computeCellGraphLayout(self: ExpressionMatrix2.ExpressionMatrix, graphName: str) → None

Computes the two-dimensional layout for the graph with the given name. The graph must have been previously created with a call to ExpressionMatrix2.ExpressionMatrix.createCellGraph(). This function must be called once before the first call to ExpressionMatrix2.ExpressionMatrix.getCellGraphVertices() for the graph.

computeCellSimilarity(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellId0: int, cellId1: int) → float

Returns the similarity between the expression vectors of two cells, specified by their cell ids. The similarity is computed taking into account only genes in the specifified gene set. It is computed as the Pearson correlation coefficient of the expression vectors of the two cells. The computed similarity is always between -1 and 1, and it is rarely negative.

computeLshSignatures(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', lshName: str, lshCount: int=1024, seed: int=231) → None

Compute cell LSH signatures and store them.

computeMetaDataRandIndex(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str='AllCells', metaDataName0: str, metaDataName1: str) → Tuple[float, float]

Returns the Rand Index and Adjusted Rand Index of the contingency table created using two given meta data fields.

createCellGraph(self: ExpressionMatrix2.ExpressionMatrix, graphName: str, cellSetName: str='AllCells', similarPairsName: str, similarityThreshold: float=0.5, k: int=20, keepIsolatedVertices: bool=False) → None

Create a new cell similarity graph. similarityThreshold is the minimum cell similarity required for an edge to be created. k controls the maximum connectivity of the graph. A cell graph or cell similarity graph is an undirected graph in which each vertex represents a cell. An edge between two vertices is created if the corresponding cells have similarity that exceeds a chosen threshold. In areas of high connectivity, only the edges with the highest similarity for each cell are kept.

createCellSet(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str, cellIds: List[int]) → None

Creates a new cell set cellSetName including the cells with the specified ids.

createCellSetDifference(self: ExpressionMatrix2.ExpressionMatrix, inputSetName0: str, inputSetName1: str, outputSetName: str) → None

Creates a new cell set as the difference between two cell sets. The newly created cell set consists of all cells that are in inputCellSetName0 but not in inputCellSetName1. Returns False if the operation failed because one of the input cell sets does not exist or the output cell set already exists, True otherwise.

createCellSetIntersection(self: ExpressionMatrix2.ExpressionMatrix, inputSetsNames: str, outputSetName: str) → None

Creates a new cell set as the intersection of two or more cell sets. The names of the input cell sets specified as the first argument must be separated by commas. Returns False if the operation failed because one of the input cell sets does not exist or the output cell set already exists, True otherwise.

createCellSetUnion(self: ExpressionMatrix2.ExpressionMatrix, inputSetsNames: str, outputSetName: str) → None

Creates a new cell set as the union of two or more cell sets. The names of the input cell sets specified as the first argument must be separated by commas. Returns False if the operation failed because one of the input cell sets does not exist or the output cell set already exists, True otherwise.

createCellSetUsingMetaData(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str, metaDataFieldName: str, matchString: str, useRegex: bool) → None

Creates a new cell set cellSetName consisting of all cells for which the specified metaDataName matches the given matchTarget. If useRegularExpression is False, this uses a simple string matching. Otherwise, matchTarget is treated as a regular expression to be matched. As a simple application example, suppose all cells contain a meta data name Tissue that gives the type of tissue from which the cell was obtained. Calling createCellSetUsingMetaData with metaDataName=’Tissue’, matchTarget=’Brain’, and useRegularExpression=’False’ will create a new cell set consisting of the cells for which the value of the Tissue meta data field is Brain. Regular expressions can be used to select values that satisfy a variety of criteria.

createCellSetUsingNumericMetaDataBetween(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str, metaDataFieldName: str, lowerBound: float, upperBound: float) → None

Creates a new cell set cellSetName consisting of all cells for which the specified metaDataName is numeric, greater than lowerBound, and less than upperBound.

createCellSetUsingNumericMetaDataGreaterThan(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str, metaDataFieldName: str, lowerBound: float) → None

Creates a new cell set cellSetName consisting of all cells for which the specified metaDataName is numeric and greater than lowerBound.

createCellSetUsingNumericMetaDataLessThan(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str, metaDataFieldName: str, upperBound: float) → None

Creates a new cell set cellSetName consisting of all cells for which the specified metaDataName is numeric and less than upperBound.

createClusterGraph(self: ExpressionMatrix2.ExpressionMatrix, cellGraphName: str, clusterGraphName: str, stableIterationCount: int=3, maxIterationCount: int=100, seed: int=231, minClusterSize: int=100, k: int=3, similarityThreshold: float=0.5, similarityThresholdForMerge: float=0.9) → None

Creates a new cluster graph by running label propagation clustering on an existing cell graph. A cluster graph is an undirected graph in which each vertex represents a cluster found in a cell graph. An edge between two vertices is created if the corresponding clusters have average expression vectors that are sufficiently similar. In areas of high connectivity, only the edges with the highest similarity for each cluster are kept.

createGeneGraph(self: ExpressionMatrix2.ExpressionMatrix, geneGraphName: str, geneSetName: str='AllGenes', similarGenePairsName: str, k: int, similarityThreshold: float) → None

Creates a gene similarity graph.

createGeneSetDifference(self: ExpressionMatrix2.ExpressionMatrix, inputSetName0: str, inputSetName1: str, oututSetName: str) → bool

Creates a new gene set as the difference between two gene sets. The newly created gene set consists of all genes that are in inputGeneSetName0 but not in inputGeneSetName1. Returns False if the operation failed because one of the input gene sets does not exist or the output gene set already exists, True otherwise.

createGeneSetFromGeneNames(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str, geneNames: List[str]) → None

Creates a new gene set given a list of gene names.

createGeneSetIntersection(self: ExpressionMatrix2.ExpressionMatrix, inputSetsNames: str, outputSetName: str) → bool

Creates a new gene set as the intersection of two or more gene sets. The names of the input gene sets specified as the first argument must be separated by commas. Returns False if the operation failed because one of the input gene sets does not exist or the output gene set already exists, True otherwise.

createGeneSetUnion(self: ExpressionMatrix2.ExpressionMatrix, inputSetsNames: str, outputSetName: str) → bool

Creates a new gene set as the union of two or more gene sets. The names of the input gene sets specified as the first argument must be separated by commas. Returns False if the operation failed because one of the input gene sets does not exist or the output gene set already exists, True otherwise.

createGeneSetUsingInformationContent(self: ExpressionMatrix2.ExpressionMatrix, existingGeneSetName: str='AllGenes', cellSetName: str='AllCells', normalizationMethod: ExpressionMatrix2.NormalizationMethod, geneInformationContentThreshold: float, newGeneSetName: str) → None

Creates a new gene set consisting of genes that pass an information content test. The information content, in bits, of each gene in existingGeneSetName is computed using cellSetName and the specified NormalizationMethod. The new gene set newGeneSetName is created, consisting of genes whose computed information content (in bits) exceeds the specified geneInformationContentThreshold. The information content is roughly related to the variability of the expression of that gene in the specified cell set. Genes that are widely expressed have a low information content. Genes that are very selectively expressed have a high information content. A gene that is equally expressed in all cells has zero information content. A gene that is expressed only in one cell has log2N bits of information content, where N is the number of cells in the cell set used to compute information content. A geneInformationContentThreshold of 2 bits is often effective in filtering out widely expressed genes.

createMetaDataFromClusterGraph(self: ExpressionMatrix2.ExpressionMatrix, clusterGraphName: str, metaDataName: str) → None

Creates meta data from the cluster ids stored in a cluster graph by storing the cluster ids in the specified cluster graph into a cell meta data field.

createSignatureGraph(self: ExpressionMatrix2.ExpressionMatrix, signatureGraphName: str, cellSetName: str='AllCells', lshName: str, minCellCount: int) → None

Prototype code.

createWellExpressedGeneSet(self: ExpressionMatrix2.ExpressionMatrix, inputGeneSetName: str='AllGenes', inputCellSetName: str='AllCells', outputGeneSetName: str, minCellCount: int) → None

Create a gene set consisting of genes that are expressed in at least a minimum number of cells. The resulting gene set will contain all of the genes in the specified existing gene set that are expressed in at least the specified minimum number of cells, counting only cells in the specified cell set.

downsampleCellSet(self: ExpressionMatrix2.ExpressionMatrix, inputCellSetName: str='AllCells', newCellSetName: str, probability: float, seed: int) → None

Create a new cell set by downsampling an existing cell set. Return true if successful, false if the input cell set does not exist.

explore(self: ExpressionMatrix2.ExpressionMatrix, port: int=17100, docDirectory: str='', localOnly: bool=False) → None

Starts an http server that can be used, in conjunction with a Web browser, to interact with the ExpressionMatrix object. The localOnly argument can be used to control whether the server will accept connections from everywhere (default) or only from the same machine on which the server is running.

findSimilarGenePairs0(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', normalizationMethod: ExpressionMatrix2.NormalizationMethod=NormalizationMethod.L2, similarGenePairsName: str, k: int=100, similarityThreshold: float=0.2) → None
findSimilarPairs0(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', similarPairsName: str, k: int=100, similarityThreshold: float=0.2) → None

Creates and stores a new object similarPairsName containing pairs of similar cells from cellSetName. The computation is done by directly looping over all possible cell pairs with both cells in cellSetName and performing for each pair an exact computation of the cell similarity, taking into account only genes in geneSetName. For each cell, only the best (most similar) k or fewer similar cells are stored, among those that exceed the specified similarityThreshold. This computational cost of this function grows with the square of the number of cells in cellSetName. The required computing time will typically be a few minutes for a few thousand cells or several hours for a few tens of thousands cells. When the number of cells exceeds a few thousands, it is more practical to perform an approximate computation using findSimilarPairs4.

findSimilarPairs4(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', similarPairsName: str, k: int=100, similarityThreshold: float=0.2, lshCount: int=1024, seed: int=231) → None

Like findSimilarPairs0, but uses Locality-Sensitive Hashing (LSH) with lshCount LSH hashes to speed up the computation of the similar pairs. Like findSimilarPairs0, this still loops over all possible cell pairs with both cells in cellSetName and therefore the computational cost still grows asymptotically with the square of the number of cells. However the computation is orders of magnitudes faster. The computation is approximate, and the error decreases as lshCount increases. For the suggested value lshCount=1024, the standard deviation of the pair similarity computed in this way is 0.05 or less.

findSimilarPairs4Gpu(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', lshName: str, similarPairsName: str, k: int=100, similarityThreshold: float=0.2, lshCount: int=1024, seed: int=231, kernel: int=1, blockSize: int=16) → None

Only available in a build with GPU functionality enabled.Like findSimilarPairs0, but uses Locality-Sensitive Hashing (LSH) with lshCount LSH hashes to speed up the computation of the similar pairs. Like findSimilarPairs0, this still loops over all possible cell pairs with both cells in cellSetName and therefore the computational cost still grows asymptotically with the square of the number of cells. However the computation is orders of magnitudes faster. The computation is approximate, and the error decreases as lshCount increases. For the suggested value lshCount=1024, the standard deviation of the pair similarity computed in this way is 0.05 or less.

findSimilarPairs5(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', lshName: str, similarPairsName: str, k: int=100, similarityThreshold: float=0.2, lshSliceLength: int, bucketOverflow: int=1000) → None

LSH-based computation of similar cell pairs without looping over all possible pairs of cells.Prototype code. Use findSimilarPairs4 instead.

findSimilarPairs6(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', lshName: str, similarPairsName: str, k: int=100, similarityThreshold: float=0.2, permutationCount: int, searchCount: int, permutedBitCount: int=64, seed: int=231) → None

Computation of similar cell pairs using LSH and the Charikar algorithm to avoid looping over all possible pairs of cells.Prototype code, see the code for details. Use findSimilarPairs4 instead.

findSimilarPairs7(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', lshName: str, similarPairsName: str, k: int=100, similarityThreshold: float=0.2, lshSliceLengths: List[int], maxCheck: int, log2BucketCount: int) → None

LSH-based computation of similar cell pairs without looping over all possible pairs of cells.Prototype code. Use findSimilarPairs4 instead.

findSimilarPairs7Gpu(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', lshName: str, similarPairsName: str, k: int=100, similarityThreshold: float=0.2, lshSliceLengths: List[int], maxCheck: int, log2BucketCount: int, blockSize: int) → None

LSH-based computation of similar cell pairs without looping over all possible pairs of cells.Prototype code. Use findSimilarPairs4 instead.

geneCount(self: ExpressionMatrix2.ExpressionMatrix) → int

Returns the total number of genes.

geneIdFromName(self: ExpressionMatrix2.ExpressionMatrix, geneName: str) → int

Return the numeric id of a gene given its name. Return invalidGeneId if a gene with the given name does not exist.

geneName(self: ExpressionMatrix2.ExpressionMatrix, geneId: int) → str

Return the name of a gene given its numeric id.

getCellExpressionCount(self: ExpressionMatrix2.ExpressionMatrix, cellId: int, geneId: int) → float

Returns the expression count for a given cell and gene. The returned value can be zero.

getCellExpressionCountFromGeneName(self: ExpressionMatrix2.ExpressionMatrix, cellId: int, geneName: str) → float

Returns the expression count for a given cell and gene. The returned value can be zero.

getCellExpressionCounts(self: ExpressionMatrix2.ExpressionMatrix, cellId: int) → List[Tuple[int, float]]

Returns the non-zero expression counts for a given cell. The returned list contains pairs of gene ids and the corresponding expression counts for the given cell.

getCellGraphEdges(self: ExpressionMatrix2.ExpressionMatrix, graphName: str) → List[Tuple[int, int]]

Returns the cell ids corresponding to the two vertices of each edge of the cell graph with the given name. The graph must have been previously created with a call to ExpressionMatrix2.ExpressionMatrix.createCellGraph().

getCellGraphNames(self: ExpressionMatrix2.ExpressionMatrix) → List[str]

Return a list containing the names of all currently defined cell similarity graphs.

getCellGraphVertices(self: ExpressionMatrix2.ExpressionMatrix, graphName: str) → List[ChanZuckerberg::ExpressionMatrix2::CellGraphVertexInfo]

Returns information about the vertices of the cell graph with the given name. The graph must have been previously created with a call to ExpressionMatrix2.ExpressionMatrix.createCellGraph() and its two-dimensional layout must have been computed with a call to ExpressionMatrix2.ExpressionMatrix.computeCellGraphLayout().

getCellMetaData(self: ExpressionMatrix2.ExpressionMatrix, cellId: int) → List[Tuple[str, str]]

Returns all meta data pairs (name, value) for a given cell.

getCellMetaDataValue(self: ExpressionMatrix2.ExpressionMatrix, cellId: int, metaDataName: str) → str

Returns the value of the specified meta data name for the given cell. Returns an empty string if the cell does not have that meta data name. The only meta data field that all cells are guaranteed to have is CellName.

getCellSet(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str) → List[int]

Returns the cell ids of the cells in the cell set with the specified name. If the cell set does not exists, returns an empty container.

getCellSetNames(self: ExpressionMatrix2.ExpressionMatrix) → List[str]

Returns a list containing the names of all currently defined cell sets.

getCellsExpressionCount(self: ExpressionMatrix2.ExpressionMatrix, cellIds: List[int], geneId: int) → List[float]

Returns the expression counts for a set of cells and for a given gene. Some or all of the returned values can be zero.

getCellsExpressionCountFromGeneName(self: ExpressionMatrix2.ExpressionMatrix, cellIds: List[int], geneName: str) → List[float]

Returns the expression counts for a set of cells and for a given gene. Some or all of the returned values can be zero.

getCellsExpressionCounts(self: ExpressionMatrix2.ExpressionMatrix, cellIds: List[int]) → List[List[Tuple[int, float]]]

Returns the non-zero expression counts for a given set of cells. The returned list contains, for each cell, a list of pairs of gene ids and the corresponding expression counts.

getCellsExpressionCountsForGenes(self: ExpressionMatrix2.ExpressionMatrix, cellIds: List[int], geneIds: List[int]) → List[List[Tuple[int, float]]]

Returns the non-zero expression counts for given sets of cells and genes. The returned list contains, for each cell, pairs of gene ids and the corresponding expression counts.

getCellsMetaData(self: ExpressionMatrix2.ExpressionMatrix, cellIds: List[int]) → List[List[Tuple[str, str]]]

Returns all meta data pairs (name, value) for a set of cells specified by list cellIds. Each element in the returned list corresponds to the cell id at the same position in list cellIds.

getClusterAverageExpression(self: ExpressionMatrix2.ExpressionMatrix, clusterGraphName: str, clusterId: int) → List[float]

Return the average, L2-normalized, gene expression of a cluster. The cluster is specified using the name of the cluster graph it belongs to, and its cluster id within that cluster graph. The values returned correspond one on one to the gene ids returned by ExpressionMatrix2.ExpressionMatrix.getClusterGraphGenes() for the same cluster graph. .

getClusterCells(self: ExpressionMatrix2.ExpressionMatrix, clusterGraphName: str, clusterId: int) → List[int]

Returns a list of the cell ids for the cells a cluster. The cluster is specified using the name of the cluster graph it belongs to, and its cluster id within that cluster graph.

getClusterGraphGenes(self: ExpressionMatrix2.ExpressionMatrix, clusterGraphName: str) → List[int]

Returns a list of the ids of the genes used to create a cluster graph.

getClusterGraphVertices(self: ExpressionMatrix2.ExpressionMatrix, clusterGraphName: str) → List[int]

Returns a list of the cluster ids for the vertices of an existing cluster graph. Each vertex of a cluster graph corresponds to a cluster.

getDenseExpressionMatrix(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes', cellSetName: str='AllCells', normalizationMethod: ExpressionMatrix2.NormalizationMethod=NormalizationMethod.none) → array

Get a dense representation of a subset of the expression matrix corresponding to a given gene set and cell set. The returned matrix is a numpy array with row-major memory layout (C-style), with a number of rows equal to the number of cells in the specified cell set, and a number of columns equal to the number of genes in the specified gene set. This means that, because of C-style layout, the returned matrix is indexed in python as m[cellId][geneId], where cellId and geneId are ids local to the cell set and gene set respectively (that is, they only equal global cell ids and gene ids if the function is called for the AllCells and AllGenes sets). Gene sets and cell sets store their genes and cells in order of increasing id.

getGeneGraphConnectivity(self: ExpressionMatrix2.ExpressionMatrix, geneGraphName: str) → List[List[Tuple[int, float]]]

Return the connectivity of the specified gene graph. The return vector is indexed by the local GeneId in the gene set that was used to create the gene graph. It contain pairs (local GeneId, similarity).

getGeneSetGenes(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str='AllGenes') → List[int]

Get the global gene ids of the genes in a gene set.

removeCellMetaData(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str, metaDataName: str) → None

Removes a meta data field for all cells of a given cell set.

removeCellSet(self: ExpressionMatrix2.ExpressionMatrix, cellSetName: str) → None

Removes the cell set with the specified name. If the cell set does not exists, raises an exception.

removeGeneSet(self: ExpressionMatrix2.ExpressionMatrix, geneSetName: str) → None

Removes an existing gene set.

removeSignatureGraph(self: ExpressionMatrix2.ExpressionMatrix, signatureGraphName: str) → None

Prototype code.

removeSimilarPairs(self: ExpressionMatrix2.ExpressionMatrix, similarPairsName: str) → None

Remove the similar cell pairs object with the specified name.

setGeneMetaData(self: ExpressionMatrix2.ExpressionMatrix, geneName: str, name: str, value: str) → None

Set a meta data field for a gene.

testExpressionMatrixSubset(self: ExpressionMatrix2.ExpressionMatrix, arg0: int, arg1: int) → None

For debugging/testing use only.

writeSimilarPairs(self: ExpressionMatrix2.ExpressionMatrix, arg0: str) → None

Only intended to be used for testing. See the source code in the ExpressionMatrix2/src directory for more information. For debugging/testing use only.

NormalizationMethod

class ExpressionMatrix2.NormalizationMethod

Various ways to normalize gene expressions.

  • none: no normalization.
  • L1: L1 normalization (sum of values is 1).
  • L2: L2 normalization (sum of squares of values is 1).
  • Invalid: invalid normalization.

Indices and tables