{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exploring pre-calculated summary cell counts\n", "\n", "This tutorial describes how to access pre-calculated summary cell counts. Each Census contains a top-level dataframe, `census_summary_cell_counts`, that summarizes cell counts for various cell labels. You can read it into a pandas DataFrame.\n", "\n", "**Contents**\n", "\n", "1. Fetching the `census_summary_cell_counts` dataframe.\n", "2. Creating summary counts beyond pre-calculated values.\n", "\n", "⚠️ Note that the Census RNA data includes duplicate cells present across multiple datasets. Duplicate cells can be filtered in or out using the cell metadata variable `is_primary_data`, which is described in the [Census schema](https://github.com/chanzuckerberg/cellxgene-census/blob/main/docs/cellxgene_census_schema.md#repeated-data).\n", "\n", "## Fetching the `census_summary_cell_counts` dataframe" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2023-07-28T16:17:28.143432Z", "iopub.status.busy": "2023-07-28T16:17:28.143007Z", "iopub.status.idle": "2023-07-28T16:17:31.207795Z", "shell.execute_reply": "2023-07-28T16:17:31.207159Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "The \"stable\" release is currently 2023-07-25. Specify 'census_version=\"2023-07-25\"' in future calls to open_soma() to ensure data consistency.\n" ] }, { "data": { "text/html": [ "
\n", " | organism | \n", "category | \n", "ontology_term_id | \n", "unique_cell_count | \n", "total_cell_count | \n", "label | \n", "
---|---|---|---|---|---|---|
0 | \n", "Homo sapiens | \n", "all | \n", "na | \n", "33364242 | \n", "56400873 | \n", "na | \n", "
1 | \n", "Homo sapiens | \n", "assay | \n", "EFO:0008722 | \n", "264166 | \n", "279635 | \n", "Drop-seq | \n", "
2 | \n", "Homo sapiens | \n", "assay | \n", "EFO:0008780 | \n", "25652 | \n", "51304 | \n", "inDrop | \n", "
3 | \n", "Homo sapiens | \n", "assay | \n", "EFO:0008919 | \n", "89477 | \n", "206754 | \n", "Seq-Well | \n", "
4 | \n", "Homo sapiens | \n", "assay | \n", "EFO:0008931 | \n", "78750 | \n", "188248 | \n", "Smart-seq2 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
1357 | \n", "Mus musculus | \n", "tissue_general | \n", "UBERON:0002113 | \n", "179684 | \n", "208324 | \n", "kidney | \n", "
1358 | \n", "Mus musculus | \n", "tissue_general | \n", "UBERON:0002365 | \n", "15577 | \n", "31154 | \n", "exocrine gland | \n", "
1359 | \n", "Mus musculus | \n", "tissue_general | \n", "UBERON:0002367 | \n", "37715 | \n", "130135 | \n", "prostate gland | \n", "
1360 | \n", "Mus musculus | \n", "tissue_general | \n", "UBERON:0002368 | \n", "13322 | \n", "26644 | \n", "endocrine gland | \n", "
1361 | \n", "Mus musculus | \n", "tissue_general | \n", "UBERON:0002371 | \n", "90225 | \n", "144962 | \n", "bone marrow | \n", "
1362 rows × 6 columns
\n", "\n", " | cell_type_ontology_term_id | \n", "cell_type | \n", "size | \n", "
---|---|---|---|
0 | \n", "CL:0000001 | \n", "primary cultured cell | \n", "80 | \n", "
1 | \n", "CL:0000003 | \n", "native cell | \n", "1308000 | \n", "
2 | \n", "CL:0000006 | \n", "neuronal receptor cell | \n", "2502 | \n", "
3 | \n", "CL:0000015 | \n", "male germ cell | \n", "621 | \n", "
4 | \n", "CL:0000019 | \n", "sperm | \n", "22 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "
608 | \n", "CL:4028006 | \n", "alveolar type 2 fibroblast cell | \n", "38250 | \n", "
609 | \n", "CL:4030009 | \n", "epithelial cell of proximal tubule segment 1 | \n", "777 | \n", "
610 | \n", "CL:4030011 | \n", "epithelial cell of proximal tubule segment 3 | \n", "989 | \n", "
611 | \n", "CL:4030018 | \n", "kidney connecting tubule principal cell | \n", "107 | \n", "
612 | \n", "CL:4030023 | \n", "respiratory hillock cell | \n", "10170 | \n", "
613 rows × 3 columns
\n", "