{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exploring the Census Datasets table\n", "\n", "This tutorial demonstrates basic use of the `census_datasets` dataframe, which contains metadata about the Census source datasets. This metadata can be joined to the cell metadata dataframe (`obs`) via the column `dataset_id`.\n", "\n", "**Contents**\n", "\n", "1. Fetching the datasets table.\n", "2. Fetching the expression data from a single dataset.\n", "3. Downloading the original source H5AD file of a dataset.\n", "\n", "⚠️ Note that the Census RNA data includes duplicate cells present across multiple datasets. Duplicate cells can be filtered in or out using the cell metadata variable `is_primary_data` which is described in the [Census schema](https://github.com/chanzuckerberg/cellxgene-census/blob/main/docs/cellxgene_census_schema.md#repeated-data).\n", "\n", "## Fetching the datasets table\n", "\n", "\n", "Each Census contains a top-level dataframe itemizing the datasets contained therein. You can read this into a `pandas.DataFrame`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2023-07-28T13:51:38.383599Z", "iopub.status.busy": "2023-07-28T13:51:38.383335Z", "iopub.status.idle": "2023-07-28T13:51:41.392248Z", "shell.execute_reply": "2023-07-28T13:51:41.391544Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "The \"stable\" release is currently 2023-07-25. Specify 'census_version=\"2023-07-25\"' in future calls to open_soma() to ensure data consistency.\n" ] }, { "data": { "text/html": [ "
| soma_joinid | collection_id | collection_name | collection_doi | dataset_id | dataset_title | dataset_h5ad_path | dataset_total_cell_count |
|---|---|---|---|---|---|---|---|
| 0 | e2c257e7-6f79-487c-b81c-39451cd4ab3c | Spatial multiomics map of trophoblast developm... | 10.1038/s41586-023-05869-0 | f171db61-e57e-4535-a06a-35d8b6ef8f2b | donor_p13_trophoblasts | f171db61-e57e-4535-a06a-35d8b6ef8f2b.h5ad | 31497 |
| 1 | e2c257e7-6f79-487c-b81c-39451cd4ab3c | Spatial multiomics map of trophoblast developm... | 10.1038/s41586-023-05869-0 | ecf2e08e-2032-4a9e-b466-b65b395f4a02 | All donors trophoblasts | ecf2e08e-2032-4a9e-b466-b65b395f4a02.h5ad | 67070 |
| 2 | e2c257e7-6f79-487c-b81c-39451cd4ab3c | Spatial multiomics map of trophoblast developm... | 10.1038/s41586-023-05869-0 | 74cff64f-9da9-4b2a-9b3b-8a04a1598040 | All donors all cell states (in vivo) | 74cff64f-9da9-4b2a-9b3b-8a04a1598040.h5ad | 286326 |
| 3 | f7cecffa-00b4-4560-a29a-8ad626b8ee08 | Mapping single-cell transcriptomes in the intr... | 10.1016/j.ccell.2022.11.001 | 5af90777-6760-4003-9dba-8f945fec6fdf | Single-cell transcriptomic datasets of Renal c... | 5af90777-6760-4003-9dba-8f945fec6fdf.h5ad | 270855 |
| 4 | 3f50314f-bdc9-40c6-8e4a-b0901ebfbe4c | Single-cell sequencing links multiregional imm... | 10.1016/j.ccell.2021.03.007 | bd65a70f-b274-4133-b9dd-0d1431b6af34 | Single-cell sequencing links multiregional imm... | bd65a70f-b274-4133-b9dd-0d1431b6af34.h5ad | 167283 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 588 | 180bff9c-c8a5-4539-b13b-ddbc00d643e6 | Molecular characterization of selectively vuln... | 10.1038/s41593-020-00764-7 | f9ad5649-f372-43e1-a3a8-423383e5a8a2 | Molecular characterization of selectively vuln... | f9ad5649-f372-43e1-a3a8-423383e5a8a2.h5ad | 8168 |
| 589 | a72afd53-ab92-4511-88da-252fb0e26b9a | Single-cell atlas of peripheral immune respons... | 10.1038/s41591-020-0944-y | 456e8b9b-f872-488b-871d-94534090a865 | Single-cell atlas of peripheral immune respons... | 456e8b9b-f872-488b-871d-94534090a865.h5ad | 44721 |
| 590 | 38833785-fac5-48fd-944a-0f62a4c23ed1 | Construction of a human cell landscape at sing... | 10.1038/s41586-020-2157-4 | 2adb1f8a-a6b1-4909-8ee8-484814e2d4bf | Construction of a human cell landscape at sing... | 2adb1f8a-a6b1-4909-8ee8-484814e2d4bf.h5ad | 598266 |
| 591 | 5d445965-6f1a-4b68-ba3a-b8f765155d3a | A molecular cell atlas of the human lung from ... | 10.1038/s41586-020-2922-4 | e04daea4-4412-45b5-989e-76a9be070a89 | Krasnow Lab Human Lung Cell Atlas, Smart-seq2 | e04daea4-4412-45b5-989e-76a9be070a89.h5ad | 9409 |
| 592 | 5d445965-6f1a-4b68-ba3a-b8f765155d3a | A molecular cell atlas of the human lung from ... | 10.1038/s41586-020-2922-4 | 8c42cfd0-0b0a-46d5-910c-fc833d83c45e | Krasnow Lab Human Lung Cell Atlas, 10X | 8c42cfd0-0b0a-46d5-910c-fc833d83c45e.h5ad | 65662 |

593 rows × 7 columns

| soma_joinid | collection_id | collection_name | collection_doi | dataset_id | dataset_title | dataset_h5ad_path | dataset_total_cell_count |
|---|---|---|---|---|---|---|---|
| 522 | 0b9d8a04-bb9d-44da-aa27-705bb65b54eb | Tabula Muris Senis | 10.1038/s41586-020-2496-1 | 0bd1a1de-3aee-40e0-b2ec-86c7a30c7149 | Bone marrow - A single-cell transcriptomic atl... | 0bd1a1de-3aee-40e0-b2ec-86c7a30c7149.h5ad | 40220 |