{ "cells": [ { "cell_type": "markdown", "id": "88812eae-6b46-48b4-a1e4-c468657d8480", "metadata": {}, "source": [ "# Generating citations for Census slices\n", "\n", "This notebook demonstrates how to generate a citation string for all datasets contained in a Census slice.\n", "\n", "**Contents**\n", "\n", "1. Requirements\n", "1. Generating citation strings\n", " 1. Via cell metadata query\n", " 1. Via an AnnData query\n", "\n", "⚠️ Note that the Census RNA data includes duplicate cells present across multiple datasets. Duplicate cells can be filtered in or out using the cell metadata variable `is_primary_data`, which is described in the [Census schema](https://github.com/chanzuckerberg/cellxgene-census/blob/main/docs/cellxgene_census_schema.md#repeated-data).\n", "\n", "## Requirements\n", "\n", "This notebook requires:\n", "\n", "- `cellxgene_census` Python package.\n", "- Census data release with [schema version](https://github.com/chanzuckerberg/cellxgene-census/blob/main/docs/cellxgene_census_schema.md) 1.3.0 or greater.\n", "\n", "## Generating citation strings\n", "\n", "First we open a handle to the Census data. To ensure we open a data release with schema version 1.3.0 or greater, we use `census_version=\"latest\"`." ] }, { "cell_type": "code", "execution_count": 1, "id": "9a5a5a92-3d78-4542-95a5-e6889f245491", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | soma_joinid | label | value |
|---|---|---|---|
| 0 | 0 | census_schema_version | 1.3.0 |
| 1 | 1 | census_build_date | 2024-01-01 |
| 2 | 2 | dataset_schema_version | 4.0.0 |
| 3 | 3 | total_cell_count | 75694072 |
| 4 | 4 | unique_cell_count | 45846761 |
| 5 | 5 | number_donors_homo_sapiens | 16292 |
| 6 | 6 | number_donors_mus_musculus | 2153 |