❗ R API in beta.
CZ CELLxGENE Discover Census¶
The Census provides efficient computational tooling to access, query, and analyze all single-cell RNA data from CZ CELLxGENE Discover. Using a new access paradigm of cell-based slicing and querying, you can interact with the data through TileDB-SOMA, or get slices in AnnData or Seurat objects, thus accelerating your research by significantly minimizing data harmonization.
Coming soon: R tutorials.
Citing the Census¶
Please follow the citation guidelines offered by CZ CELLxGENE Discover.
The Census is a data object publicly hosted online and an API to open it. The object is built using the SOMA API specification and data model, and it is implemented via TileDB-SOMA. As such, the Census has all the data capabilities offered by TileDB-SOMA including:
Data access at scale
Cloud-based data access.
Efficient access for larger-than-memory slices of data.
Query and access data based on cell or gene metadata at low latency.
Interoperability with existing single-cell toolkits
Interoperability with existing Python or R data structures
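As an illustration of querying by cell or gene metadata, here is a minimal sketch, assuming the Python package `cellxgene_census` is installed and the network is reachable. `build_value_filter` and `fetch_slice` are hypothetical helpers introduced here. The column names and values (`tissue`, `sex`) are assumptions about the schema, not something this page states.

```python
def build_value_filter(**equals):
    """Join `column == value` tests into one filter expression string.

    Hypothetical helper for illustration; not part of the Census API.
    """
    return " and ".join(f"{column} == {value!r}" for column, value in equals.items())


def fetch_slice(obs_filter, var_filter=None, organism="Homo sapiens"):
    """Return the cells matching the filters as an AnnData object.

    Needs the cellxgene_census package and network access, so the import
    is kept inside the function and nothing is downloaded at import time.
    """
    import cellxgene_census

    with cellxgene_census.open_soma() as census:
        return cellxgene_census.get_anndata(
            census,
            organism=organism,
            obs_value_filter=obs_filter,  # filter on cell metadata
            var_value_filter=var_filter,  # filter on gene metadata
        )


# Build a cell-metadata filter; the columns and values are assumed examples.
obs_filter = build_value_filter(tissue="lung", sex="female")
print(obs_filter)
```

Passing `obs_filter` to `fetch_slice` would retrieve only the matching cells rather than the whole Census, which is the point of slicing on metadata before loading data into memory.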
Census Data and Schema¶
A description of the Census data and its schema is detailed here.
⚠️ Note that the data includes:
Full-gene sequencing read counts (e.g. Smart-Seq2) and molecule counts (e.g. 10X).
Duplicate cells present across multiple datasets. These can be filtered in or out using the cell metadata variable
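The duplicate-cell filtering can be sketched on toy records. This assumes the flag is a boolean cell metadata column, here called `is_primary_data`; that name is an assumption, as are the made-up cell and dataset identifiers.

```python
# Toy cell metadata: the same cell (c1) appears in two datasets.
# All identifiers are invented, and `is_primary_data` is an assumed column name.
cells = [
    {"cell_id": "c1", "dataset_id": "A", "is_primary_data": True},
    {"cell_id": "c1", "dataset_id": "B", "is_primary_data": False},
    {"cell_id": "c2", "dataset_id": "B", "is_primary_data": True},
]

# Keep only the copy flagged as primary, so each cell is counted once.
primary_cells = [cell for cell in cells if cell["is_primary_data"]]

for cell in primary_cells:
    print(cell["cell_id"], cell["dataset_id"])
```

Keeping the non-primary copies instead is the same comprehension with the condition negated.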
Census Data Releases¶
The Census data release plans are detailed here.
Starting May 15th, 2023, Census data releases with long-term support will be published every six months. These releases will be publicly accessible for at least five years. In addition, weekly releases may be published without any guarantee of permanence.
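Because long-term support releases stay available for years, an analysis can pin one release for reproducibility. The sketch below assumes the Python package `cellxgene_census` and assumes that a release is identified by its date in `YYYY-MM-DD` form, which this page does not state. `release_tag` and `open_pinned_census` are hypothetical helpers.

```python
from datetime import date


def release_tag(build_date):
    """Format a release date as a YYYY-MM-DD string (assumed tag format)."""
    return build_date.isoformat()


def open_pinned_census(build_date):
    """Open one specific Census release rather than the default one.

    Needs the cellxgene_census package and network access, so the import
    is kept inside the function.
    """
    import cellxgene_census

    return cellxgene_census.open_soma(census_version=release_tag(build_date))


# The release date mentioned above, used here only as an example tag.
first_lts = release_tag(date(2023, 5, 15))
print(first_lts)
```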
Questions, Feedback and Issues¶
Users are encouraged to submit questions and feature requests about the Census via GitHub issues.
For quick support, you can join the CZI Science Community on Slack (czi.co/science-slack) and ask questions in the
Users are encouraged to share their feedback by emailing email@example.com.
Bugs can be submitted via GitHub issues.
If you believe you have found a security issue, please disclose it by contacting firstname.lastname@example.org.
Additional FAQs can be found here.
We are currently working on creating the tooling necessary to perform data modeling at scale with seamless integration of the Census and PyTorch.
To increase the usability of the Census for research, in 2023 and 2024 we are planning to explore the following areas:
Include organism-wide normalized layers.
Include organism-wide embeddings.
On-demand information-rich subsampling.
Projects and Tools Using Census¶
If you are interested in listing a project here, please reach out to us at email@example.com.