cellxgene/CellxGene Census 2024
Demo
Single-cell RNA-seq atlas: 50M+ cells, 600+ datasets from the CZ CELLxGENE project.
The CellxGene Census 2024 is a standardized aggregation of 50+ million single cells from over 600 published datasets. It provides a unified, queryable interface to the largest collection of single-cell RNA-seq data, harmonized with consistent cell type annotations and metadata.
What’s included
- Gene expression matrices in AnnData (h5ad) and TileDB-SOMA formats
- Harmonized cell metadata with cell ontology terms and donor information
- Dataset-level metadata linking back to original publications
- Pre-computed embeddings (UMAP, PCA) for visualization
Use cases
The Census is well suited to cell type reference mapping, cross-dataset integration, and large-scale meta-analyses of gene expression. The Census API supports slice-and-query access, so you can retrieve only the cells and genes you need without downloading the full dataset.
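As a rough illustration of slice-and-query access, the sketch below assumes the Census API is the `cellxgene_census` Python package and uses its `open_soma` and `get_anndata` calls. The helper names `build_value_filter` and `fetch_slice` are hypothetical, and the column names (`cell_type`, `tissue_general`, `feature_name`) and the `"stable"` version tag are assumptions drawn from the public Census schema, not from this listing.

```python
from typing import Mapping, Sequence


def build_value_filter(criteria: Mapping[str, Sequence[str]]) -> str:
    """Build a value-filter expression from {column: allowed values}.

    Hypothetical helper: each column becomes a "column in [...]" clause,
    and the clauses are joined with "and".
    """
    clauses = []
    for column, values in criteria.items():
        quoted = ", ".join(repr(str(v)) for v in values)
        clauses.append(f"{column} in [{quoted}]")
    return " and ".join(clauses)


def fetch_slice(
    obs_criteria: Mapping[str, Sequence[str]],
    gene_names: Sequence[str],
    census_version: str = "stable",  # assumption: a release tag of your choice
):
    """Fetch only the matching cells and genes as an AnnData object.

    Requires network access and the third-party cellxgene_census package,
    which is imported lazily so the helper above works without it.
    """
    import cellxgene_census

    with cellxgene_census.open_soma(census_version=census_version) as census:
        return cellxgene_census.get_anndata(
            census,
            organism="Homo sapiens",
            obs_value_filter=build_value_filter(obs_criteria),
            var_value_filter=build_value_filter({"feature_name": gene_names}),
        )
```

Under these assumptions, a call such as `fetch_slice({"cell_type": ["B cell"], "tissue_general": ["lung"]}, ["CD19", "MS4A1"])` would transfer only the selected cells and genes rather than the full expression matrices listed under Files.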
Files
- census_data/homo_sapiens/ (820 GB)
- census_data/mus_musculus/ (340 GB)
- cell_metadata.parquet (2.8 GB)
- dataset_metadata.parquet (1.2 MB)
- README.md (7.6 KB)