cellxgene/CellxGene Census 2024

Demo
H. sapiens / M. musculus · 1.2 TB · CC-BY 4.0 · v2024-07 · Updated 1 week ago
This is a demo page showing what a dataset detail page looks like on Cyanea. The data shown is illustrative.

Single-cell RNA-seq atlas — 50M+ cells, 600+ datasets from the CZ CELLxGENE project.

The CellxGene Census 2024 is a standardized aggregation of 50+ million single cells from over 600 published datasets. It provides a unified, queryable interface to the largest collection of single-cell RNA-seq data, harmonized with consistent cell type annotations and metadata.

What’s included

  • Gene expression matrices in AnnData (h5ad) and TileDB-SOMA formats
  • Harmonized cell metadata with cell ontology terms and donor information
  • Dataset-level metadata linking back to original publications
  • Pre-computed embeddings (UMAP, PCA) for visualization

Use cases

Ideal for cell type reference mapping, cross-dataset integration, and large-scale meta-analyses of gene expression. The Census API enables efficient slice-and-query access without downloading the full dataset.
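As a rough illustration of slice-and-query access, the sketch below assumes the Census API mentioned above is the `cellxgene_census` Python package from CZ CELLxGENE Discover; this page does not name the client. The obs column names (`tissue_general`, `cell_type`, `is_primary_data`) and the var column `feature_name` follow the public Census schema and may not match this listing exactly. The helper names `build_obs_filter` and `fetch_slice` are hypothetical, and `"stable"` is used as a generic version alias rather than a release taken from this page.

```python
from typing import Sequence


def build_obs_filter(tissue: str, cell_type: str, primary_only: bool = True) -> str:
    """Build a value-filter expression over cell (obs) metadata.

    Column names are assumptions based on the public Census schema.
    """
    clauses = [f"tissue_general == {tissue!r}", f"cell_type == {cell_type!r}"]
    if primary_only:
        # Skip cells that are duplicated across contributing datasets.
        clauses.append("is_primary_data == True")
    return " and ".join(clauses)


def fetch_slice(obs_filter: str, genes: Sequence[str], census_version: str = "stable"):
    """Fetch only the matching cells and genes as an AnnData object.

    Requires network access and `pip install cellxgene-census` (assumed client).
    """
    import cellxgene_census  # imported lazily so the filter helper works without it

    gene_list = ", ".join(repr(g) for g in genes)
    with cellxgene_census.open_soma(census_version=census_version) as census:
        return cellxgene_census.get_anndata(
            census,
            organism="Homo sapiens",
            obs_value_filter=obs_filter,
            var_value_filter=f"feature_name in [{gene_list}]",
        )


if __name__ == "__main__":
    obs_filter = build_obs_filter("lung", "B cell")
    print(obs_filter)
    # Uncomment to run the remote query (downloads only the requested slice):
    # adata = fetch_slice(obs_filter, ["CD19", "MS4A1"])
    # print(adata)
```

Keeping the filter construction separate from the remote call makes the query easy to inspect before any data is transferred.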

Files

census_data/homo_sapiens/ 820 GB
census_data/mus_musculus/ 340 GB
cell_metadata.parquet 2.8 GB
dataset_metadata.parquet 1.2 MB
README.md 7.6 KB
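As a quick consistency check, the listed sizes can be summed and compared with the 1.2 TB shown in the page header. This is a minimal sketch that assumes decimal units (1 TB = 1000 GB), which the page does not state.

```python
# Sizes copied from the Files listing; decimal units are an assumption.
UNIT_BYTES = {"KB": 1e3, "MB": 1e6, "GB": 1e9, "TB": 1e12}

FILES = {
    "census_data/homo_sapiens/": (820, "GB"),
    "census_data/mus_musculus/": (340, "GB"),
    "cell_metadata.parquet": (2.8, "GB"),
    "dataset_metadata.parquet": (1.2, "MB"),
    "README.md": (7.6, "KB"),
}


def total_bytes(files: dict) -> float:
    """Sum the listed sizes after converting each one to bytes."""
    return sum(size * UNIT_BYTES[unit] for size, unit in files.values())


total_tb = total_bytes(FILES) / UNIT_BYTES["TB"]
print(f"{total_tb:.2f} TB")
```

Under that assumption the total comes to about 1.16 TB, which rounds to the 1.2 TB headline figure.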

Formats

h5ad, TileDB-SOMA, Parquet

Tags

single-cell, RNA-seq, cell atlas, transcriptomics