broad-institute/gnomAD v4.1

Demo
H. sapiens · 12.4 TB · CC0 · v4.1 · Updated 2 days ago
This is a demo page showing what a dataset detail page looks like on Cyanea. The data shown is illustrative.

Genome Aggregation Database: 731k exomes and 76k genomes (807k individuals in total) with variant frequencies across diverse populations.

The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators that aggregates and harmonizes both exome and genome data from a wide range of large-scale sequencing projects. This release spans 730,947 exomes and 76,215 genomes, 807,162 individuals in total, from unrelated individuals sequenced as part of various disease-specific and population genetic studies.

What’s included

  • Variant frequencies across 8 major population groups
  • Loss-of-function constraint metrics for every protein-coding gene
  • Structural variants from short-read WGS
  • Mitochondrial variant calls from all available samples

Use cases

gnomAD is widely used for variant filtering in rare-disease diagnostics, estimating carrier frequencies for recessive conditions, and benchmarking variant-calling pipelines. The constraint metrics are a standard tool for gene-level interpretation in clinical genomics.
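As a rough illustration of the filtering and carrier-frequency use cases, the sketch below parses the INFO column of a single sites-VCF record, computes an allele frequency from the AC and AN fields, and applies a rarity cutoff. It assumes one alternate allele per record; the function names and the 0.1% threshold are illustrative choices, not part of the dataset. The carrier estimate uses the Hardy-Weinberg heterozygote frequency 2q(1 − q), which is an approximation that ignores population structure.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def allele_frequency(info: dict) -> float:
    """Allele frequency as AC / AN (assumes a single alternate allele)."""
    allele_count = int(info["AC"])
    allele_number = int(info["AN"])
    return allele_count / allele_number if allele_number else 0.0


def is_rare(info: dict, max_af: float = 0.001) -> bool:
    """True if the variant is at or below the (hypothetical) frequency cutoff."""
    return allele_frequency(info) <= max_af


def carrier_frequency(q: float) -> float:
    """Hardy-Weinberg heterozygote frequency for a pathogenic allele frequency q."""
    return 2 * q * (1 - q)


# Example: a variant seen 3 times among 10,000 called alleles.
record_info = parse_info("AC=3;AN=10000;AF=0.0003")
print(is_rare(record_info), carrier_frequency(0.01))
```

In practice the cutoff depends on the disease model: a dominant condition usually calls for a much stricter threshold than a recessive one.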

Files

gnomad.exomes.v4.1.sites.vcf.bgz 58.3 GB
gnomad.genomes.v4.1.sites.vcf.bgz 742 GB
gnomad.v4.1.coverage.summary.tsv.bgz 12.1 GB
gnomad.v4.1.constraint_metrics.tsv 4.2 MB
README.md 8.4 KB
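
The sites files are too large to load whole, so they are usually streamed. A minimal sketch, assuming the `.bgz` files are BGZF-compressed (BGZF is a series of concatenated gzip members, which Python's standard `gzip` module can read sequentially): the helper names `open_bgz` and `iter_vcf_records` are hypothetical, and only the eight fixed VCF columns are kept.

```python
import gzip
from typing import Dict, Iterable, Iterator

# The eight fixed columns defined by the VCF specification.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def open_bgz(path: str):
    """Open a BGZF/gzip-compressed text file for sequential reading."""
    return gzip.open(path, "rt", encoding="utf-8")


def iter_vcf_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per variant record, skipping '#' header lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        yield dict(zip(VCF_COLUMNS, fields[:8]))


# Usage (path taken from the file list above; the file must exist locally):
# with open_bgz("gnomad.exomes.v4.1.sites.vcf.bgz") as handle:
#     for record in iter_vcf_records(handle):
#         print(record["CHROM"], record["POS"], record["REF"], record["ALT"])
#         break
```

Sequential reading like this ignores the block index; region queries on a file of this size would normally go through a tabix-aware tool instead.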

Formats

VCF, TSV, Hail MatrixTable

Tags

population genetics, variant frequency, exome, genome, gnomAD