gnomAD (Genome Aggregation Database)
The largest public database of human genetic variation, with allele frequencies from 807,162 individuals (v4.1). It is the standard source for PM2 evidence (variant absent or extremely rare in population databases) in ACMG classification.
What It Does
- Population allele frequencies across diverse ancestries
- Coverage statistics per position
- Constraint metrics (pLI, LOEUF, missense Z) per gene
- Loss-of-function variant counts
- Structural variants (SVs)
How to Use
Web
- Go to https://gnomad.broadinstitute.org
- Search gene “STRC” or variant “15-43600551-A-C”
- Check allele frequency, ancestry breakdown, and constraint
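A variant ID typed into the search box follows the pattern chromosome-position-ref-alt. As a minimal sketch, the helpers below split such an ID and build a direct link to the variant page. The `/variant/<id>?dataset=...` URL pattern and the `gnomad_r4` dataset label are assumptions about how the browser forms its links, and `parse_variant_id` and `variant_url` are hypothetical names introduced here for illustration.

```python
import re
from typing import NamedTuple

GNOMAD_BASE = "https://gnomad.broadinstitute.org"


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


# chrom-pos-ref-alt, e.g. "15-43600551-A-C"; a leading "chr" is accepted and dropped.
_VARIANT_RE = re.compile(r"^(?:chr)?([0-9]{1,2}|X|Y|M|MT)-([0-9]+)-([ACGT]+)-([ACGT]+)$")


def parse_variant_id(variant_id: str) -> Variant:
    """Split a chrom-pos-ref-alt string into its four parts."""
    match = _VARIANT_RE.match(variant_id.strip())
    if match is None:
        raise ValueError(f"not a chrom-pos-ref-alt variant ID: {variant_id!r}")
    chrom, pos, ref, alt = match.groups()
    return Variant(chrom, int(pos), ref, alt)


def variant_url(variant_id: str, dataset: str = "gnomad_r4") -> str:
    """Build a browser link; the URL pattern and dataset label are assumptions."""
    v = parse_variant_id(variant_id)
    return f"{GNOMAD_BASE}/variant/{v.chrom}-{v.pos}-{v.ref}-{v.alt}?dataset={dataset}"


print(variant_url("15-43600551-A-C"))
```

Validating the ID before building the link catches the common mistake of pasting an HGVS string such as c.4976A>C, which the variant page does not accept in this form.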
API
# GraphQL API
curl -X POST https://gnomad.broadinstitute.org/api \
-H "Content-Type: application/json" \
-d '{"query": "{ gene(gene_symbol: \"STRC\", reference_genome: GRCh38) { gnomad_constraint { pLI oe_lof } } }"}'
Python (Hail)
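For a single-gene lookup, the GraphQL request sent with curl above can also be issued from plain Python, with no Hail installation. The sketch below uses only the standard library and reuses the endpoint and query fields from the curl command. The helper names `build_constraint_payload` and `fetch_constraint` are hypothetical, and the response shape noted in the comment is an assumption based on the usual GraphQL convention.

```python
import json
import re
import urllib.request

API_URL = "https://gnomad.broadinstitute.org/api"  # endpoint taken from the curl command


def build_constraint_payload(gene_symbol: str, reference_genome: str = "GRCh38") -> dict:
    """Build the JSON body for the gene-constraint query."""
    # The values are interpolated into the query text, so restrict them to safe characters.
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", gene_symbol):
        raise ValueError(f"unexpected gene symbol: {gene_symbol!r}")
    if reference_genome not in ("GRCh37", "GRCh38"):
        raise ValueError(f"unexpected reference genome: {reference_genome!r}")
    query = (
        f'{{ gene(gene_symbol: "{gene_symbol}", reference_genome: {reference_genome}) '
        "{ gnomad_constraint { pLI oe_lof } } }"
    )
    return {"query": query}


def fetch_constraint(gene_symbol: str, timeout: float = 30.0) -> dict:
    """POST the query and return the decoded JSON response (needs network access).

    Assumed response shape: {"data": {"gene": {"gnomad_constraint": {...}}}}
    """
    body = json.dumps(build_constraint_payload(gene_symbol)).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Show the request body without touching the network.
print(json.dumps(build_constraint_payload("STRC")))
```

Calling `fetch_constraint("STRC")` sends the request. For bulk, site-level work the Hail table is the better fit, since the API is meant for individual lookups.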
import hail as hl
# Load gnomAD dataset (requires Hail setup)
gnomad = hl.read_table("gs://gcp-public-data--gnomad/release/4.1/ht/genomes/gnomad.genomes.v4.1.sites.ht")
Verified Status
VERIFIED: STRC E1659A (c.4976A>C) is absent from gnomAD (0 observations in 251,000+ alleles). This supports applying PM2_Supporting.
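Absence in a finite sample bounds the allele frequency rather than proving it is zero. As a rough worked example, the sketch below computes the one-sided 95% upper bound on the true frequency when no carriers are seen, alongside the "rule of three" shortcut. The allele number of 251,000 is the lower bound quoted above; the exact AN at this site should be read from gnomAD. The function names are hypothetical.

```python
import math


def zero_count_upper_bound(allele_number: int, confidence: float = 0.95) -> float:
    """Exact one-sided upper bound on allele frequency when 0 alleles carry the variant.

    Solves (1 - p) ** allele_number = 1 - confidence for p.
    """
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # 1 - exp(x), written with expm1 to stay accurate for very small x.
    return -math.expm1(math.log(1.0 - confidence) / allele_number)


def rule_of_three(allele_number: int) -> float:
    """Common approximation to the 95% bound: 3 / n."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return 3.0 / allele_number


AN = 251_000  # lower bound from the note, not the exact site-level allele number

print(f"exact 95% upper bound: {zero_count_upper_bound(AN):.3e}")
print(f"rule of three:         {rule_of_three(AN):.3e}")
```

Both figures come out at about 1.2 in 100,000, so the data are compatible with a very rare variant but cannot rule one out. The bound only holds where the site is well covered, which is a particular concern at STRC.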
STRC Research Usage
- STRC E1659A Conservation and Reclassification — PM2 evidence (absent)
- STRC Variant c.4976A>C — Misha — population frequency check
- STRC gene constraint: moderate (some LoF variants exist — consistent with recessive disease)
Critical Notes
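- STRC calls in gnomAD are known to be unreliable in places because short reads mismap between STRC and its pseudogene (STRCP1)
- Coverage may be low at the STRC locus, so check the per-position coverage statistics before treating absence as evidence
- Structural variants: it is not yet established whether the 98 kb deletion is represented in the gnomAD SV dataset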
Results (April 2026)
- STRC constraint scores: pLI = 0.14 (tolerant of LoF, consistent with recessive disease); missense Z-score = 7.62 (highly constrained for missense). In other words, STRC tolerates heterozygous loss of function (one working copy suffices, as expected for a recessive gene), while missense variants appear to be under strong selection.
- Still untapped: ancestry-specific frequencies, SV database search for 98kb deletion
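The reading of the constraint scores above can be sketched as a small helper. The cutoffs are commonly cited conventions (pLI of at least 0.9 for LoF intolerance, missense Z of at least 3.09 for significant missense constraint) and are assumptions here, not values taken from this note. `summarize_constraint` and `ConstraintSummary` are hypothetical names.

```python
from dataclasses import dataclass

PLI_INTOLERANT = 0.9           # assumed convention: pLI >= 0.9 flags LoF intolerance
MISSENSE_Z_CONSTRAINED = 3.09  # assumed convention: Z >= 3.09 flags missense constraint


@dataclass(frozen=True)
class ConstraintSummary:
    lof_intolerant: bool
    missense_constrained: bool

    def describe(self) -> str:
        lof = "LoF-intolerant" if self.lof_intolerant else "not flagged as LoF-intolerant"
        mis = "missense-constrained" if self.missense_constrained else "not missense-constrained"
        return f"{lof}; {mis}"


def summarize_constraint(pli: float, missense_z: float) -> ConstraintSummary:
    """Apply the two assumed cutoffs to a gene's pLI and missense Z-score."""
    if not 0.0 <= pli <= 1.0:
        raise ValueError("pLI is a probability and must lie in [0, 1]")
    return ConstraintSummary(
        lof_intolerant=pli >= PLI_INTOLERANT,
        missense_constrained=missense_z >= MISSENSE_Z_CONSTRAINED,
    )


# Scores recorded for STRC in this note.
print(summarize_constraint(pli=0.14, missense_z=7.62).describe())
```

With the STRC scores, the helper reports a gene that is not flagged as LoF-intolerant but is missense-constrained, matching the interpretation given above. Given the pseudogene mapping issues at this locus, the scores themselves may deserve a second look before being relied on.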
Connections
- ClinVar [see-also] — variant classification
- seqr [see-also] — rare disease analysis (uses gnomAD)
- STRC E1659A Conservation and Reclassification [used-in]
- STRC Pseudogene Problem [see-also] — affects gnomAD read mapping