STRC PE Phase 3 Allele Discrimination

Every PE3b variant-spanning nicker found in STRC PE Phase 2 PAM Expansion was audited for true allele discrimination: does the sgRNA match the EDITED genome and mismatch the MUTANT genome at the variant position? All 35 spanning candidates across 5 Cas9 variants pass the letter-match check (A on the − strand = edited base). They split sharply by where the mismatch falls: only candidates with the variant in the SEED (positions 13–20) discriminate strongly, MID candidates (positions 8–12) discriminate moderately, and DISTAL mismatches (positions 1–7) are largely tolerated by SpCas9. The Phase 2 lead ACTGAAATTGGCACCATAGC carries the variant at distal position 5, so its discrimination is WEAK. The true strong-discrimination lead for SpCas9 NGG is CCTGAGATCTTCACTGAAAT (PAM TGG, seed position 17, nick 0.5 nt from edit). With SpG NGN, a balanced compromise exists: TTCACTGAAATTGGCACCAT (mid-region position 8, nick 9.5 nt from edit). Phase 2’s conservative framing was wrong; real PE3b discrimination is a SEED-vs-distal distinction, not just “spans the variant.”

Method

  1. Loaded pe_phase2_pam_expansion.json PE3b candidate list per Cas9 variant.
  2. Computed the 1-indexed variant position within each 20-nt protospacer, read 5′→3′ (position 20 is PAM-adjacent). For a − strand protospacer with + strand coordinates [a, b] (a < b), variant position = (b − 43600551) + 1; for a + strand protospacer, variant position = (43600551 − a) + 1.
  3. Classified: SEED (13–20, mismatches strongly block SpCas9; Doench 2016), MID (8–12, moderate), DISTAL (1–7, tolerated).
  4. Checked letter at variant position = A (edited − strand base) vs C (mutant − strand base).
  5. Ranked two ways — aggressive (discrimination-first) and conservative (safety-first: prefers nick 10–80 nt to avoid concurrent-DSB risk).
  6. Reference bases: + strand 43600551 MUT=G / WT=T / EDITED=T; − strand MUT=C / WT=A / EDITED=A; pegRNA edits + strand, so useful PE3b nickers are on − strand.
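Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the code in pe_phase3_allele_discrimination.py: the function names are hypothetical, and the example coordinates are back-computed from the step 2 formula rather than read from the Phase 2 JSON.

```python
VARIANT_COORD = 43600551  # + strand coordinate of the variant (from step 2)


def variant_position(strand, a, b, variant=VARIANT_COORD):
    """1-indexed position of the variant in a 20-nt protospacer read 5'->3'.

    a and b are the + strand start/end coordinates of the protospacer (a < b).
    """
    if b - a + 1 != 20:
        raise ValueError("protospacer must span exactly 20 nt")
    if not a <= variant <= b:
        raise ValueError("protospacer does not span the variant")
    if strand == "-":
        # The 5' end of a - strand protospacer sits at the highest + coordinate.
        return (b - variant) + 1
    if strand == "+":
        return (variant - a) + 1
    raise ValueError("strand must be '+' or '-'")


def classify_region(pos):
    """Region bins as defined in step 3."""
    if 13 <= pos <= 20:
        return "seed"
    if 8 <= pos <= 12:
        return "mid"
    if 1 <= pos <= 7:
        return "distal"
    raise ValueError("position must be within 1..20")


if __name__ == "__main__":
    # Illustrative - strand window whose 5' end lies 16 nt past the variant.
    pos = variant_position("-", 43600548, 43600567)
    print(pos, classify_region(pos))
```

The length and span checks are additions for safety; whether the actual script validates its inputs this way is not stated in this note.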

Results

All useful + discriminating PE3b candidates (− strand nickers, letter A at variant = edited match)

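| Cas9 variant | Protospacer (5′→3′) | PAM | Pos in proto | Region | Grade | Nick-to-edit (nt) |
|---|---|---|---|---|---|---|
| SpCas9 NGG | CCTGAGATCTTCACTGAAAT | TGG | 17 | seed | strong | 0.5 |
| SpCas9 NGG | ACTGAAATTGGCACCATAGC | AGG | 5 | distal | weak | 12.5 |
| SpCas9 NGG | GAAATTGGCACCATAGCAGG | TGG | 2 | distal | weak | 15.5 |
| SpCas9 NGG | AAATTGGCACCATAGCAGGT | GGG | 1 | distal | weak | 16.5 |
| SpG NGN | CCTGAGATCTTCACTGAAAT | TGG | 17 | seed | strong | 0.5 |
| SpG NGN | CTGAGATCTTCACTGAAATT | GGC | 16 | seed | strong | 1.5 |
| SpG NGN | TTCACTGAAATTGGCACCAT | AGC | 8 | mid | moderate | 9.5 |
| SpG NGN | ACTGAAATTGGCACCATAGC | AGG | 5 | distal | weak | 12.5 |
| SpG NGN | CTGAAATTGGCACCATAGCA | GGT | 4 | distal | weak | 13.5 |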

(Full list with SpRY/SpCas9-NG/enCas9 in pe_phase3_allele_discrimination.json; SaCas9 NNGRRT finds zero PE3b spanners.)
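The SpCas9 and SpG rows above can be re-checked independently of the audit script. The sketch below is a hedged sanity check with hypothetical names: it confirms the letter at the stated position is A, builds the mutant-allele target by swapping that A for C, and tests the PAM against the variant's pattern. The nick-distance rule |17.5 − position| is an inference from the table (SpCas9 cuts between protospacer positions 17 and 18); the note does not state the script's formula.

```python
# (variant, protospacer 5'->3', PAM, variant position, nick-to-edit nt) from the table
ROWS = [
    ("SpCas9 NGG", "CCTGAGATCTTCACTGAAAT", "TGG", 17, 0.5),
    ("SpCas9 NGG", "ACTGAAATTGGCACCATAGC", "AGG", 5, 12.5),
    ("SpCas9 NGG", "GAAATTGGCACCATAGCAGG", "TGG", 2, 15.5),
    ("SpCas9 NGG", "AAATTGGCACCATAGCAGGT", "GGG", 1, 16.5),
    ("SpG NGN", "CCTGAGATCTTCACTGAAAT", "TGG", 17, 0.5),
    ("SpG NGN", "CTGAGATCTTCACTGAAATT", "GGC", 16, 1.5),
    ("SpG NGN", "TTCACTGAAATTGGCACCAT", "AGC", 8, 9.5),
    ("SpG NGN", "ACTGAAATTGGCACCATAGC", "AGG", 5, 12.5),
    ("SpG NGN", "CTGAAATTGGCACCATAGCA", "GGT", 4, 13.5),
]

IUPAC = {"N": "ACGT", "R": "AG"}  # codes used by the PAM patterns in this note
NICK_BOUNDARY = 17.5  # assumed convention: nick lies between positions 17 and 18


def pam_matches(pam, pattern):
    """True if each PAM base is allowed by the corresponding pattern code."""
    return len(pam) == len(pattern) and all(
        base in IUPAC.get(code, code) for base, code in zip(pam, pattern)
    )


def audit_row(variant, protospacer, pam, pos, nick_nt):
    """Re-derive the per-row checks from sequence and position alone."""
    # Same protospacer as it would read on the mutant allele (A -> C at the variant).
    mutant_target = protospacer[: pos - 1] + "C" + protospacer[pos:]
    return {
        "length_ok": len(protospacer) == 20,
        "edited_match": protospacer[pos - 1] == "A",
        "mismatches_vs_mutant": sum(x != y for x, y in zip(protospacer, mutant_target)),
        "pam_ok": pam_matches(pam, variant.split()[1]),
        "nick_ok": abs(NICK_BOUNDARY - pos) == nick_nt,
    }


if __name__ == "__main__":
    for row in ROWS:
        print(row[1], audit_row(*row))
```

Every listed row passes all five checks, which is consistent with the claim that these candidates differ only in where the single mismatch falls, not in whether it exists.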

Phase 2 lead re-classified

Phase 2 recommended ACTGAAATTGGCACCATAGC (− strand, AGG PAM, nick 12.5 nt from edit). Phase 3 places the variant at position 5 of its protospacer, which is DISTAL. Doench et al. 2016 showed SpCas9 tolerates distal single-base mismatches with near-WT activity, so the Phase 2 lead has weak true discrimination. The “PE3b ON!” framing was correct that discrimination is possible, but the specific lead was not chosen by discrimination grade.

Dual ranking

| Variant | Aggressive top (discrimination-first) | Conservative top (safe-distance: 10–80 nt) |
|---|---|---|
| SpCas9 NGG | CCTGAGATCTTCACTGAAAT (seed, 0.5 nt) | ACTGAAATTGGCACCATAGC (weak, 12.5 nt) |
| SpG NGN | CCTGAGATCTTCACTGAAAT (seed, 0.5 nt) | ACTGAAATTGGCACCATAGC (weak, 12.5 nt) |
| SpRY NRN | CTGAGATCTTCACTGAAATT (seed, 1.5 nt) | CACTGAAATTGGCACCATAG (weak, 11.5 nt) |

The conservative ranking’s default window (10 ≤ d ≤ 80 nt) excludes the close-nick seed candidates, but it also excludes the balanced SpG MID pick. Relaxing the window to 9 ≤ d ≤ 80 makes TTCACTGAAATTGGCACCAT (9.5 nt) the conservative top for SpG. This is the true balanced pick (mid region, moderate discrimination, safe distance) and the best single-lead choice if SpG is available.
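The effect of the distance window can be reproduced with a small ranking sketch over the SpG rows from the results table. The sort key (discrimination grade first, then shorter nick distance) is an assumption chosen because it reproduces the tops reported here; the actual tie-breaks in pe_phase3_allele_discrimination.py are not described in this note, and all names below are hypothetical.

```python
from dataclasses import dataclass

GRADE_RANK = {"strong": 0, "moderate": 1, "weak": 2}  # lower is better


@dataclass(frozen=True)
class Candidate:
    protospacer: str
    grade: str
    nick_nt: float


# SpG NGN candidates from the results table
SPG = [
    Candidate("CCTGAGATCTTCACTGAAAT", "strong", 0.5),
    Candidate("CTGAGATCTTCACTGAAATT", "strong", 1.5),
    Candidate("TTCACTGAAATTGGCACCAT", "moderate", 9.5),
    Candidate("ACTGAAATTGGCACCATAGC", "weak", 12.5),
    Candidate("CTGAAATTGGCACCATAGCA", "weak", 13.5),
]


def _key(c):
    # Assumed ordering: best grade first, then the shorter nick distance.
    return (GRADE_RANK[c.grade], c.nick_nt)


def aggressive_top(cands):
    """Discrimination-first: no distance filter."""
    return min(cands, key=_key)


def conservative_top(cands, d_min=10.0, d_max=80.0):
    """Safety-first: keep only nicks inside [d_min, d_max], then rank."""
    in_window = [c for c in cands if d_min <= c.nick_nt <= d_max]
    return min(in_window, key=_key) if in_window else None


if __name__ == "__main__":
    print("aggressive:        ", aggressive_top(SPG).protospacer)
    print("conservative 10-80:", conservative_top(SPG).protospacer)
    print("conservative 9-80: ", conservative_top(SPG, d_min=9.0).protospacer)
```

Under these assumptions, lowering d_min by a single nucleotide is enough to swap the conservative SpG top from the weak position-5 nicker to the moderate position-8 nicker, which shows how sensitive the ranking is to the window boundary.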

Interpretation for Misha

  • Aggressive strategy (SpCas9 NGG, off-the-shelf): pegRNA GCCCAGCTCCCCACCTGCTA + PE3b nicker CCTGAGATCTTCACTGAAAT. A seed-position mismatch at the variant gives near-complete discrimination, so the 0.5-nt nick distance should NOT drive concurrent-DSB risk: the nicker is not expected to engage the unedited allele. Trade-offs: a close nick can cause unintended MMR fixation toward the unedited template, and the literature on 0–5 nt PE3b nicks is thin.
  • Balanced strategy (SpG NGN, engineered Cas9): pegRNA TGGGGGCCTGAGATCTTCAC + PE3b nicker TTCACTGAAATTGGCACCAT. Position 8 mid-region gives moderate discrimination; 9.5 nt nick is in the usable distance range. Requires SpG enzyme — clinical viability lags slightly behind SpCas9 NGG but is published in multiple in-vivo settings (Walton et al. 2020).
  • Revised OHC efficiency estimate: SpCas9 + aggressive PE3b: 10–30% (discrimination boost offsets 14-nt geometry penalty). SpG + balanced PE3b: 20–40% (PAM-optimal geometry + moderate discrimination + stronger by-position match than Phase 2 assumed).
  • Decision: if a PE-competent lab will work with engineered SpG, the SpG balanced lead is the clinical candidate. If restricted to SpCas9, the aggressive seed-17 nicker is the single best choice — despite the close nick, discrimination is the dominant safety factor.

Limitations

  • Discrimination grade is based on SpCas9 mismatch-tolerance profiles. The engineered variants (SpG, SpRY) have a broader PAM and are assumed to keep SpCas9-like mismatch sensitivity (Walton 2020); this assumption is not tested directly here.
  • No off-target scan for any PE3b candidate — Cas-OFFinder run still mandatory (listed under Phase 2 next steps).
  • Close PE3b nicks (<5 nt) are computationally attractive but empirically rare in published PE3b designs; the 0.5-nt SpCas9 lead is a bet on discrimination absolutism. A mouse rescue experiment is required to validate.
  • pegRNA fold integrity for SpG design still not run (ViennaRNA pending).

Next steps

  1. Cas-OFFinder for the three short-listed nickers + both pegRNAs (SpCas9 NGG and SpG NGN spacers).
  2. ViennaRNA fold for the full SpG pegRNA + PE3b nicker pair.
  3. SpG vs SpCas9 decision memo — efficiency × availability × off-target tradeoff. Independent of wet lab.
  4. Move on to next un-modeled hypothesis: mRNA-LNP PK for RBM24-exon-4 skip rescue, or Sonogenetics Phase 3 robustness.

Replication

cd ~/STRC/models
/opt/miniconda3/bin/python3 pe_phase3_allele_discrimination.py
# outputs: pe_phase3_allele_discrimination.json

Files / Models

  • ~/STRC/models/pe_phase3_allele_discrimination.py — full audit, two-way ranking
  • ~/STRC/models/pe_phase3_allele_discrimination.json — per-candidate audit, summary, and ranked lists per Cas9 variant

Connections