Base by Base

Gustavo Barra

Base by Base explores advances in genetics and genomics, with a focus on gene-disease associations, variant interpretation, protein structure, and insights from exome and genome sequencing. Each episode breaks down key studies and their clinical relevance—one base at a time. Powered by AI, Base by Base offers a new way to learn on the go. Special thanks to authors who publish under CC BY 4.0, making open-access science faster to share and easier to explore.

  1. 336: Measuring disease likelihood in genomic ascertainment

    1 DAY AGO

    Sapp JC et al., The American Journal of Human Genetics - A longitudinal study of recipients of medically actionable secondary genomic findings develops a Bayesian approach that integrates variant, family genotypic, and phenotypic data to estimate the probability that a secondary finding represents a true clinicomolecular diagnosis, with a detailed analysis of BRCA1/BRCA2 families and implications for screening policy and clinical management. Key terms: secondary findings, BRCA1, BRCA2, Bayesian risk assessment, population genomic screening. Study Highlights: The team enrolled 227 secondary findings recipients and completed genotyping and deep phenotyping for 163 probands, using cascade testing and variant reclassification. They piloted a Bayesian method combining prior population prevalence, variant pathogenicity, and family genotype–phenotype data to estimate clinicomolecular diagnosis (CMD) probabilities for BRCA1/2 families. CMD probabilities varied widely (26.2% to >99.9%) and over half of BRCA1/2 families met NCCN diagnostic testing criteria, indicating underuse of diagnostic testing. Conclusion: In opportunistic secondary findings contexts the posterior probability that a patient has the implicated monogenic disease can differ substantially from variant pathogenicity; integrating familial genotypic and phenotypic data via Bayesian methods refines risk estimates and should guide shared decision-making, management strategies, and policy for population genomic screening. Music: Enjoy the music based on this article at the end of the episode. Article title: Measuring disease likelihood in genomic ascertainment First author: Sapp JC Journal: The American Journal of Human Genetics DOI: 10.1016/j.ajhg.2026.03.009 Reference: Sapp JC, Lewis KL, Modlin EW, et al. Measuring disease likelihood in genomic ascertainment. The American Journal of Human Genetics. 2026;113:1–12. 
doi:10.1016/j.ajhg.2026.03.009 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/ Support: Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00 Official website https://basebybase.com On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics. Episode link: https://basebybase.com/episodes/measuring-disease-likelihood-genomic-ascertainment QC: This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-04-07. QC Scope: - article metadata and core scientific claims from the narration - excludes analogies, intro/outro, and music - transcript coverage: Audited transcript sections describing the Bayesian CMD approach, the BRCA1/BRCA2 findings, the Family 8334 case, NCCN criteria implications, and study design/limitations. - transcript topics: ACMG secondary findings context and selection bias; Bayesian probability model for CMD; Cascade testing and family data integration; BRCA1 vs BRCA2 variant distribution and penetrance; NCCN criteria and clinical testing underutilization; Study design and recruitment (163 probands from 41 sources) QC Summary: - factual score: 10/10 - metadata score: 10/10 - supported core claims: 5 - claims flagged for review: 0 - metadata checks passed: 4 - metadata issues found: 0 Metadata Audited: - doi - article_title - article_journal - license Factual Items Audited: - CMD probability range across BRCA1/BRCA2 families: 26.2% to 100% - Baseline posterior probability for BRCA2-related CMD: 58.2% - Posterior CMD probability for family 8334: 99.2% - Average CMD probabi...

    24 min
  2. 335: Altai Neandertal Genome Reveals Deep Population Structure

    3 DAYS AGO

    Massilani D et al., PNAS - We summarize a PNAS study reporting a ~37× genome from a ~110,000-year-old male Neandertal (Denisova 17) from Denisova Cave. The genome shows D17 is closely related to an earlier Denisova Neandertal (D5), both carry Denisovan introgressed segments, and Neandertal groups displayed high regional differentiation and small, isolated populations in the Altai. Key terms: Neandertal, Denisova Cave, genome sequencing, population structure, Denisovan admixture. Study Highlights: The authors generated a high-coverage (~37-fold) autosomal genome from a male Neandertal (D17) from Denisova Cave and dated it to ~110 kya. D17 is more closely related to an older Denisova Neandertal (D5) than to European or other Altai Neandertals, and both D5 and D17 contain Denisovan-derived genomic segments. Patterns of homozygosity indicate smaller, more isolated groups in Altai Neandertals compared with later European Neandertals. Estimated FST shows Eastern and Western Neandertals were as genetically differentiated as the most divergent present-day human populations, implying rapid drift under small effective sizes. Conclusion: A high-coverage Altai Neandertal genome reveals Denisovan admixture in older eastern Neandertals, small and isolated group sizes in the Altai, and pronounced east–west Neandertal population differentiation exceeding that seen among modern human populations. Music: Enjoy the music based on this article at the end of the episode. Article title: A high-coverage Neandertal genome from the Altai Mountains reveals population structure among Neandertals First author: Massilani D Journal: PNAS DOI: 10.1073/pnas.2534576123 Reference: Massilani D, Peyrégne S, Iasi LNM, de Filippo C, Mafessoni F, Mesa AB, Sümer AP, Swiel Y, Popli D, Silverman S, Boyle MJ, Kozlikin MB, Shunkov MV, Derevianko AP, Higham T, Douka K, Meyer M, Zeberg H, Kelso J, Pääbo S. 
A high-coverage Neandertal genome from the Altai Mountains reveals population structure among Neandertals. PNAS. 2026;123(13):e2534576123. doi:10.1073/pnas.2534576123 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/ Support: Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00 Official website https://basebybase.com On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics. Episode link: https://basebybase.com/episodes/d17-altai-neandertal-genome-structure QC: This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-04-05. QC Scope: - article metadata and core scientific claims from the narration - excludes analogies, intro/outro, and music - transcript coverage: Substantive audit focused on the transcript sections describing: specimen, sequencing coverage, population structure, Denisovan admixture, autozygosity and small group sizes, FST differentiation, and dating of D17/D5 lineages, plus migration/replacement dynamics. - transcript topics: Denisova 17 (D17) DNA extraction and high-coverage genome (~37x); Relationship among Neandertals (D17, D5, Chag8, Vi33.19) and Denisovans; Denisovan introgression into D17 and D5; lack of clear Denisovan signal in Chag8; Autozygosity and small population sizes (50 individuals) in Eastern Neandertals; Genetic differentiation (FST ~0.30) between Eastern and Western Neandertals; Molecular dating and age estimates for D17 (~110 kya) and Y-chromosome lineage QC Summary: - factual score: 10/10 - metadata score: 10/10 - supported core claims: 6 - claims flagged for review: 0 - metadata c...

    24 min
  3. 334: LINE-1 Recombination with Diverse RNAs

    3 DAYS AGO

    Law C-T et al., Cell Genomics - Law and Burns introduce TiMEstamp, a comparative-genomics pipeline that dates LINE-1 insertions from multiple sequence alignments and discovers hundreds of LINE-1 chimeric insertions fused to diverse RNAs across mammalian evolution. Key terms: LINE-1, TiMEstamp, chimeric insertions, retrotransposition, comparative genomics. Study Highlights: The authors developed TiMEstamp to infer TE insertion times from multispecies MSAs and to detect contemporaneous 5′ sequences fused to LINE-1. They compiled a large catalog of LINE-1 chimeras (reported >700 events) including known U6/LINE-1 cases and newly identified partners such as tRNA, 28S rRNA, 7SL, Y RNA, Alu elements, and mRNA 5′ transductions. Alu/LINE-1 chimeras (452 events) and 17 mRNA/lncRNA 5′ transductions were characterized with TSDs, EN motifs, and orientation/length patterns. They also show that promoter co-option (e.g., RAP1GDS1 driving a spliced intronic L1PA2) can restore retrotransposition competence. Conclusion: Comparative MSA-based timing reveals widespread, recurrent recombination between LINE-1 RNA and diverse cellular RNAs, producing chimeric insertions that have contributed to transposon diversification and provide mechanisms (RNA ligation, template switching, twin priming, promoter co-option) that may influence TE evolution and somatic/germline retrotransposition. Music: Enjoy the music based on this article at the end of the episode. Article title: Comparative genomics reveals LINE-1 recombination with diverse RNAs First author: Law C-T Journal: Cell Genomics DOI: 10.1016/j.xgen.2026.101165 Reference: Law C-T and Burns K.H., 2026. Comparative genomics reveals LINE-1 recombination with diverse RNAs. Cell Genomics 6, 101165. 
https://doi.org/10.1016/j.xgen.2026.101165 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/ Support: Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00 Official website https://basebybase.com On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics. Episode link: https://basebybase.com/episodes/line1-recombination-diverse-rnas QC: This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-04-05. QC Scope: - article metadata and core scientific claims from the narration - excludes analogies, intro/outro, and music - transcript coverage: Audited sections covering the TiMEstamp workflow and data (MSA across mammals), discovery of chimeric LINE-1 insertions (tRNA halves, 28S rRNA, 7SL, Y RNA, 7SK), Alu/LINE-1 chimeras, mRNA/lncRNA 5' transductions (MAP3K13, FHIT), RAP1GDS1 promoter co-option, twin priming and trans-splicing mechanisms, and study limitati - transcript topics: LINE-1 retrotransposition mechanics (TPRT); TiMEstamp methodology and MSA dating; Chimeric LINE-1 insertions with non-LINE-1 RNAs (tRNA halves, 28S rRNA, 7SL, Y RNA, 7SK RNA); Alu/LINE-1 chimeras and temporal activity; 5′ transductions involving mRNAs/lncRNAs (MAP3K13, FHIT, RAP1GDS1); RAP1GDS1 promoter co-option and transcriptional rescue QC Summary: - factual score: 10/10 - metadata score: 10/10 - supported core claims: 7 - claims flagged for review: 0 - metadata checks passed: 5 - metadata issues found: 0 Metadata Audited: - article_doi - article_title - article_journal - license - episode_title Factual Items Audited: - TiMEstamp uses MSAs across mammals to date LINE-1 insertions and identify contemporaneous adjacencies - Chimeric LINE-1 insertions involve dive...

    22 min
  4. 333: Holistic determination of cfDNA ends

    4 DAYS AGO

    Jiang P et al., Cell Genomics - This episode reviews a Cell Genomics study that uses ssDNA '2-end' and novel '4-end' sequencing to profile native 5′ and 3′ termini of plasma cfDNA. The work identifies PREM/POEM markers, links 3′ ends to methylation, and shows improved HCC detection. Key terms: cfDNA fragmentomics, 3' end motifs, 4-end sequencing, hepatocellular carcinoma, DNASE1L3. Study Highlights: The authors adapted single-stranded library preparation (2-end sequencing) to measure native 5′ (EM5) and 3′ (EM3) end motifs and defined flanking PREM and POEM motifs. Combining size‑stratified PREM, EM5, EM3, and POEM features raised hepatocellular carcinoma (HCC) detection to an AUC of 0.95. Fragmentomics-based methylation analysis of 3′ ends (3′ FRAGMA) improved HCC detection further (AUC 0.97). A novel 4-end sequencing approach captured all four termini of double‑stranded cfDNA, yielding 4‑end motif models with AUC up to 0.98 and revealing coordinated nuclease activity, notably DNASE1L3 involvement. Conclusion: Holistic end profiling of cfDNA—integrating native 5′ and 3′ ends, flanking motifs, methylation-informed fragmentomics, and four‑end resolution—enhances cancer detection performance and provides mechanistic insight into coordinated nuclease-mediated fragmentation, warranting larger validation studies. Music: Enjoy the music based on this article at the end of the episode. Article title: Holistic determination of ends of cfDNA molecules First author: Jiang P Journal: Cell Genomics DOI: 10.1016/j.xgen.2026.101142 Reference: Jiang P., Ma M.-J. L., Qiao R., et al. Holistic determination of ends of cfDNA molecules. Cell Genomics. 2026;6:101142. 
doi:10.1016/j.xgen.2026.101142 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/ Support: Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00 Official website https://basebybase.com On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics. Episode link: https://basebybase.com/episodes/holistic-cfdna-ends-episode-333 QC: This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-04-04. QC Scope: - article metadata and core scientific claims from the narration - excludes analogies, intro/outro, and music - transcript coverage: Audited the transcript’s substantive description of 2-end sequencing (EM5/EM3), PREM/POEM, 3'-FRAGMA, 4-end sequencing, nuclease signatures (DNASE1L3, DNASE1, DFFB), HCC diagnostic performance (AUC values), fragment-size context, and translational limitations as reported in the article. 
- transcript topics: 2-end sequencing preserving native ends (EM5/EM3); Pre-end (PREM) and post-end motifs (POEM); 4-end sequencing with stem-loop adapters and PacBio SMRT; 3'-FRAGMA methylation analysis; Nuclease-specific end motifs and coordinated fragmentation (DNASE1L3, DNASE1, DFFB); HCC detection performance and AUC values QC Summary: - factual score: 10/10 - metadata score: 10/10 - supported core claims: 8 - claims flagged for review: 0 - metadata checks passed: 4 - metadata issues found: 0 Metadata Audited: - article_doi - article_title - article_journal - license - episode_title - episode_number - season Factual Items Audited: - 2-end sequencing preserves native 5' and 3' ends (EM5/EM3) by omitting end-repair during library prep - PREM and POEM motifs are defined and analyzed as end-motif neighbors around EM5 and EM3 - 4-end sequencing enables simultaneous assessment of all four ter...

    25 min
  6. 332: When Chromatin Filters Force: Age, AP-1, and Fibroblast Mechanotransduction

    6 DAYS AGO

    Liao Y et al., PNAS - Human dermal fibroblasts from young and old donors were embedded in 3D collagen and exposed to mechanical tension and TGF-β. Combining bulk RNA‑seq, ATAC‑seq, imaging, and perturbations, the study shows that matrix tension amplifies TGF‑β responses in young but not aged cells and identifies AP‑1 as a central chromatin-associated regulator required for fibroblast activation. Key terms: chromatin accessibility, mechanotransduction, aging, TGF-β signaling, AP-1. Study Highlights: Young fibroblasts under matrix tension mount a strong, synergistic transcriptional response to TGF‑β while aged fibroblasts exhibit blunted or divergent responses. Age-dependent differences in chromatin accessibility, notably at distal regulatory elements, correlate with these transcriptional outcomes. AP‑1 family motifs are highly enriched in TGF‑β- and age-responsive accessible regions and cooperate with age-specific TFs. Inhibiting AP‑1 activity prevents JUNB recruitment to RNA polymerase II and suppresses myofibroblast activation. Conclusion: 3D chromatin accessibility acts as a dynamic filter of mechanical and biochemical signals during aging; AP‑1 and its regulatory network drive the age-specific chromatin remodeling that permits synergistic tension + TGF‑β responses in young fibroblasts, and AP‑1 inhibition blocks this activation, suggesting a potential therapeutic axis. Music: Enjoy the music based on this article at the end of the episode. Article title: Chromatin accessibility regulates age-dependent nuclear mechanotransduction First author: Liao Y Journal: PNAS DOI: 10.1073/pnas.2522217123 Reference: Liao Y, Land M, Gupta R, Yu L, Sornapudi TR, Shivashankar GV. Chromatin accessibility regulates age-dependent nuclear mechanotransduction. PNAS. 2026;123(13):e2522217123. doi:10.1073/pnas.2522217123. Published March 26, 2026. 
License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/ Support: Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00 Official website https://basebybase.com On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics. Episode link: https://basebybase.com/episodes/base-by-base-332-chromatin-age-mechanotransduction QC: This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-04-02. QC Scope: - article metadata and core scientific claims from the narration - excludes analogies, intro/outro, and music - transcript coverage: Substantive auditing covered the study’s central mechanistic story: aging as a chromatin-filter for mechanochemical signaling; AP-1 as master regulator; age-specific TF partnerships; the 3D collagen tension model; ATAC-seq/RNA-seq integration; and AP-1 perturbation experiments. - transcript topics: Aging as chromatin-based signaling filter; 3D collagen gel tension model with glass ring; TGF-β signaling and mechanical tension synergy; AP-1 as master regulator; age-specific partners (JUNB vs HOXB13); ATAC-seq and RNA-seq integration; Perturbation experiments (siJUNB, T-5224, kinase inhibitors) QC Summary: - factual score: 10/10 - metadata score: 9/10 - supported core claims: 6 - claims flagged for review: 0 - metadata checks passed: 4 - metadata issues found: 0 Metadata Audited: - article_doi - article_title - article_journal - license - publish_description_text Factual Items Audited: - Young fibroblasts exhibit synergistic gene expression enhancement to combined mechanical tension and TGF-β; aging cells show a blunted/diverg...

    23 min
  7. 331: Bi-allelic NDUFA5 variants and complex I mitochondriopathy

    31 MAR

    Tan et al., The American Journal of Human Genetics - This report identifies bi-allelic NDUFA5 variants in four individuals from three families causing an isolated mitochondrial complex I deficiency with variable multisystem features. The study combines genomic, transcriptomic, proteomic, biochemical, structural modeling, and zebrafish functional data to support pathogenicity. Key terms: NDUFA5, complex I deficiency, mitochondriopathy, proteomics, zebrafish model. Study Highlights: Bi-allelic NDUFA5 variants were found in four individuals from three unrelated families presenting with a variable multisystem phenotype including congenital heart defects, hematological abnormalities, and Leigh-like neurological features. Multi-tissue RNA-seq and RT-PCR revealed aberrant splicing and NMD, while proteomics and BN-PAGE demonstrated reduced NDUFA5 protein, isolated complex I deficiency, and stalled assembly at a Q/P intermediate. CRISPR-Cas9 ndufa5 zebrafish crispants showed developmental delays, locomotor deficits, reduced survival, and epileptiform neural activity, corroborating functional impact. Conclusion: Combined clinical, molecular, and animal-model evidence supports that bi-allelic NDUFA5 variants cause a recessive mitochondriopathy with isolated complex I deficiency and variable multisystem involvement; NDUFA5 should be considered in molecular reanalysis of undiagnosed complex I disorders. Music: Enjoy the music based on this article at the end of the episode. Article title: Bi-allelic variants in NDUFA5 cause a mitochondriopathy with complex I deficiency First author: Tan et al Journal: The American Journal of Human Genetics DOI: 10.1016/j.ajhg.2026.03.003 Reference: Tan et al., 2026, The American Journal of Human Genetics 113, 1–14, May 7, 2026. 
https://doi.org/10.1016/j.ajhg.2026.03.003 License: CC BY (http://creativecommons.org/licenses/by/4.0/) Support: Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00 Official website https://basebybase.com On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics. Episode link: https://basebybase.com/episodes/ndufa5-complex-i-mitochondriopathy QC: This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-03-31. QC Scope: - article metadata and core scientific claims from the narration - excludes analogies, intro/outro, and music - transcript coverage: Audited the transcript portions describing NDUFA5 variants, splicing consequences, proteomics/BN-PAGE findings, zebrafish model outcomes, and the diagnostic paradigm shift. - transcript topics: Complex I biology and mitochondrial energy metabolism; Gene discovery via GeneMatcher and patient cohorts; NDUFA5 variant classes and their consequences; RNA sequencing and exon skipping due to synonymous variant; Protein abundance and complex I assembly defects; Blue native PAGE and assembly intermediates QC Summary: - factual score: 10/10 - metadata score: 10/10 - supported core claims: 6 - claims flagged for review: 0 - metadata checks passed: 4 - metadata issues found: 0 Metadata Audited: - article_doi - article_title - article_journal - license Factual Items Audited: - Bi-allelic NDUFA5 variants cause a mitochondrial complex I deficiency with multisystem disease - Two distinct variant classes observed: frameshift + missense in Family 1; start-loss + synonymous splice variant in Family 2; homozygous synonymous splice variant in Family 3 - RNA-seq reveals aberrant splicing and nonsense-mediated decay for some alleles; exon 3 skipping yields a 39-amino-acid in-frame deletion - Proteomics shows ma...

    27 min
  8. 330: 5ULTRA: Mapping 5′ UTR variants that alter protein translation

    30 MAR

    Chaldebas M et al., The American Journal of Human Genetics - Chaldebas et al. present 5ULTRA, a computational pipeline that integrates uORF databases, Kozak-motif features, splicing prediction, and a random-forest score to detect and prioritize 5′ UTR variants predicted to alter protein translation. The score correlates with proteomic and MPRA measures and is applied to population, somatic, GWAS, and rare-disease datasets to nominate candidate functional variants. Key terms: 5' UTR, uORF, Kozak motif, translation regulation, machine learning. Study Highlights: The authors developed 5ULTRA to annotate SNVs, indels, and splicing variants that create/disrupt uORFs or alter Kozak strength, integrating comprehensive uORF databases and SpliceAI. A random-forest 5ULTRA score trained on HGMD and gnomAD distinguishes likely translation-impacting variants and achieved strong cross-validation performance and AUC = 0.82 on an independent ClinVar test. The score correlates with cis-pQTL effect sizes (Spearman rho = 0.57) and with MPRA ribosome-load measurements (rho = 0.78). Genome-wide screening found thousands of candidate variants, highlighted rare/conserved signals in disease genes, and nominated examples in cancer, GWAS loci, and rare infections. Conclusion: 5ULTRA provides a validated, transcript-aware framework to detect and prioritize 5′ UTR variants that modulate translation, offering mechanistic hypotheses for noncoding variant interpretation in rare disease, cancer, and complex-trait genetics; the tool and data are publicly available under a CC BY license. QC: This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-03-30. 
QC Scope: - article metadata and core scientific claims from the narration - excludes analogies, intro/outro, and music - transcript coverage: Substantive auditing focused on the scientific content described in the transcript and its alignment with the AJHG article: 5ULTRA architecture, features, SpliceAI integration, validation metrics, somatic/GWAS/infectious disease applications, limitations, and open-source availability. - transcript topics: 5′ UTR regulatory elements (Kozak motif, uORFs) and translation initiation; 5ULTRA methodology and data integration (MANE transcripts, uORFdb, Ribo-uORF, SpliceAI); Machine-learning scoring (17 features; PhyloP conservation as key predictor; uORF/Kozak annotations); Model validation (ClinVar, cross-validation AUC, accuracy); Correlation with proteomics and MPRA data (cis-pQTL, ΔMRL); Somatic cancer applications (NRAS and ABI1 examples; splicing effects; N-terminal extensions) QC Summary: - factual score: 10/10 - metadata score: 10/10 - supported core claims: 7 - claims flagged for review: 0 - metadata checks passed: 7 - metadata issues found: 0 Metadata Audited: - article_doi - article_title - article_journal - license - episode_title - episode_number - season - reference Factual Items Audited: - 5ULTRA identifies and prioritizes 5′ UTR variants that affect translation via uORFs and Kozak motifs - 17 features used by the 5ULTRA random forest model; PhyloP conservation of uORF start codon as the strongest predictor - Genome-wide analysis: ~28 million 5′ UTR variants; ~137k predicted to affect translation via uORFs or Kozak changes - ClinVar independent test AUC ≈ 0.82 and ClinVar threshold-based accuracy ≈ 80.8% - Cross-validation 5-fold AUC ≈ 0.981; MPRA and pQTL data show concordant translation effects (ΔMRL, Spearman ρ values ~0.78; 5ULTRA vs cis-pQTL ρ ≈ 0.57) QC result: Pass. 
Chapters: (00:00:08) - Genome Wide Detection of Human 5 UTR Variants; (00:06:41) - How a Deep Learning Algorithm Can Identify Dangerous Human Variants; (00:12:35) - 5 Ultra: The computational genetics of cancer; (00:18:46) - How to decode the secrets of the human genome

    23 min
