Base by Base

Gustavo Barra

Base by Base explores advances in genetics and genomics, with a focus on gene-disease associations, variant interpretation, protein structure, and insights from exome and genome sequencing. Each episode breaks down key studies and their clinical relevance—one base at a time. Powered by AI, Base by Base offers a new way to learn on the go. Special thanks to authors who publish under CC BY 4.0, making open-access science faster to share and easier to explore.

  1. 16H AGO

    293: IndeLLM (ESM2) zero-shot scoring and Siamese transfer learning for in-frame indel prediction (MCC 0.77)

    Gracia Carmona O et al., Patterns. 7 ( - IndeLLM uses protein language models (ESM2) to score in-frame indels and a compact Siamese transfer-learning model that achieves state-of-the-art pathogenicity prediction with MCC = 0.77. Study Highlights: Using human protein sequences and ESM2 embeddings, the authors develop IndeLLM, a zero-shot scoring function that sums overlapping-region probabilities to correct length bias in in-frame indels. They train a compact Siamese one-hidden-layer network on PLM embeddings with biologically guided embedding splitting and achieve MCC = 0.77 on the test set. Per-residue probability differences mapped onto structures (FGFR1, GLMN) identify local regions affected by indels and improve interpretability. The framework reduces insertion false negatives and is released with Colab and GitHub tools for indel annotation and disease-variant analysis. Conclusion: IndeLLM zero-shot scoring and a small Siamese transfer-learning model provide improved, interpretable indel pathogenicity prediction, with the Siamese model achieving MCC = 0.77. Music: Enjoy the music based on this article at the end of the episode. Reference: Gracia Carmona O, Leipart V, Amdam GV, Orengo C, Fraternali F. Leveraging protein language models and a scoring function for indel characterization and transfer learning. Patterns. 7 (2026) 101425. https://doi.org/10.1016/j.patter.2025.101425 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/ Support: Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00 Official website https://basebybase.com On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics. Episode link: https://basebybase.castos.com/episodes/indellm-indel-siamese-model

    18 min
  2. 17H AGO

    292: INS R6C signal-peptide defect reduces preproinsulin ER translocation in iPSC-derived β-cells

    Tong Y et al., EMBO Molecular Medicine - Patient data, population genetics and iPSC-derived β-cell models show INS R6C impairs preproinsulin ER translocation and causes recessive insulin-deficient diabetes in homozygotes. Study Highlights: The study integrates clinical pedigrees, large-scale population screens and patient-derived iPSC β-cell models, plus in vivo transplantation and transcriptomics. AlphaFold 3 structural modeling suggested weakened SRP54 interaction and altered SEC61 orientation for the R6C signal peptide, and population data showed no enrichment of diabetes among heterozygotes. In homozygous R6C iPSC-derived β-cells, up to two-thirds of nascent preproinsulin failed to translocate, preproinsulin accumulated, and insulin content and secretion were reduced by roughly 50% with altered proinsulin processing. Functionally, homozygous R6C grafts produced minimal human C-peptide in mice and responded poorly to GLP-1 receptor agonists, while heterozygotes were largely compensated by a single wild-type allele. Conclusion: INS R6C is a recessive loss-of-function mutation that causes early-onset, insulin-deficient diabetes in homozygotes while heterozygous carriers show variable or absent glycemic phenotypes. Music: Enjoy the music based on this article at the end of the episode. Reference: Tong Y, Becker M, Schierloh U, Natividade da Silva F, Haataja L, Cai Y, Patel KA, Kobaisi F, Mirshahi UL, Colclough K, Javed MS, Wakeling MN, Fantuzzi F, Lytrivi M, Sawatani T, Arroyo MN, Yi X, Vinci C, Montaser H, Pachera N, Otonkoski T, Igoillo-Esteve M, Scharfmann R, Hattersley AT, Arvan P, De Beaufort C, Cnop M. A new form of diabetes caused by INS mutations defined by zygosity, stem cell and population data. EMBO Molecular Medicine. 2026;18:620–645.
https://doi.org/10.1038/s44321-025-00362-9 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/ Episode link: https://basebybase.castos.com/episodes/ins-r6c-recessive-diabetes

    17 min
  3. 2D AGO

    291: Dated gene duplications show Asgard archaeal host complexity before mitochondrial endosymbiosis

    Kay CJ et al., Nature 650, 129–140 ( - Relaxed-clock dating of pre-LECA gene duplications in Asgard archaeal and alphaproteobacterial lineages shows a complex archaeal host with cytoskeleton, endomembrane system and nucleus before mitochondrial acquisition around 2.2 Ga. Study Highlights: The study interrogates eukaryogenesis using a time-resolved species tree and a sequential Bayesian relaxed molecular clock applied to 135 gene family trees. Analyses with MCMCTree dated divergence nodes (nFECA 3.05–2.79 Ga, mFECA 2.37–2.13 Ga, LECA 1.80–1.67 Ga) and pre-LECA duplication events of archaeal and bacterial origin. Most archaeal-derived duplications (about 85% of those sampled) were fixed prior to inferred mitochondrial endosymbiosis, with archaeal duplications for actin, tubulin, vesicle trafficking and spliceosomal components dated between ~3.0 and ~2.25 Ga. Functionally, these timings indicate the host lineage had an elaborated cytoskeleton, endomembrane trafficking and nuclear compartmentalization before mitochondrial integration, while alphaproteobacterial duplications cluster near ~2.2 Ga consistent with mitochondrial establishment. Conclusion: Gene-duplication dating supports a complexified archaeal host with cytoskeleton, endomembrane system and nucleus preceding mitochondrial endosymbiosis, favoring a late-mitochondrion model of eukaryogenesis. Music: Enjoy the music based on this article at the end of the episode. Reference: Kay CJ, Spang A, Szöllősi GJ, Pisani D, Williams TA & Donoghue PCJ. Dated gene duplications elucidate the evolutionary assembly of eukaryotes. Nature 650, 129–140 (2026). 
https://doi.org/10.1038/s41586-025-09808-z License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/ Episode link: https://basebybase.com/?episode=291-dated-gene-duplications-show-asgard-archaeal-host-comple-605svf

    7 min
  4. 3D AGO

    290: SMN1 p.Arg288AlafsTer5 exon 7 deletions evade PCR newborn screening yet yield functional SMN isoform

    Wirth B et al., The American Journal of Human Genetics - Two SMN1 exon 7 4-bp deletions (p.Arg288AlafsTer5) evade standard PCR newborn screening but produce a low-abundance, thermostable SMN protein that functionally rescues smn1-deficient zebrafish, averting therapy. Study Highlights: In two clinically healthy newborns flagged as lacking SMN1 by PCR-based NBS, long-range SMN1-specific PCR, Sanger sequencing, MLPA and ddPCR identified distinct 4-bp exon 7 deletions producing the same frameshift p.Arg288AlafsTer5. Cellular assays showed preserved exon 7 splicing, markedly reduced SMN protein abundance, and unchanged protein thermostability, while AlphaFold 3 predicted only mild C-terminal structural alteration. Functional complementation in smn1-deficient zebrafish, using both mRNA injection and a stable Tg(UBI-mKate_SMN1-861VUS) transgene, fully rescued morphology, motor behavior, and survival. gnomAD population analysis indicates these variants are rare but present in Europeans at a carrier frequency that predicts hundreds of compound heterozygotes without reported SMA, informing diagnostic sequencing and avoidance of unnecessary therapy. Conclusion: Integrated genetic, functional, structural, and population-level evidence supports likely non-pathogenic reclassification of the SMN1 c.855_858delAGAA and c.861_864delAAGG alleles and shows that very low levels of the altered SMN protein can preserve normal motor development. Music: Enjoy the music based on this article at the end of the episode. Reference: Wirth B., Das J., Kölbel H., Goh S., Farrar M.A., Piano V., Zetzsche S., Fuhrmann N., Becker J., Karakaya M., Zhang Y., Cao Y., Taghipour-Sheshdeh A., Stringer B.W., Giacomotto J., et al. SMN1 variants identified by false-positive SMA newborn screening tests: Therapeutic hurdles and functional and epidemiological solutions. The American Journal of Human Genetics. 2026 Mar 5;113:1–9.
https://doi.org/10.1016/j.ajhg.2026.01.012 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/ Episode link: https://basebybase.castos.com/episodes/smn1-exon7-frameshift-variants

    20 min
  5. 4D AGO

    289: MinION detection of chimeric reads in murine Ifna/Ifnb and Actb amplicons

    White R et al., F1000Research - Nanopore MinION sequencing of murine Ifna, Ifnb and Actb amplicons identified chimeric reads formed after amplification, during library ligation, with ~1.7% of mapped reads containing chimeric elements. Study Highlights: cDNA from a Nippostrongylus brasiliensis–treated mouse was amplified to generate Ifna family, Ifnb and Actb amplicons. Key methods: separate PCR amplicons were barcoded, ligated using two methods (quick vs overnight), sequenced on an Oxford Nanopore MinION, and base-called and mapped with Albacore and LAST. Main quantitative result: 4,563 reads (≈1.7% of amplicon/barcode-mappable 1D reads) were classified as definitively chimeric, with repeated identical amplicons most common and overnight ligation associated with more repeated-amplicon chimeras than quick ligation. Functional implication: post-amplification ligation during library prep can create detectable chimeric reads that long-read nanopore data and raw-signal inspection can identify and filter, with implications for amplicon sequencing workflows and index switching concerns. Conclusion: Post-amplification ligation during library preparation produced detectable chimeric reads (~1.7%), and including identifiable adapters with careful long-read QC enables detection and exclusion of most chimeras. Music: Enjoy the music based on this article at the end of the episode. Reference: White R, Pellefigues C, Ronchese F, Lamiable O, Eccles D. Investigation of chimeric reads using the MinION [version 2; peer review: 2 approved]. F1000Research. 2017;6:631.
https://doi.org/10.12688/f1000research.11547.2 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/ Episode link: https://basebybase.castos.com/episodes/minion-chimeric-reads-ligation-ifna

    16 min
  6. 5D AGO

    288: Cryo-EM of rat cerebellar α1/α6 GABAA receptors reveals PZ‑II‑029 binding and β-α-β-α-γ assemblies

    Sun C et al., PNAS - Cryo-EM and mass spectrometry of rat cerebellum define α1- and α6-containing GABAA receptor assemblies (β-α-β-α-γ stoichiometry) and PZ-II-029 binding. Study Highlights: The study used rat cerebellum tissue and combined confocal immunofluorescence, affinity purification with mass spectrometry, and high-resolution cryo-EM to characterize native α1-containing GABAA receptors. Cryo-EM classification resolved eight distinct α1-containing receptor assemblies that conform to a conserved β-α-β-α-γ arrangement and include previously unreported α6-containing heteromers. Structural models of α6-containing receptors show near-symmetric ECD architecture with conserved GABA-binding geometry and distinct electrostatic differences that may affect ligand kinetics. Binding of the α6-selective pyrazoloquinolinone PZ-II-029 at α+/γ− pockets was visualized and found to induce coordinated outward expansion of the extracellular domain, providing a structural basis for subtype-selective modulation. Conclusion: Native α1-containing cerebellar GABAA receptors adopt a conserved β-α-β-α-γ pentameric scaffold that includes α6 subunits and binds PZ-II-029 at α+/γ− sites, producing extracellular domain expansion. Music: Enjoy the music based on this article at the end of the episode. Reference: Sun C, Jahncke JN, Wright KM, Gouaux E. Molecular assemblies and pharmacology of cerebellar GABAA receptors. Proc. Natl. Acad. Sci. U.S.A. 2026;123:e2524504123. Published February 6, 2026. https://doi.org/10.1073/pnas.2524504123 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
Episode link: https://basebybase.castos.com/episodes/cerebellar-gabaa-alpha6-assemblies

    17 min
  7. 6D AGO

    287: EPOP and MTF2 modulate PRC2 H3K27me3 deposition via GA- and GCN-sequence specificity

    Granata J et al., PNAS - In mESCs and defined in vitro assays, EPOP and MTF2 stimulate PRC2 methyltransferase activity and promote de novo H3K27me3 deposition with GA- or GCN-rich DNA preference. Study Highlights: The study used mouse embryonic stem cells with an EED-rescue system and recombinant in vitro assays including HMT assays, EMSA, and ChIP-seq to probe EPOP and MTF2 function. Biochemical HMT assays on oligonucleosomes and dinucleosomes show both EPOP and MTF2 directly stimulate PRC2 catalytic activity, with MTF2 preferentially enhancing activity and binding on GCN-rich linkers and EPOP on GA-rich linkers. ChIP-seq during EED rescue demonstrated that EPOP is dispensable for initial PRC2 recruitment, that its knockout reduces de novo H3K27me3 deposition by ~50%, and that EPOP cooperates with MTF2 and JARID2. Together these data indicate linker DNA sequence within nucleation sites guides subcomplex-specific PRC2 binding and catalytic output, influencing spatial establishment of H3K27me3 domains. Conclusion: EPOP and MTF2 define distinct PRC2 subcomplexes that stimulate PRC2 catalytic activity in a chromatin-dependent, DNA-sequence-specific manner to direct de novo H3K27me3 deposition. Music: Enjoy the music based on this article at the end of the episode. Reference: Granata J., Liu S., Popoca L., Oksuz O., Reinberg D. EPOP and MTF2 activate PRC2 activity through DNA-sequence specificity. Proc. Natl. Acad. Sci. U.S.A. 2026;123:e2527303123. https://doi.org/10.1073/pnas.2527303123 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
Episode link: https://basebybase.castos.com/episodes/epop-mtf2-prc2-sequence

    19 min
  8. FEB 10

    286: Deep mutational scanning of Nipah virus fusion protein F reveals functional and antigenic constraints

    Larsen BB et al., PNAS - Deep mutational scanning of the Nipah virus fusion protein F using pseudoviruses maps ~8,500 single-residue effects, showing F is highly constrained and identifying antibody-escape mutations. Study Highlights: Using nonreplicative lentiviral pseudoviruses and deep mutational scanning, the authors measured the effects of 8,449 single amino-acid mutations to the Nipah virus F ectodomain on cell entry in CHO cells expressing bat ephrin-B3. Measurements were fit with global epistasis models and mapped onto prefusion and postfusion structures, revealing the fusion peptide, lateral surface patches, and hexameric-interface residues are highly constrained. The library was screened against six monoclonal antibodies, quantifying mutation-mediated decreases in neutralization and showing distinct resilience among antibodies; specific Hendra F residues (Q70K, R336K) explained loss or reduction of neutralization by 4H3 and 1A9. The data nominate candidate proline substitutions and other sites for prefusion stabilization and inform vaccine and therapeutic antibody selection. Conclusion: Nipah virus F is highly functionally constrained relative to RBP with specific surface-exposed and core residues critical for cell entry, and antibody neutralization varies by epitope, informing prefusion-stabilized immunogen and therapeutic antibody design. Music: Enjoy the music based on this article at the end of the episode. Reference: Larsen BB, Harari S, Gen R, Stewart C, Veesler D, Bloom JD. Functional and antigenic constraints on the Nipah virus fusion protein. Proc. Natl. Acad. Sci. U.S.A. 2026;123:e2529505123. 
https://doi.org/10.1073/pnas.2529505123 License: This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/ Episode link: https://basebybase.castos.com/episodes/nipah-f-deep-mutational-map

    20 min
