56 episodes

A podcast about computational biology, bioinformatics, and next generation sequencing.

the bioinformatics chat Roman Cheplyaka

    • Life Sciences

    Polygenic risk scores in admixed populations with Bárbara Bitarello

    Polygenic risk scores (PRS) rely on genome-wide association studies (GWAS)
    to predict the phenotype based on the genotype. However, the prediction
    accuracy suffers when GWAS from one population are used to calculate PRS within
    a different population, which is a problem because the majority of GWAS
    are done on cohorts of European ancestry.

    In this episode, Bárbara Bitarello helps us
    understand how PRS work and why they don’t transfer well across populations.

    Links:

    Polygenic Scores for Height in Admixed Populations (Bárbara D. Bitarello, Iain Mathieson)
    What is ancestry? (Iain Mathieson, Aylwyn Scally)
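
    As a rough illustration of the idea behind a PRS (not the method from the
    paper linked above): a score is commonly computed as a weighted sum of
    effect-allele counts, with the weights taken from GWAS effect-size estimates.
    The function name, the effect sizes, and the genotype below are all made up
    for this sketch.

```python
def polygenic_score(dosages, effect_sizes):
    """Weighted sum of effect-allele dosages (0, 1, or 2 copies) over variants."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("need exactly one effect size per variant")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical per-variant effect sizes, standing in for GWAS estimates.
effect_sizes = [0.5, -0.25, 1.0]
# Hypothetical genotype: copies of the effect allele carried at each variant.
dosages = [2, 1, 0]

score = polygenic_score(dosages, effect_sizes)
print(score)  # 2*0.5 + 1*(-0.25) + 0*1.0 = 0.75
```

    Because the weights come from one GWAS cohort, the same sum can be less
    predictive when the genotype comes from a population with different
    ancestry, which is the transferability problem the episode discusses.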

    • 1 hr 30 min
    Phylogenetics and the likelihood gradient with Xiang Ji

    In this episode, we chat about phylogenetics with Xiang Ji. We start with a
    general introduction to the field and then go deeper into likelihood-based
    methods (maximum likelihood and Bayesian inference). In particular, we talk
    about the different ways to calculate the likelihood gradient, including a
    linear-time exact gradient algorithm recently published by Xiang and his
    colleagues.

    Links:

    Gradients Do Grow on Trees: A Linear-Time O(N)-Dimensional Gradient for Statistical Phylogenetics
    (Xiang Ji, Zhenyu Zhang, Andrew Holbrook, Akihiko Nishimura, Guy Baele, Andrew Rambaut, Philippe Lemey, Marc A Suchard)
    BEAGLE: the package that implements the gradient algorithm
    BEAST: the program that implements the Hamiltonian Monte Carlo sampler and the molecular clock models

    • 57 min
    Seeding methods for read alignment with Markus Schmidt

    In this episode, Markus Schmidt explains how seeding in read alignment works.
    We define and compare k-mers, minimizers, MEMs, SMEMs, and maximal spanning seeds.
    Markus also presents his recent work on computing variable-sized seeds (MEMs,
    SMEMs, and maximal spanning seeds) from fixed-sized seeds (k-mers and
    minimizers) and his Modular Aligner.

    Links:

    A performant bridge between fixed-size and variable-size seeding
    (Arne Kutzner, Pok-Son Kim, Markus Schmidt)
    MA the Modular Aligner
    Calibrating Seed-Based Heuristics to Map Short Reads With Sesame
    (Guillaume J. Filion, Ruggero Cortini, Eduard Zorita) — another
    interesting recent work on seeding methods (though we didn’t get to discuss
    it in this episode)
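
    A minimal sketch of the two fixed-size seed types mentioned in this episode,
    k-mers and minimizers, assuming the simplest textbook definition: in every
    window of w consecutive k-mers, keep the lexicographically smallest one
    (leftmost on ties). The function names and the example sequence are
    hypothetical, and this is not code from the Modular Aligner; real aligners
    typically order k-mers by a hash value rather than alphabetically.

```python
def kmers(seq, k):
    """All substrings of length k, in order of their start position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def minimizers(seq, k, w):
    """Sorted (position, k-mer) pairs: the smallest k-mer of each window of w k-mers."""
    km = kmers(seq, k)
    picked = set()
    for start in range(len(km) - w + 1):
        window = range(start, start + w)
        # Smallest k-mer wins; ties are broken by the leftmost position.
        best = min(window, key=lambda i: (km[i], i))
        picked.add((best, km[best]))
    return sorted(picked)


seq = "ACGTACGT"  # made-up example sequence
all_kmers = kmers(seq, 3)
seeds = minimizers(seq, 3, 3)
print(all_kmers)  # ['ACG', 'CGT', 'GTA', 'TAC', 'ACG', 'CGT']
print(seeds)      # [(0, 'ACG'), (1, 'CGT'), (4, 'ACG')]
```

    Here six k-mers are reduced to three minimizers, which shows why minimizers
    are used to shrink the seed index while still sampling every window.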

    • 1 hr
    Real-time quantitative proteomics with Devin Schweppe

    In this episode, Jacob Schreiber interviews Devin Schweppe about
    the analysis of mass spectrometry data in the field of proteomics. They begin
    by delving into the different types of mass spectrometry methods, including MS1,
    MS2, and MS3, and the reasons for using each. They then discuss a recent paper
    from Devin, “Full-Featured, Real-Time Database Searching Platform Enables Fast
    and Accurate Multiplexed Quantitative Proteomics”, which involved building a
    real-time system for quantifying proteomic samples from MS3, and the types of
    analyses that this system allows one to do.

    Links:

    Full-Featured, Real-Time Database Searching Platform Enables Fast and Accurate Multiplexed Quantitative Proteomics (Devin K. Schweppe, Jimmy K. Eng, Qing Yu, Derek Bailey, Ramin Rad, Jose Navarrete-Perea, Edward L. Huttlin, Brian K. Erickson, Joao A. Paulo, and Steven P. Gygi)
    Benchmarking the Orbitrap Tribrid Eclipse for Next Generation Multiplexed Proteomics (Qing Yu, Joao A Paulo, Jose Navarrete-Perea, Graeme C McAlister, Jesse D Canterbury, Derek J Bailey, Aaron M Robitaille, Romain Huguet, Vlad Zabrouskov, Steven P Gygi, Devin K Schweppe)
    Improved Monoisotopic Mass Estimation for Deeper Proteome Coverage (Ramin Rad, Jiaming Li, Julian Mintseris, Jeremy O’Connell, Steven P. Gygi, and Devin K. Schweppe)
    Schweppe Lab Website (Hiring!)

    How 23andMe finds identical-by-descent segments with William Freyman

    In this episode, Will Freyman talks about identity-by-descent (IBD): how
    it’s used at 23andMe, and how the templated
    positional Burrows-Wheeler transform can find IBD segments in the presence of
    genotyping and phasing errors.

    Links:

    Fast and robust identity-by-descent inference with the templated positional Burrows-Wheeler transform
    (William A. Freyman, Kimberly F. McManus, Suyash S. Shringarpure, Ethan M. Jewett, Katarzyna Bryc, the 23andMe Research Team, Adam Auton)
    23andMe research

    • 42 min
    Basset and Basenji with David Kelley

    In this episode, Jacob Schreiber interviews David Kelley about
    machine learning models that can yield insight into the consequences of
    mutations in the genome. They begin their discussion by talking about
    Calico Labs, and then delve into a series of papers that David has
    written about two models, Basset and Basenji, that connect genome
    sequence to functional activity and so can be used to quantify the effect of
    any mutation.

    Links:

    Calico Labs
    Basset: Learning the regulatory code of the accessible genome with deep convolutional neural networks (David R. Kelley, Jasper Snoek, and John Rinn)
    Sequential regulatory activity prediction across chromosomes with convolutional neural networks (David R. Kelley, Yakir A. Reshef, Maxwell Bileschi, David Belanger, Cory Y. McLean, and Jasper Snoek)
    Cross-species regulatory sequence activity prediction (David R. Kelley)
    Basenji GitHub Repo

    • 1 hr 13 min
