A podcast about computational biology, bioinformatics, and next generation sequencing.
Polygenic risk scores in admixed populations with Bárbara Bitarello
Polygenic risk scores (PRS) rely on genome-wide association studies (GWAS)
to predict phenotype from genotype. However, prediction
accuracy suffers when GWAS from one population are used to calculate PRS in
a different population, which is a problem because the majority of GWAS
are done on cohorts of European ancestry.
In this episode, Bárbara Bitarello helps us
understand how PRS work and why they don’t transfer well across populations.
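As a rough illustration of how a PRS is computed, the score is commonly described as a weighted sum of effect-allele counts, with the weights taken from GWAS effect-size estimates. The sketch below assumes that simple additive form; the function name and the toy numbers are hypothetical and are not taken from the episode or the paper.

```python
def polygenic_score(dosages, effect_sizes):
    """Additive score: sum over variants of (effect-allele count * GWAS effect size).

    dosages      -- effect-allele counts per variant (0, 1, or 2) for one individual
    effect_sizes -- per-variant effect estimates from a GWAS (hypothetical values here)
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * beta for d, beta in zip(dosages, effect_sizes))


# Toy example with three variants (made-up numbers).
score = polygenic_score([2, 1, 0], [0.5, -0.2, 0.1])
```

Because the weights are estimated in the GWAS cohort, a score built this way inherits whatever is specific to that cohort, which is one way to picture why accuracy can drop in a different population.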
Polygenic Scores for Height in Admixed Populations (Bárbara D. Bitarello, Iain Mathieson)
What is ancestry? (Iain Mathieson, Aylwyn Scally)
Phylogenetics and the likelihood gradient with Xiang Ji
In this episode, we chat about phylogenetics with Xiang Ji. We start with a
general introduction to the field and then go deeper into the likelihood-based
methods (maximum likelihood and Bayesian inference). In particular, we talk
about the different ways to calculate the likelihood gradient, including a
linear-time exact gradient algorithm recently published by Xiang and his colleagues.
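To make "likelihood gradient" concrete, here is a minimal sketch for the simplest possible case: two sequences separated by a single branch of length t under the Jukes-Cantor (JC69) model. This is not the linear-time algorithm from the paper; it only shows what an exact gradient with respect to a branch length looks like and how it can be checked against a finite-difference approximation. The function names and site counts are hypothetical.

```python
import math


def jc69_log_likelihood(t, n_same, n_diff):
    """Log-likelihood of a pairwise alignment under JC69 with branch length t.

    n_same -- number of sites where the two sequences agree
    n_diff -- number of sites where they differ
    """
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e  # P(same base after time t)
    p_diff = 0.25 - 0.25 * e  # P(one specific different base after time t)
    # The factor 0.25 is the stationary frequency of the starting base.
    return n_same * math.log(0.25 * p_same) + n_diff * math.log(0.25 * p_diff)


def jc69_gradient(t, n_same, n_diff):
    """Exact derivative of the log-likelihood with respect to t."""
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e
    p_diff = 0.25 - 0.25 * e
    d_same = -e       # d/dt of p_same
    d_diff = e / 3.0  # d/dt of p_diff
    return n_same * d_same / p_same + n_diff * d_diff / p_diff


def numerical_gradient(t, n_same, n_diff, h=1e-6):
    """Central finite-difference approximation, for comparison."""
    up = jc69_log_likelihood(t + h, n_same, n_diff)
    down = jc69_log_likelihood(t - h, n_same, n_diff)
    return (up - down) / (2.0 * h)
```

On a real tree, the finite-difference approach needs extra likelihood evaluations for every branch, which is the kind of cost an exact gradient algorithm is meant to avoid.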
Gradients Do Grow on Trees: A Linear-Time O(N)-Dimensional Gradient for Statistical Phylogenetics
(Xiang Ji, Zhenyu Zhang, Andrew Holbrook, Akihiko Nishimura, Guy Baele, Andrew Rambaut, Philippe Lemey, Marc A Suchard)
BEAGLE: the package that implements the gradient algorithm
BEAST: the program that implements the Hamiltonian Monte Carlo sampler and the molecular clock models
Seeding methods for read alignment with Markus Schmidt
In this episode, Markus Schmidt explains how seeding in read alignment works.
We define and compare k-mers, minimizers, MEMs, SMEMs, and maximal spanning seeds.
Markus also presents his recent work on computing variable-sized seeds (MEMs,
SMEMs, and maximal spanning seeds) from fixed-sized seeds (k-mers and
minimizers) and his Modular Aligner.
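For readers new to fixed-size seeds, the following sketch enumerates k-mers and picks minimizers, taken here as the smallest k-mer in each window of w consecutive k-mers. It assumes plain lexicographic ordering and leftmost tie-breaking for simplicity; practical tools often order k-mers by a hash instead. The function names are hypothetical and this is not code from the Modular Aligner.

```python
def kmers(seq, k):
    """All overlapping substrings of length k, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def minimizers(seq, k, w):
    """Sorted (position, k-mer) pairs chosen as the minimum of each window.

    Each window covers w consecutive k-mers; ties go to the leftmost k-mer.
    """
    km = kmers(seq, k)
    selected = set()
    for start in range(len(km) - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (km[i], i))
        selected.add((best, km[best]))
    return sorted(selected)


# "ACGTAC" has the 2-mers AC, CG, GT, TA, AC at positions 0..4.
print(minimizers("ACGTAC", k=2, w=3))  # [(0, 'AC'), (1, 'CG'), (4, 'AC')]
```

Adjacent windows frequently share the same minimum, so the selected set is typically smaller than the full k-mer list while still sampling every window.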
A performant bridge between fixed-size and variable-size seeding
(Arne Kutzner, Pok-Son Kim, Markus Schmidt)
MA the Modular Aligner
Calibrating Seed-Based Heuristics to Map Short Reads With Sesame
(Guillaume J. Filion, Ruggero Cortini, Eduard Zorita) — another
interesting recent work on seeding methods (though we didn’t get to discuss
it in this episode)
Real-time quantitative proteomics with Devin Schweppe
In this episode, Jacob Schreiber interviews Devin Schweppe about
the analysis of mass spectrometry data in the field of proteomics. They begin
by delving into the different types of mass spectrometry methods, including MS1,
MS2, and MS3, and the reasons for using each. They then discuss a recent paper
from Devin, “Full-Featured, Real-Time Database Searching Platform Enables Fast
and Accurate Multiplexed Quantitative Proteomics”, which involved building a
real-time system for quantifying proteomic samples from MS3, and the types of
analyses that this system allows one to do.
Full-Featured, Real-Time Database Searching Platform Enables Fast and Accurate Multiplexed Quantitative Proteomics (Devin K. Schweppe, Jimmy K. Eng, Qing Yu, Derek Bailey, Ramin Rad, Jose Navarrete-Perea, Edward L. Huttlin, Brian K. Erickson, Joao A. Paulo, and Steven P. Gygi)
Benchmarking the Orbitrap Tribrid Eclipse for Next Generation Multiplexed Proteomics (Qing Yu, Joao A Paulo, Jose Navarrete-Perea, Graeme C McAlister, Jesse D Canterbury, Derek J Bailey, Aaron M Robitaille, Romain Huguet, Vlad Zabrouskov, Steven P Gygi, Devin K Schweppe)
Improved Monoisotopic Mass Estimation for Deeper Proteome Coverage (Ramin Rad, Jiaming Li, Julian Mintseris, Jeremy O’Connell, Steven P. Gygi, and Devin K. Schweppe)
Schweppe Lab Website (Hiring!)
How 23andMe finds identical-by-descent segments with William Freyman
In this episode, Will Freyman talks about identity-by-descent (IBD): how
it’s used at 23andMe, and how the templated
positional Burrows-Wheeler transform can find IBD segments in the presence of
genotyping and phasing errors.
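As background, the ordinary positional Burrows-Wheeler transform (PBWT) keeps haplotypes sorted by their reversed prefixes, so haplotypes that match over a long stretch ending at the current site end up next to each other. The sketch below shows only that basic ordering step on binary haplotypes; it is not the templated variant from the paper and does no error handling for genotyping or phasing errors. The function name and toy data are hypothetical.

```python
def positional_prefix_order(haplotypes):
    """Order haplotype indices by reversed prefix after all sites are processed.

    haplotypes -- list of equal-length lists of 0/1 alleles
    At each site, a stable partition moves haplotypes carrying allele 0
    ahead of those carrying allele 1, preserving the previous relative order.
    """
    order = list(range(len(haplotypes)))
    n_sites = len(haplotypes[0]) if haplotypes else 0
    for site in range(n_sites):
        zeros = [h for h in order if haplotypes[h][site] == 0]
        ones = [h for h in order if haplotypes[h][site] == 1]
        order = zeros + ones
    return order


# Four toy haplotypes over three sites.
haps = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
]
print(positional_prefix_order(haps))  # [3, 0, 1, 2]
```

In the output, haplotypes 0 and 1 are adjacent because they agree on the last two sites, which is the property that makes scanning neighbors in this order a fast way to find long shared segments.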
Fast and robust identity-by-descent inference with the templated positional Burrows-Wheeler transform
(William A. Freyman, Kimberly F. McManus, Suyash S. Shringarpure, Ethan M. Jewett, Katarzyna Bryc, the 23andMe Research Team, Adam Auton)
Basset and Basenji with David Kelley
In this episode, Jacob Schreiber interviews David Kelley about
machine learning models that can yield insight into the consequences of
mutations on the genome. They begin their discussion by talking about
Calico Labs, and then delve into a series of papers that David has
written about using models, named Basset and Basenji, that connect genome
sequence to functional activity and so can be used to quantify the effect of mutations.
Basset: Learning the regulatory code of the accessible genome with deep convolutional neural networks (David R. Kelley, Jasper Snoek, and John Rinn)
Sequential regulatory activity prediction across chromosomes with convolutional neural networks (David R. Kelley, Yakir A. Reshef, Maxwell Bileschi, David Belanger, Cory Y. McLean, and Jasper Snoek)
Cross-species regulatory sequence activity prediction (David R. Kelley)
Basenji GitHub Repo