Open Source Sports

Ron Yurko

Ron Yurko and Kostas Pelechrinis host the 'Open Source Sports' podcast to serve as a public reading group for discussing the latest research in sports analytics. Each episode focuses on a single paper featuring authors as guests, with discussions about the statistical methodology, relevance and future directions of the research. opensourcesports.substack.com

Episodes

  1. An Examination of Sport Climbing with Quang Nguyen

    05/18/2022

    We discuss An Examination of Olympic Sport Climbing Competition Format and Scoring System with Quang Nguyen (@qntkhvn). This paper won the Carnegie Mellon Sports Analytics Conference Reproducible Research Competition in November 2021.

    Quang Nguyen completed his Master of Science in Applied Statistics at Loyola University Chicago in 2021. He recently spent the Spring 2022 semester working as an instructor in the Dept of Mathematics and Statistics at Loyola. Quang previously completed his undergraduate degree in Mathematics and Data Science at Wittenberg University in Springfield, Ohio. Quang's current interests include statistics in sports, data science, statistics and data science education, and reproducibility. He is a die-hard supporter of Manchester United F.C. of the English Premier League. And last but not least, Quang is excited to join the Dept of Statistics and Data Science at CMU as a first-year PhD student this coming Fall 2022.

    For additional references mentioned in the show:

    Quang's blog posts: https://qntkhvn.netlify.app/blog.html
    Code for paper: https://github.com/qntkhvn/climbing
    Inducing Any Feasible Level of Correlation to Bivariate Data With Any Marginals
    R copula package: https://cran.r-project.org/web/packages/copula/index.html and book: http://copula.r-forge.r-project.org/book/
    UConn Sports Analytics Symposium (UCSAS)
    CRAN Task View for Sports Analytics: https://cran.r-project.org/web/views/SportsAnalytics.html

    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit opensourcesports.substack.com

    60 min
  2. Grinding the Mocks with Benjamin Robinson

    08/24/2021

    We discuss Grinding the Bayes: A Hierarchical Modeling Approach to Predicting the NFL Draft with Benjamin Robinson (@benj_robinson). This paper was a finalist in the Carnegie Mellon Sports Analytics Conference Reproducible Research Competition in October 2020. You can submit an abstract to enter the 2021 Reproducible Research Competition now!

    Benjamin Robinson is a data scientist living in Washington, D.C. and the creator of Grinding the Mocks, where since 2018 he has used mock drafts, the wisdom of crowds, and data science to predict the NFL Draft. He is a 2012 graduate of the University of Pittsburgh with degrees in Economics and Urban Studies and earned a Master of Public Policy degree from the University of Southern California in 2014. You can follow him on Twitter @benj_robinson and find the Grinding the Mocks project at grindingthemocks.com and @GrindingMocks.

    For additional references mentioned in the show:

    Ben's Bitbucket repository of data: https://bitbucket.org/benjamin_robinson/grindingthebayes/
    Bayesian modeling in R with the brms package: https://paul-buerkner.github.io/brms/
    CMSAC Reproducible Competition abstract submission: http://stat.cmu.edu/cmsac/conference/2021/#mu-research
    Saiem Gilani's (@SaiemGilani) collection of software: https://sportsdataverse.org/

    1h 6m
  3. Bang the can slowly with Ryan Elmore and Gregory J. Matthews

    12/12/2020

    We discuss Bang the Can Slowly: An Investigation into the 2017 Houston Astros with Ryan Elmore (@rtelmore) and Gregory J. Matthews (@StatsInTheWild). This paper was the winner of the Carnegie Mellon Sports Analytics Conference Reproducible Research Competition in October 2020.

    Ryan Elmore is an Assistant Professor in the Department of Business Information and Analytics in the Daniels College of Business at the University of Denver (DU). He earned his Ph.D. in statistics at Penn State University and worked as a Senior Scientist at the National Renewable Energy Laboratory prior to DU. He has over 20 peer-reviewed publications in outlets such as Journal of the American Statistical Association, Biometrika, The American Statistician, Big Data, Journal of Applied Statistics, and Journal of Sports Economics, among others. He is currently an Associate Editor for the Journal of Quantitative Analysis in Sports and recently organized the conference “Rocky Mountain Symposium on Analytics in Sports” hosted at DU.

    Gregory Matthews completed his Ph.D. in statistics at the University of Connecticut in 2011. From 2011 to 2014, he was a post-doc in the School of Public Health at the University of Massachusetts-Amherst. Since 2014, he has been a professor of statistics at Loyola University Chicago. He was promoted to Associate Professor with tenure in March 2020.

    For additional references mentioned in the show:

    Tony Adams' (@adams_at) Houston Astros trash can banging data website: http://signstealingscandal.com/
    Ryan and Greg's GitHub repository with code and data: https://github.com/gjm112/Astros_sign_stealing
    The causal effect of a timeout at stopping an opposing run in the NBA by Connor Gibbs (@cgibbs_10), Ryan Elmore, and Bailey Fosdick (@baileyfosdick)

    1h 17m
  4. Player Chemistry in Soccer with Lotte Bransen

    09/13/2020

    We discuss 'Player Chemistry: Striving for a Perfectly Balanced Soccer Team' with Lotte Bransen. This paper builds on the VAEP framework previously introduced by Lotte and her colleagues in order to quantify player chemistry. Our discussion covers details of the paper along with general challenges of estimating player chemistry in soccer and other sports, as well as the importance of interpretable machine learning.

    Lotte Bransen (@LotteBransen) is a Lead Data Scientist at SciSports, where she leads the Data Analytics team that develops analytical tools to derive actionable insights from soccer data. An avid soccer player herself, Lotte primarily works on developing machine learning models to measure the impact of soccer players’ in-game actions and decisions on the courses and outcomes of matches. Prior to SciSports, Lotte obtained a Master of Science degree in Econometrics & Management Science from Erasmus University Rotterdam and a Bachelor of Science degree in Mathematics from Utrecht University.

    References:

    'Player Chemistry: Striving for a Perfectly Balanced Soccer Team' - https://arxiv.org/pdf/2003.01712.pdf
    'Actions Speak Louder than Goals: Valuing Player Actions in Soccer' - https://arxiv.org/pdf/1802.07127.pdf
    'Wide Open Spaces: A statistical technique for measuring space creation in professional soccer' - http://www.sloansportsconference.com/wp-content/uploads/2018/03/1003.pdf
    Interpretable Machine Learning - https://christophm.github.io/interpretable-ml-book/
    The San Francisco 49ers recently hired Harvard Biostatistics PhD Matt Ploenzke (@MPloenzke), whose thesis was on 'Interpretable Machine Learning Methods with Applications in Genomics'

    36 min
  5. Models for hockey player ratings with Andrew Thomas and Sam Ventura

    08/10/2020

    In the third episode of the show we discuss 'Competing process hazard function models for player ratings in ice hockey' with two guests, Andrew Thomas and Sam Ventura. The discussion ranges from paper details to thoughts on modeling in hockey and sports in general.

    Andrew Thomas (@acthomasca) is the Director of Data Science for SMT (SportsMEDIA Technology), and former lead hockey researcher for the Minnesota Wild. He received his PhD in Statistics at Harvard University.

    Sam Ventura is the Director of Hockey Research for the Pittsburgh Penguins, and an affiliated faculty member at Carnegie Mellon's Statistics & Data Science department, where he received his PhD in Statistics. Along with Andrew, he is the co-creator of war-on-ice.com and nhlscrapr. Additionally, he is the co-creator of nflscrapr with Maksim Horowitz and Ron Yurko, which no longer works...

    Additional resources mentioned include:

    Previous work by Brian Macdonald, e.g. https://arxiv.org/abs/1201.0317
    Total Hockey Rating by Michael Schuckers and James Curro
    Asmae Toumi - 'From Grapes and Prunes to Apples and Apples: Using Matched Methods to Estimate Optimal Zone Entry Decision-Making in the National Hockey League'
    And check out recent work by Micah Blake McCurdy and Evolving Wild

    1h 14m
  6. Rao-Blackwellizing FG% with Daniel Daly-Grafstein

    06/15/2020

    In the second episode we discuss two papers by our guest Daniel Daly-Grafstein and Luke Bornn: Rao-Blackwellizing field goal percentage (published in JQAS and available at: http://www.lukebornn.com/papers/dalygrafstein_jqas_2019.pdf) and Using In-Game Shot Trajectories to Better Understand Defensive Impact in the NBA (available at: https://arxiv.org/pdf/1905.00822.pdf).

    Daniel is currently a soccer data analyst at Sportlogiq, a sports AI company that, in soccer, focuses on generating tracking data using computer vision. The papers discussed in this episode were part of Daniel’s Master's degree in statistics at Simon Fraser University. In the fall Daniel is going to be starting his PhD in Statistics at the University of British Columbia.

    Additional resources mentioned in the show:

    Daniel's GitHub repository: https://github.com/danieldalygrafstein/nba-raoblackwellizing-field-goal
    Sloan conference papers by: (1) Rachel Marty and Simon Lucey: A data-driven method for understanding and increasing 3-point shooting percentage (http://www.sloansportsconference.com/wp-content/uploads/2017/02/1505.pdf) and (2) Rachel Marty: High-resolution shot capture reveals systematic biases and an improved method for shooter evaluation (http://www.sloansportsconference.com/wp-content/uploads/2018/02/1005.pdf)
    Also you should read the Wikipedia page on the Rao-Blackwell theorem: https://en.wikipedia.org/wiki/Rao%E2%80%93Blackwell_theorem

    44 min
4.8 out of 5 (13 Ratings)
