31 episodes

A podcast about all things data, brought to you by data scientist Hugo Bowne-Anderson.
It's time for more critical conversations about the challenges in our industry in order to build better compasses for the solution space! To this end, this podcast will consist of long-format conversations between Hugo and other people who work broadly in the data science, machine learning, and AI spaces. We'll dive deep into all the moving parts of the data world, so if you're new to the space, you'll have an opportunity to learn from the experts. And if you've been around for a while, you'll find out what's happening in many other parts of the data world.

Vanishing Gradients, by Hugo Bowne-Anderson

    • Technology
    • 5.0 • 10 Ratings


    Episode 31: Rethinking Data Science, Machine Learning, and AI


    Hugo speaks with Vincent Warmerdam, a senior data professional and machine learning engineer at :probabl, the exclusive brand operator of scikit-learn. Vincent is known for challenging common assumptions and exploring innovative approaches in data science and machine learning.


    In this episode, they dive deep into rethinking established methods in data science, machine learning, and AI, exploring Vincent's principled approach to the field, including:



    The critical importance of exposing yourself to real-world problems before applying ML solutions
    Framing problems correctly and understanding the data generating process
    The power of visualization and human intuition in data analysis
    Questioning whether algorithms truly meet the actual problem at hand
    The value of simple, interpretable models and when to consider more complex approaches
    The importance of UI and user experience in data science tools
    Strategies for preventing algorithmic failures by rethinking evaluation metrics and data quality
    The potential and limitations of LLMs in the current data science landscape
    The benefits of open-source collaboration and knowledge sharing in the community


    Throughout the conversation, Vincent illustrates these principles with vivid, real-world examples from his extensive experience in the field. They also discuss Vincent's thoughts on the future of data science and his call to action for more knowledge sharing in the community through blogging and open dialogue.


    LINKS



    The livestream on YouTube
    Vincent's blog
    CalmCode
    scikit-lego
    Vincent's book Data Science Fiction (WIP)
    The Deon Checklist, an ethics checklist for data scientists
    Of oaths and checklists, by DJ Patil, Hilary Mason and Mike Loukides
    Vincent's Getting Started with NLP and spaCy course on Talk Python
    Vincent on Twitter
    :probabl. on Twitter
    Vincent's PyData Amsterdam Keynote "Natural Intelligence is All You Need [tm]"
    Vincent's PyData Amsterdam 2019 talk: The profession of solving (the wrong problem)
    Vanishing Gradients on Twitter
    Hugo on Twitter


    Check out and subscribe to our lu.ma calendar for upcoming livestreams!

    • 1 hr 36 min
    Episode 30: Lessons from a Year of Building with LLMs (Part 2)


    Hugo speaks about Lessons Learned from a Year of Building with LLMs with Eugene Yan from Amazon, Bryan Bischof from Hex, Charles Frye from Modal, Hamel Husain from Parlance Labs, and Shreya Shankar from UC Berkeley.


    These five guests, along with Jason Liu who couldn't join us, have spent the past year building real-world applications with Large Language Models (LLMs). They've distilled their experiences into a report of 42 lessons across operational, strategic, and tactical dimensions, and they're here to share their insights.


    We’ve split this roundtable into 2 episodes and, in this second episode, we'll explore:



    An inside look at building end-to-end systems with LLMs;
    The experimentation mindset: Why it's the key to successful AI products;
    Building trust in AI: Strategies for getting stakeholders on board;
    The art of data examination: Why looking at your data is more crucial than ever;
    Evaluation strategies that separate the pros from the amateurs.


    Although we're focusing on LLMs, many of these insights apply more broadly to data science, machine learning, and product development.


    LINKS



    The livestream on YouTube
    The Report: What We’ve Learned From A Year of Building with LLMs
    About the Guests/Authors -- connect with them all on LinkedIn, follow them on Twitter, subscribe to their newsletters! (Seriously, though, the amount of collective wisdom here is 🤑)
    Your AI product needs evals by Hamel Husain
    Prompting Fundamentals and How to Apply them Effectively by Eugene Yan
    F**k You, Show Me The Prompt by Hamel Husain
    Vanishing Gradients on YouTube
    Vanishing Gradients on Twitter
    Vanishing Gradients on Lu.ma

    • 1 hr 15 min
    Episode 29: Lessons from a Year of Building with LLMs (Part 1)


    Hugo speaks about Lessons Learned from a Year of Building with LLMs with Eugene Yan from Amazon, Bryan Bischof from Hex, Charles Frye from Modal, Hamel Husain from Parlance Labs, and Shreya Shankar from UC Berkeley.


    These five guests, along with Jason Liu who couldn't join us, have spent the past year building real-world applications with Large Language Models (LLMs). They've distilled their experiences into a report of 42 lessons across operational, strategic, and tactical dimensions, and they're here to share their insights.


    We’ve split this roundtable into 2 episodes and, in this first episode, we'll explore:



    The critical role of evaluation and monitoring in LLM applications and why they're non-negotiable, including "evals" - short for evaluations, which are automated tests for assessing LLM performance and output quality;
    Why data literacy is your secret weapon in the AI landscape;
    The fine-tuning dilemma: when to do it and when to skip it;
    Real-world lessons from building LLM applications that textbooks won't teach you;
    The evolving role of data scientists and AI engineers in the age of AI.


    Although we're focusing on LLMs, many of these insights apply more broadly to data science, machine learning, and product development.


    LINKS



    The livestream on YouTube
    The Report: What We’ve Learned From A Year of Building with LLMs
    About the Guests/Authors -- connect with them all on LinkedIn, follow them on Twitter, subscribe to their newsletters! (Seriously, though, the amount of collective wisdom here is 🤑)
    Your AI product needs evals by Hamel Husain
    Prompting Fundamentals and How to Apply them Effectively by Eugene Yan
    F**k You, Show Me The Prompt by Hamel Husain
    Vanishing Gradients on YouTube
    Vanishing Gradients on Twitter
    Vanishing Gradients on Lu.ma

    • 1 hr 30 min
    Episode 28: Beyond Supervised Learning: The Rise of In-Context Learning with LLMs


    Hugo speaks with Alan Nichol, co-founder and CTO of Rasa, where they build software to enable developers to create enterprise-grade conversational AI and chatbot systems across industries like telcos, healthcare, fintech, and government.


    What's super cool is that Alan and the Rasa team have been doing this type of thing for over a decade, giving them a wealth of wisdom on how to effectively incorporate LLMs into chatbots - and how not to. For example, if you want a chatbot that takes specific and important actions like transferring money, do you want to fully entrust the conversation to one big LLM like ChatGPT, or secure what the LLMs can do inside key business logic?


    In this episode, they also dive into the history of conversational AI and explore how the advent of LLMs is reshaping the field. Alan shares his perspective on how supervised learning has failed us in some ways and discusses what he sees as the most overrated and underrated aspects of LLMs.


    Alan offers advice for those looking to work with LLMs and conversational AI, emphasizing the importance of not sleeping on proven techniques and looking beyond the latest hype. In a live demo, he showcases Rasa's CALM (Conversational AI with Language Models), which allows developers to define business logic declaratively and separate it from the LLM, enabling reliable execution of conversational flows.


    LINKS



    The livestream on YouTube
    Alan's Rasa CALM Demo: Building Conversational AI with LLMs
    Alan on Twitter
    Rasa
    CALM, an LLM-native approach to building reliable conversational AI
    Task-Oriented Dialogue with In-Context Learning
    'We don’t know how to build conversational software yet' by Alan Nichol
    Vanishing Gradients on Twitter
    Hugo on Twitter


    Upcoming Livestreams



    Lessons from a Year of Building with LLMs
    VALIDATING THE VALIDATORS with Shreya Shankar

    • 1 hr 5 min
    Episode 27: How to Build Terrible AI Systems


    Hugo speaks with Jason Liu, an independent consultant who uses his expertise in recommendation systems to help fast-growing startups build out their RAG applications. He was previously at Meta and Stitch Fix, is the creator of Instructor and Flight, and is an ML and data science educator.


    They talk about how Jason approaches consulting for companies across many industries, including construction and sales, in building production LLM apps, his playbook for getting ML and AI up and running to build and maintain such apps, and the future of tooling to do so.


    They take an inverted thinking approach, envisaging all the failure modes that would result in building terrible AI systems, and then figure out how to avoid such pitfalls.


    LINKS



    The livestream on YouTube
    Jason's website
    Pydantic is All You Need, Jason's Keynote at AI Engineer Summit, 2023
    How to build a terrible RAG system by Jason
    Express interest in Jason's Systematically Improving RAG Applications course
    Vanishing Gradients on Twitter
    Hugo on Twitter


    Upcoming Livestreams



    Good Riddance to Supervised Learning with Alan Nichol (CTO and co-founder, Rasa)
    Lessons from a Year of Building with LLMs

    • 1 hr 32 min
    Episode 26: Developing and Training LLMs From Scratch


    Hugo speaks with Sebastian Raschka, a machine learning & AI researcher, programmer, and author. As Staff Research Engineer at Lightning AI, he focuses on the intersection of AI research, software development, and large language models (LLMs).


    How do you build LLMs? How can you use them, both in prototype and production settings? What are the building blocks you need to know about?


    In this episode, we’ll tell you everything you need to know about LLMs but were too afraid to ask: the entire LLM lifecycle, the skills you need to work with them, the resources and hardware required, prompt engineering vs. fine-tuning vs. RAG, how to build an LLM from scratch, and much more.


    The idea here is not that you’ll need to use an LLM you’ve built from scratch, but that we’ll learn a lot about LLMs and how to use them in the process.


    Near the end we also did some live coding to fine-tune GPT-2 in order to create a spam classifier!


    LINKS



    The livestream on YouTube
    Sebastian's website
    Machine Learning Q and AI: 30 Essential Questions and Answers on Machine Learning and AI by Sebastian
    Build a Large Language Model (From Scratch) by Sebastian
    PyTorch Lightning
    Lightning Fabric
    LitGPT
    Sebastian's notebook for finetuning GPT-2 for spam classification!
    The end of fine-tuning: Jeremy Howard on the Latent Space Podcast
    Our next livestream: How to Build Terrible AI Systems with Jason Liu
    Vanishing Gradients on Twitter
    Hugo on Twitter

    • 1 hr 51 min

Customer Reviews

5.0 out of 5
10 Ratings

vishalthatsme ,

Best data science podcast to come out in a while

[see title]
