135 episodes

Weekly talks and fireside chats about everything that has to do with the new space emerging around DevOps for Machine Learning aka MLOps aka Machine Learning Operations.

MLOps.community Demetrios Brinkmann

    • Technology

    Data Quality Over Quantity or Data Selection for Data-Centric AI // Cody Coleman // Coffee Sessions #59

    Coffee Sessions #59 with Cody Coleman, Data Quality Over Quantity or Data Selection for Data-Centric AI.

    // Abstract
    Big data has been critical to many of the successes in ML, but it brings its own problems. Working with massive datasets is cumbersome and expensive, especially with unstructured data like images, videos, and speech. Careful data selection can mitigate the pains of big data by focusing computational and labeling resources on the most valuable examples.  

    Cody Coleman, a recent Ph.D. from Stanford University and founding member of MLCommons, joins us to describe how a more data-centric approach that focuses on data quality rather than quantity can lower the AI/ML barrier. Instead of managing clusters of machines and setting up cumbersome labeling pipelines, you can spend more time tackling real problems.
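
To make "focusing labeling resources on the most valuable examples" concrete, here is a minimal sketch of uncertainty sampling, a generic active-learning heuristic. It is not the method from the episode or the linked papers. The example IDs, the probability vectors, and the function names `entropy` and `select_most_uncertain` are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, budget):
    """Return the IDs of the `budget` examples the model is least sure about.

    `predictions` is a list of (example_id, class_probabilities) pairs.
    """
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [example_id for example_id, _ in ranked[:budget]]


# Hypothetical model outputs on unlabeled images (assumed values, not real data).
predictions = [
    ("img_001", [0.98, 0.01, 0.01]),
    ("img_002", [0.40, 0.35, 0.25]),
    ("img_003", [0.70, 0.20, 0.10]),
    ("img_004", [0.34, 0.33, 0.33]),
]

# With a labeling budget of 2, only the two most ambiguous examples are sent to annotators.
to_label = select_most_uncertain(predictions, budget=2)
print(to_label)
```

The confident prediction (`img_001`) is skipped, so the labeling budget goes to the examples where a label is likely to teach the model the most.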

    // Bio
    Cody Coleman recently finished his Ph.D. in CS at Stanford University, where he was advised by Professors Matei Zaharia and Peter Bailis. His research spans from performance benchmarking of hardware and software systems (e.g., DAWNBench and MLPerf) to computationally efficient methods for active learning and core-set selection. His work has been supported by the NSF GRFP, the Stanford DAWN Project, and the Open Phil AI Fellowship.

    // Relevant Links
    [preprint] Similarity Search for Efficient Active Learning and Search of Rare Concepts: [https://arxiv.org/abs/2007.00077](https://arxiv.org/abs/2007.00077)

    [video] Similarity Search for Efficient Active Learning and Search of Rare Concepts: [https://www.youtube.com/watch?v=vRVyOEK2JUU](https://www.youtube.com/watch?v=vRVyOEK2JUU)

    [blog post] Selection via Proxy: Efficient Data Selection for Deep Learning: [https://dawn.cs.stanford.edu/2020/04/23/selection-via-proxy/](https://dawn.cs.stanford.edu/2020/04/23/selection-via-proxy/)

    [slides] The DAWN of MLPerf: [https://drive.google.com/file/d/17ZpX0GOtOXG8QMn6KEc_Le8tUfDBlgDE/view](https://drive.google.com/file/d/17ZpX0GOtOXG8QMn6KEc_Le8tUfDBlgDE/view)

    [blog post] About Cody's research: [https://hai.stanford.edu/news/cody-coleman-lowering-machine-learnings-barriers-help-people-tackle-real-problems](https://hai.stanford.edu/news/cody-coleman-lowering-machine-learnings-barriers-help-people-tackle-real-problems)

    [video] About Cody: [https://www.youtube.com/watch?v=stxJMsxxxtA](https://www.youtube.com/watch?v=stxJMsxxxtA)

    --------------- ✌️Connect With Us ✌️ -------------
    Join our slack community: https://go.mlops.community/slack
    Follow us on Twitter: @mlopscommunity
    Sign up for the next meetup: https://go.mlops.community/register

    Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
    Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
    Connect with Cody on LinkedIn: https://www.linkedin.com/in/codyaustun/

    • 1 hr 11 min
    10 Types of Features your Location ML Model is Missing // Anne Cocos // Coffee Sessions #58

    Coffee Sessions #58 with Anne Cocos, 10 Types of Features your Location ML Model is Missing.

    // Abstract
    Machine learning on geographic data is relatively under-studied in comparison to ML on other formats like images or graphs. But geographic data is prevalent across a wide variety of domains (although many practitioners may not think of it that way). Clearly, any dataset with `latitude` and `longitude` columns can be viewed as geographic data, but also any dataset with a `zipcode`, `city`, `address`, or `county` can be construed as geographic. Demographics, weather, foot traffic, points of interest, and topographic features can all be used to enrich a dataset with any of these types of keys.

    Incorporating relatively straightforward geographic features into models can yield substantial improvements; adding "distance to the beach" or "square mileage reachable within 10 min drive" to a real estate pricing model, for example, can lead to significant decreases in model error.

    Unfortunately, many ML teams find it difficult to incorporate these types of geographic data into their models because the process of ingesting from geographic formats (geojson or shapefiles), projecting, and properly joining with their existing data can be a large infrastructure lift.

    In this coffee session, Anne discusses ways to simplify the process of incorporating geographic or location data into the MLOps workflow, as well as interesting trends in the geographic ML research community that will ultimately make it easier for us to learn from geography just as we do with images or graphs today.
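
As an illustration of the "distance to the beach" feature mentioned in the abstract, here is a minimal sketch that enriches rows keyed by `latitude` and `longitude` with a great-circle distance to the nearest reference point. It uses the haversine formula on a spherical Earth, so it skips the projection and geographic-format ingestion steps the abstract describes. The beach coordinates, the listing rows, and the `km_to_beach` column name are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def distance_to_nearest_km(lat, lon, points):
    """Distance from (lat, lon) to the closest point in `points`."""
    return min(haversine_km(lat, lon, p_lat, p_lon) for p_lat, p_lon in points)


# Hypothetical reference points standing in for a real coastline dataset.
beach_points = [(39.3643, -74.4229), (38.9351, -74.9060)]

# Hypothetical real-estate rows with the usual geographic key columns.
listings = [
    {"id": "a", "latitude": 39.9526, "longitude": -75.1652},
    {"id": "b", "latitude": 39.3700, "longitude": -74.4300},
]

for row in listings:
    row["km_to_beach"] = distance_to_nearest_km(
        row["latitude"], row["longitude"], beach_points
    )
    print(row["id"], round(row["km_to_beach"], 1))
```

A brute-force scan over reference points is fine for a handful of rows. At scale, a spatial index or a geospatial join library would be the more realistic choice.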

    // Bio
    Dr. Anne Cocos currently leads data science and machine learning at Ask Iggy, Inc., a venture-backed, seed round startup focused on location analytics. Her team builds tools that make it simple for data scientists to leverage location information in their models and analyses. Previously she was the Director and Head, NLP and Knowledge Graph at GlaxoSmithKline, where she built algorithms and infrastructure to enable GSK’s scientists to leverage all the world’s written biomedical knowledge for drug discovery. She also worked on applied natural language processing research at The Children’s Hospital of Philadelphia Department of Biomedical Informatics. Anne completed her Ph.D. in computer science at the University of Pennsylvania, where she was supported by the Google Ph.D. Fellowship and the Allen Institute for Artificial Intelligence Key Scientific Challenges award.

    Before shifting her career toward artificial intelligence, Anne spent several years as an end-user of early ML-powered technologies in the U.S. Navy and at HelloWallet. Her previous degrees are from the U.S. Naval Academy, Royal Holloway University of London, and Oxford University. She currently lives just outside Philadelphia with her husband and three boys.

    --------------- ✌️Connect With Us ✌️ -------------

    Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
    Connect with Anne on LinkedIn: https://www.linkedin.com/in/annecocos/

    • 55 min
    The Future of ML and Data Platforms // Michael Del Balso - Erik Bernhardsson // CoffeeSessions #57

    Coffee Sessions #57 with Michael Del Balso and Erik Bernhardsson, The Future of ML and Data Platforms.

    // Abstract
    Machine learning, data analytics, and software engineering are converging as data-intensive systems become more ubiquitous. Erik Bernhardsson, ex-CTO at Better and former Spotify machine learning lead, and Mike Del Balso, CEO at Tecton, former Uber machine learning lead, and co-creator of Michelangelo, sit down to chat with us today.

    These two jammed with us about building machine learning platform systems and teams, the modern operational data stack and how it allows more machine learning applications to thrive, and how to successfully take advantage of data in the process of building products and companies.


    // Bio
    Michael Del Balso
    Mike is the co-founder of Tecton, where he is focused on building next-generation data infrastructure for Operational ML. Before Tecton, Mike was the PM lead for the Uber Michelangelo ML platform. He was also a product manager at Google where he managed the core ML systems that power Google’s Search Ads business. Before that, he worked on Google Maps. He holds a BSc in Electrical and Computer Engineering summa cum laude from the University of Toronto.

    Erik Bernhardsson
    Erik has been working on some crazy data stuff since early 2021. Previously, he spent 6 years as the CTO of Better.com, growing the tech team from 1 to 300. Before Better, Erik spent 6 years at Spotify, building the music recommendation system and managing a team focused on machine learning.

    // Relevant Links
    Building a Data Team at a Mid-stage Startup: A Short Story
    https://erikbern.com/2021/07/07/the-data-team-a-short-story.html

    --------------- ✌️Connect With Us ✌️ -------------

    Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
    Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
    Connect with Mike on LinkedIn: https://www.linkedin.com/in/michaeldelbalso/
    Connect with Erik on LinkedIn: https://www.linkedin.com/in/erikbern

    Timestamps:
    [01:12] Introduction to Michael Del Balso and Erik Bernhardsson
    [03:23] High-level space in data
    [07:25] Complexity in the data world
    [09:13] Data lake + Databricks
    [15:20] Platform strategy
    [16:05] "Platform is when the economic value of everybody that uses this exceeds the value of the company that creates it." - Bill Gates
    [18:17] Centralizing platforms
    [21:06] Team spin up centralization or decentralization
    [27:18] Manifestations of being too far from a centralized and decentralized platform
    [29:24] Centralized vs Decentralized
    [33:33] Platform value and appropriate sizing
    [35:43] Building a Data Team at a Mid-stage Startup: A Short Story blog post by Erik Bernhardsson
    [38:51] Machine Learning as a sub-problem of Data
    [42:16] Operational ML
    [46:30] Spotify recommendations
    [47:13] Real-time data flows at Spotify
    [49:40] Data stack, Machine Learning stack, and Back-end stack reusability
    [51:40] Container management

    • 55 min
    A Few Learnings from Building a Bootstrapped MLOps Services Startup // Soumanta Das // Coffee Sessions #56

    Soumanta wouldn't claim they've reached where they want to and they're still learning, so he's happy sharing successes as well as failures at Yugen.ai.

    // Abstract
    Determining Minimum Achievable Goals helps Yugen.ai keep the focus on added value and impact before diving deep into solutions and building ML systems. In this episode, Soumanta discusses balancing ML development against ops and monitoring efforts while scaling, plus their focus on improvements in small sprints.

    // Bio
    Soumanta is a Co-founder at Yugen.ai, an early-stage startup in the Data Science and MLOps space.

    We imagine the future to be shaped by the convergence and simultaneous adoption of Algorithms, Engineering and Ops, and Responsible AI. Our mission is to help effectuate and expedite the same for our client partners by creating large-scale, reliable, and personalized ML Systems.

    // Relevant Links
    A blog Soumanta wrote when Yugen turned one https://medium.com/swlh/yugen-ai-turns-one-1089f3bf169

    Presentation at ML REPA 2021. Title of the talk: Reducing the distance between Prototyping and Production, Why obsessing over experimentation and iteration compounds ROIs

    Slides - https://drive.google.com/file/d/1J9Cv6IPPkGpOTq8Xl_AQCKaR0-pKMUmA/view?usp=sharing  
    Video - https://youtu.be/4PEbgQTw1W0

    --------------- ✌️Connect With Us ✌️ -------------

    Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
    Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
    Connect with Soumanta on LinkedIn: https://www.linkedin.com/in/soumanta-das/

    Timestamps:
    [00:00] Introduction to Soumanta Das
    [00:24] What's Yugen.ai's name all about?
    [02:02] Starting during the pandemic
    [05:13] Determination to continue during the pandemic
    [08:02] State of the art in Yugen.ai and its future
    [11:32] Time to value defining ML to a business
    [13:01] Building a strong ML engineering culture
    [19:06] Data scientists patterns  
    [20:00] Helper functions  
    [22:45] Code review
    [25:32] Repeatable use cases
    [27:48] Minimum achievable goals
    [30:30] Production management goals
    [34:30] Use cases and System design document
    [36:20] Practices that helped Yugen.ai build ML systems  
    [40:05] Growing pains in the scaling process
    [43:54] Yugen.ai war stories
    [46:50] Not realizing something is wrong when something actually is wrong
    [48:10] Data observability tools
    [49:42] Hands-on deck

    • 52 min
    Learning and Teaching MLOps Applications // Salwa Muhammad // MLOps Coffee Sessions #55

    Coffee Sessions #55 with Salwa Muhammad, Learning and Teaching MLOps Applications.  

    // Abstract
    Salwa shared her perspective on how FourthBrain and all learners can keep their education strategy fresh enough for the current zeitgeist. Furthermore, Salwa, Demetrios, and Vishnu talked about principles of effective learning that are important to keep in mind while embarking on any educational journey.  

    This was a great conversation with a lot of practical tips that we hope you all listen to!

    // Bio
    Salwa Nur Muhammad is the Founder/CEO of FourthBrain, an AI/ML education startup backed by Andrew Ng's AI Fund. FourthBrain trains Machine Learning engineers through hybrid 2-3 month cohort-based programs that combine the accountability of weekly instructor-led live sessions with the flexibility of online content.

    Salwa founded FourthBrain after executive leadership roles at Udacity and Trilogy Education Services (acquired by 2U Inc).   

    She has over 10 years of experience leveraging technology to develop scalable education programs at higher-ed institutions and ed-tech companies, building new business units, launching international programs, and hiring and training cross-functional teams.

    // Relevant Links
    https://www.fourthbrain.ai/

    --------------- ✌️Connect With Us ✌️ -------------

    Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
    Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
    Connect with Salwa on LinkedIn: https://www.linkedin.com/in/salwanur/

    Timestamps:
    [00:00] Introduction to Salwa Muhammad
    [01:20] Salwa's journey in tech
    [05:30] Advice to new ML engineers
    [10:21] Curriculum development process
    [17:36] FourthBrain's current status and what's next
    [21:53] Hardest piece in the course
    [24:49] Knowing the right job in a role-confused world
    [30:05] Needing to upskill without going insane
    [35:10] Generalist vs Specialist on T-shaped Analogy
    [41:15] Counseling learners in terms of long-term progression
    [43:00] MLOps trajectories recommendation

    • 48 min
    Machine Learning SRE // Niall Murphy // MLOps Coffee Sessions #54

    Coffee Sessions #54 with Niall Murphy, Machine Learning SRE.

    // Abstract
    SRE is making its way into the machine learning world. Software engineering for machine learning requires reliability, performance, and maintainability. Site reliability engineering is the field that deals with reliability and ensuring constant, real-time performance. Niall Murphy, most recently Global Head of SRE at Microsoft Azure, helps us understand what SRE can do for modern ML products and teams.

    Building machine learning teams requires a diverse set of technical experiences, and Niall shares his thoughts on how to do that most effectively. Machine learning organizations need to start to take advantage of SRE best practices like SLOs, which Niall walks through. Production machine learning depends on high-quality software engineering, and we get Niall's take on how to ensure that in a machine learning context.
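
For readers new to SLOs, here is a minimal sketch of the error-budget arithmetic behind them, applied to a hypothetical model-serving endpoint. The 99.9% target, the request counts, and the definition of a "failed" request (for example, an error or a prediction slower than a latency threshold) are assumptions for illustration, not details from the episode.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of the error budget still unspent over a measurement window.

    slo_target is the required success ratio, e.g. 0.999 for "99.9% of
    prediction requests succeed". A negative result means the SLO was missed.
    """
    allowed_failures = (1 - slo_target) * total_requests
    if allowed_failures == 0:
        # A 100% target (or an empty window) leaves no budget to spend.
        return 1.0 if failed_requests == 0 else 0.0
    return 1 - failed_requests / allowed_failures


# Hypothetical 30-day window for a prediction service.
remaining = error_budget_remaining(
    slo_target=0.999,
    total_requests=1_000_000,
    failed_requests=400,
)
print(f"{remaining:.0%} of the error budget remains")
```

With these assumed numbers, roughly 1,000 failures are allowed and 400 have occurred, so about 60% of the budget is left for risky changes such as model rollouts.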

    // Bio
    Niall Murphy has been interested in Internet infrastructure since the mid-1990s. He has worked with all of the major cloud providers from their Dublin, Ireland offices - most recently at Microsoft, where he was global head of Azure Site Reliability Engineering (SRE). His books have sold approximately a quarter of a million copies worldwide, most notably the award-winning Site Reliability Engineering, and he is probably one of the few people in the world to hold degrees in Computer Science, Mathematics, and Poetry Studies. He lives in Dublin, Ireland, with his wife and two children.

    --------------- ✌️Connect With Us ✌️ -------------

    Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
    Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
    Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
    Connect with Niall on LinkedIn: https://www.linkedin.com/in/niallm/

    Timestamps:
    [00:00] Introduction to Niall Murphy
    [00:36] SRE background to Machine Learning space transition
    [07:10] SLOs being a challenge in the ML space
    [09:42] SRE Hiring Investments
    [15:10] Behavior of teams concept
    [17:45] Challenges dealing with ML production
    [18:27] Update on Reliable Machine Learning book
    [22:46] Monitoring
    [25:05] Difference between ML and SRE
    [29:18] Incident response in Machine Learning
    [34:46] Rollbacks
    [35:50] Machine Learning burden over time
    [42:42] Niall's journey to the SRE space and focus to develop himself

    • 48 min
