347 episodes

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

Data Engineering Podcast Tobias Macey

    • Technology
    • 4.7 • 111 Ratings

    Supporting And Expanding The Arrow Ecosystem For Fast And Efficient Data Processing At Voltron Data

    The data ecosystem has been growing rapidly, with new communities joining and bringing their preferred programming languages to the mix. This has led to inefficiencies in how data is stored, accessed, and shared across process and system boundaries. The Arrow project is designed to eliminate wasted effort in translating between languages, and Voltron Data was created to help grow and support its technology and community. In this episode Wes McKinney shares the ways that Arrow and its related projects are improving the efficiency of data systems and driving their next stage of evolution.

    • 50 min
    Analyze Massive Data At Interactive Speeds With The Power Of Bitmaps Using FeatureBase

    The most expensive part of working with massive data sets is the work of retrieving and processing the files that contain the raw information. FeatureBase (formerly Pilosa) avoids that overhead by converting the data into bitmaps. In this episode Matt Jaffee explains how to model your data as bitmaps and the benefits that this representation provides for fast aggregate computation. He also discusses the improvements that have been incorporated into FeatureBase to simplify integration with the rest of your data stack, and the SQL interface that was added to make working with the product easier.

    • 59 min
    A Look At The Data Systems Behind The Gameplay For League Of Legends

    The majority of blog posts and presentations about data engineering and analytics assume that the consumers of those efforts are internal business users accessing an environment controlled by the business. In this episode Ian Schweer shares his experiences at Riot Games supporting player-focused features such as machine learning models and recommender systems that are deployed as part of the game binary. He explains the constraints that he and his team face and the various challenges that they have overcome to build useful data products on top of a legacy platform where they don't control the end-to-end systems.

    • 1 hr 1 min
    Tame The Entropy In Your Data Stack And Prevent Failures With Sifflet

    The problems that are easiest to fix are the ones that you prevent from happening in the first place. Sifflet is a platform that brings your entire data stack into focus to improve the reliability of your data assets and empower collaboration across your teams. In this episode CEO and founder Salma Bakouk shares her views on the causes and impacts of "data entropy" and how you can tame it before it leads to failures.

    • 46 min
    Build Data Products Without A Data Team Using AgileData

    Building data products is an undertaking that has historically required substantial investments of time and talent. With the rise in cloud platforms and self-serve data technologies, the barrier to entry is dropping. Shane Gibson co-founded AgileData to make analytics accessible to companies of all sizes. In this episode he explains the design of the platform and how it builds on agile development principles to help you focus on delivering value.

    • 1 hr 12 min
    Taking A Look Under The Hood At CreditKarma's Data Platform

    CreditKarma builds data products that help consumers take advantage of their credit and financial capabilities. To make that possible they need a reliable data platform that empowers all of the organization's stakeholders. In this episode Vishnu Venkataraman shares the journey that he and his team have taken to build and evolve their systems and improve the product offerings that they are able to support.

    • 52 min

Customer Reviews

4.7 out of 5
111 Ratings

SteveT3ch

Best Data Engineering Podcast

Found this podcast by accident and now can’t do without it. Very knowledgeable host and guests.

LisaIsHereForIt

Incredible insights!💥

No matter the topic, you’re guaranteed to gain something from every episode - can’t recommend Data Engineering enough. 🙌

ASobering

Such a wealth of knowledge! 🧠

Got a question about anything “data engineering?”

Tobias has got you covered. 😎

Whether you’re well established as an engineer, or just getting started in your career, this is a must-listen podcast for you! Tobias does an incredible job leading engaging conversations with industry leaders who’ve actually experienced success themselves and every. single. episode. is jam-packed with helpful takeaways. Highly recommend listening and subscribing!
