241 episodes

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

Data Engineering Podcast Tobias Macey

    • Technology
    • 5.0 • 1 Rating

    Exploring Processing Patterns For Streaming Data Integration In Your Data Lake

    One of the perennial challenges posed by data lakes is how to keep them up to date as new data is collected. With the improvements in streaming engines, it is now possible to perform all of your data integration in near real time, but it can be challenging to understand the proper processing patterns to make that performant. In this episode, Ori Rafael shares his experiences from Upsolver and from building scalable stream processing for integrating and analyzing data, and explains the tradeoffs when coming from a batch-oriented mindset.

    • 52 min
    Laying The Foundation Of Your Data Platform For The Era Of Big Complexity With Dagster

    The technology for scaling storage and processing of data has gone through massive evolution over the past decade, leaving us with the ability to work with massive datasets at the cost of massive complexity. Nick Schrock created the Dagster framework to help tame that complexity and scale the organizational capacity for working with data. In this episode he shares the journey that he and his team at Elementl have taken to understand the state of the ecosystem and how they can provide a foundational layer for a holistic data platform.

    • 1 hr 5 min
    Data Quality Starts At The Source

    The most important gauge of success for a data platform is the level of trust in the accuracy of the information that it provides. In order to build and maintain that trust it is necessary to invest in defining, monitoring, and enforcing data quality metrics. In this episode Michael Harper advocates for proactive data quality and starting with the source, rather than being reactive and having to work backwards from when a problem is found.

    • 58 min
    Eliminate Friction In Your Data Platform Through Unified Metadata Using OpenMetadata

    A significant source of friction and wasted effort in building and integrating data management systems is the fragmentation of metadata across various tools. After experiencing the impacts of fragmented metadata and previous attempts at building a solution, Suresh Srinivas and Sriharsha Chintalapani created the OpenMetadata project. In this episode they share the lessons that they have learned through their previous attempts and the positive impact that a unified metadata layer had during their time at Uber. They also explain how the OpenMetadata project is aiming to be a common standard for defining and storing metadata for every use case in data platforms, and the ways that they are architecting the reference implementation to simplify its adoption. This is an ambitious and exciting project, so listen and try it out today.

    • 1 hr 6 min
    Business Intelligence Beyond The Dashboard With ClicData

    Business intelligence is often equated with a collection of dashboards that show various charts and graphs representing data for an organization. What is overlooked in that characterization is the level of complexity and effort that are required to collect and present that information, and the opportunities for providing those insights in other contexts. In this episode Telmo Silva explains how he co-founded ClicData to bring full featured business intelligence and reporting to every organization without having to build and maintain that capability on their own. This is a great conversation about the technical and organizational operations involved in building a comprehensive business intelligence system and the current state of the market.

    • 1 hr 2 min
    Exploring The Evolution And Adoption of Customer Data Platforms and Reverse ETL

    The precursor to widespread adoption of cloud data warehouses was the creation of customer data platforms. Acting as a centralized repository of information about how your customers interact with your organization, they drove a wave of analytics about how to improve products based on actual usage data. A natural outgrowth of that capability is the more recent growth of reverse ETL systems that use those analytics to feed back into the operational systems used to engage with the customer. In this episode Tejas Manohar and Rachel Bradley-Haas share the story of their own careers and experiences coinciding with these trends. They also discuss the current state of the market for these technological patterns and how to take advantage of them in your own work.

    • 1 hr 2 min
