15 episodes

A podcast about Apache Airflow, an open source workflow management system that lets you define data pipelines in Python. Produced with love by the team at Astronomer.

The Airflow Podcast Astronomer

    • Technology

    Expanding the Data Engineering Toolkit at Reddit

    Welcome back to the Airflow Podcast.

    This week, we met up with Ben Wisegarver, a staff data scientist at Reddit who runs their data warehousing and data engineering functions.

    Reddit users generate petabytes of data every day that needs to be processed, stored, and analyzed by a wide breadth of backend services. Our conversation with Ben touches on everything from Airflow as a tool for career mobility across the data stack to scaling out a self-service data architecture across many teams.

    For folks interested, our team at Astronomer is growing rapidly and we're on the hunt for new folks to join in a variety of different roles. If you're passionate about Airflow and interested in building the future of data engineering, please get in touch. You can check our current job postings at careers.astronomer.io, but we're constantly updating our listings to accommodate new hiring needs. Please feel free to email me directly at pete@astronomer.io if you're passionate about what we're doing and think you'd be a good addition to the team.

    Mentioned Resources:

    Careers: https://careers.astronomer.io

    Guest Profile:

    Ben Wisegarver: https://www.linkedin.com/in/ben-wisegarver-54566576

    • 45 min
    GDPR, Self-Service Data, and Infrastructure Automation with Typeform

    Welcome back to the Airflow Podcast.

    This week, we met up with Albert Franzi and Carlos Escura from Typeform. Typeform is a tool that allows you to build beautiful interactive forms that you can use for a wide variety of use cases, including customer surveys, employee engagement, product feedback, and market research to name a few. In our conversation, we discussed Airflow as a tool for GDPR compliance, the concept of self-service data and how it allows your data operations team to function as a data platform team, and some of the more specialized infrastructure tooling that the Typeform team has built out to support their internal teams.

    For folks interested, our team at Astronomer is growing rapidly and we're on the hunt for new folks to join in a variety of different roles. If you're passionate about Airflow and interested in building the future of data engineering, please get in touch. You can check our current job postings at careers.astronomer.io, but we're constantly updating our listings to accommodate new hiring needs. Please feel free to email me directly at pete@astronomer.io if you're passionate about what we're doing and think you'd be a good addition to the team.

    Mentioned Resources:
    Dag Factory: https://github.com/ajbosco/dag-factory
    Astronomer Careers: https://careers.astronomer.io

    Guest Profiles:
    Albert Franzi: https://www.linkedin.com/in/albertfranzi/?originalSubdomain=es
    Carlos Escura: https://www.linkedin.com/in/carlosescura/en-us/

    • 31 min
    Adopting Airflow at Netlify

    After a bit of a break, we're back with the third official episode bundle of The Airflow Podcast. In this batch, we'll go a little deeper with current Airflow users and maintainers on fundamental concepts in data engineering, architectures for operating modern data platforms at scale, and the process of maintaining and operating Airflow, specifically as we go through the release process of Airflow 2.0.

    This week, we met up with Brian de la Motte and Florian Hines at Netlify. Netlify provides an extremely popular toolset for building and deploying JAMstack sites. They provide hosting services, CI, DNS, authentication, and managed backend tools that help users run and operate static sites at scale. The team over there recently adopted Airflow to help decouple orchestration logic from a complex collection of Spark jobs and is currently in the process of expanding its Airflow footprint to accommodate a broader group of interesting use cases.

    Disclaimer: we get a bit of a surprise about halfway through the episode when Brian tells us that they had recently signed up for Astronomer. We promise that it wasn't a planted ad :).

    Please contact pete@astronomer.io if you'd like to get in touch regarding future episodes. Hope you enjoy!

    Guest Profiles:
    Brian de la Motte: https://www.linkedin.com/in/brian-de-la-motte/
    Florian Hines: https://www.linkedin.com/in/florianhines/

    • 28 min
    The Road to Airflow 2.0

    This week, we linked up with Airflow release manager, core committer, and Astronomer platform engineer Ash Berlin-Taylor to discuss the Airflow 2.0 roadmap [1]. There is some great stuff in the works around performance, autoscaling, and usability that we're excited about. In this episode, Ash shares his thoughts on the design, implementation, and value of the upcoming features, including:
    - The Knative Executor
    - A modern and real-time UI
    - A production-grade API
    - Improved scheduler and webserver performance
    - An official production Docker image for Airflow

    We hope you enjoy! Please email pete@astronomer.io if you have thoughts on topics you'd like to see covered in future episodes.

    Separately, some good folks from the Airflow community are running a user survey that will help collect some useful information around the Airflow UX. If you have five minutes to spare, filling out the following form will help the core Airflow committers to shape the project roadmap: https://forms.gle/XAzR1pQBZiftvPQM7

    [1] https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+2.0

    • 32 min
    Airflow Breeze

    This week, we had the pleasure of meeting up with Jarek Potiuk, Principal Software Engineer at Polidea and Apache Airflow committer, to discuss his most recent contribution to the community, Airflow Breeze. Jarek deeply values developer productivity. While building a team of Airflow committers, he realized that opening a PR on the project meant getting unit tests to pass and waiting on the CI build, a cumbersome process that could take up to a few hours. Breeze seeks to improve that experience for Airflow committers and lower the barrier to entry for folks who are new to the open-source community.

    You can read more about Airflow Breeze here: https://www.polidea.com/blog/its-a-breeze-to-develop-apache-airflow/#the-apache-airflow-projects-setup

    • 46 min
    Open Source and Airflow at Google

    This episode kicks off season 2 of The Airflow Podcast. In this next season, we'll focus on the future of Airflow and chat with leading members of the community to paint a picture of what's to come. We're pumped to be diving back into this project and look forward to the great conversations we have lined up.

    This week, we chatted with James Malone, Product Manager of Google's Cloud Composer. James had some interesting things to say about open source at Google and where his team plans on contributing most to the project going forward.

    As always, thanks for listening and please email pete@astronomer.io if you have any feedback or would like to be considered as a guest.

    • 39 min
