Hello everyone and welcome to the very first season of Modern Data Show, a podcast where you can hear stories from data practitioners and enthusiasts, the folks in the arena, on their journey of building and operating a modern data stack. I’m your host, Aayush, and we’re super happy you’re here.
In this first season of our show, we’ll be talking about all things ETL: how organisations at a large scale manage their ETL processes, the tools and technologies they use, common pitfalls, and much more.
S02 E15 From Data Source to API in Minutes with Matteo Pelati and Vivek Gudapuri, founders at Dozer
Prepare to be amazed in this episode as Matteo Pelati and Vivek Gudapuri, the brilliant minds behind Dozer, reveal their experience in pushing the boundaries of data management and analysis. By simplifying the process of data serving and allowing companies to create APIs quickly and efficiently, Dozer's approach sets them apart from the modern data stack. Their open-source approach allows developers to build custom operators and extend connectors, ensuring that Dozer can cover a wide range of use cases while still offering customization at each step. They also discuss the challenges they faced during the development of Dozer and how they are positioned to adapt to upcoming trends and developments in real-time data processing.
S02 E14 Transforming Data Pipelines for the Future: An Interview with Sean Knapp, CEO of Ascend.io
Uncover the secret to turning data engineering into a superpower! Sean Knapp, the CEO and founder of Ascend.io, joined us to discuss the value of depth and breadth in capturing the entire data value chain, emphasizing the need for an automation layer to adapt to the evolving data landscape. Ascend's platform enables intelligent data pipeline creation and management, with a dynamic control plane that detects and responds to changes in real time across extensive pipeline networks. Sean also explored the potential of generative AI in data engineering and shared his optimism about the future of the modern data stack, foreseeing consolidation and the emergence of new parallel spaces in the data ecosystem.
S02 E13 Building a Data-Driven Fashion Empire: The Zalando Data Foundation Story with Dr. Alexander Borek, Director of Data and Analytics at Zalando
Step into the world of Zalando, Europe's leading online fashion retailer, where data drives innovation and enhances the customer experience. In this episode, join us as we interview Dr. Alexander Borek, the brilliant mind behind Zalando's data and analytics strategy. Discover how Dr. Borek and his team have revolutionized the company's approach to data by implementing the cutting-edge concept of data mesh. Learn how Zalando successfully strikes the perfect balance between decentralization and structure, unleashing the full potential of data while maintaining collaboration with various business units. Dr. Borek also unveils the secrets to leveraging data for innovation and value creation in the dynamic world of online fashion. Tune in now for an eye-opening exploration of data management, leadership, and the future of data-driven decision-making at Zalando.
S02 E12 Unveiling Twilio’s Data Transformation: A Journey into Modern Data Stack with Don Oriti, Head of Data Platform and Engineering at Twilio
Twilio has built an open-source data lake using AWS technologies and Databricks, processing billions of events daily through their Kafka environment. They aim to provide a cohesive view of data across platforms and enable other businesses to use data wherever they want. Don Oriti, the Head of Data Platform and Engineering at Twilio, shares insights into Twilio's data stack in the latest episode of the Modern Data Show. The conversation covers the stack end to end: data ingestion through Kafka or CDC for Aurora databases, storage in S3, high-level aggregation and curation using Spark, and supporting tools such as Kudu, Reverse ETL, data governance, cataloging, and BI.
S02 E11 The Reverse ETL Revolution: Overcoming Challenges in Syncing Live Data to SaaS Tools with Tejas Manohar, Co-founder and Co-CEO at Hightouch
Has your business ever struggled to sync live data to its sales, marketing, and customer success tools? This is where Hightouch comes in: a Reverse ETL platform that syncs data from a data warehouse to SaaS tools in minutes. It enables businesses to get accurate customer data quickly without requiring engineering effort or manual work. In this episode, Tejas Manohar shared his journey from developing games at a young age to becoming the Co-founder and Co-CEO of Hightouch. He provided valuable insights into Hightouch's internal connector framework, which automatically performs tasks like change data capture and batching and provides methods to send rows that may need to be retried in future syncs. He also talked about Hightouch's two new products and the future of reverse ETL.
S02 E10 From On-Prem to the Cloud: Managing ClickHouse with DoubleCloud with Natalia Shuliak, COO at DoubleCloud
When working with open-source technologies, you benefit from the community's creations, but you also take on a lot of admin and support work: the technologies tend to break, and supporting them usually falls on you. This is where DoubleCloud's platform comes into the picture. In this latest episode of the Modern Data Show, Natalia Shuliak talks about how DoubleCloud saves you from administrative work and allows you to focus on data pipeline development and management, while providing backup, security, and support.