Leveraging Airflow To Build Scalable and Reliable Data Platforms at 99acres.com with Samyak Jain

The Data Flowcast: Mastering Airflow for Data Engineering & AI

Data orchestration is evolving rapidly, with dynamic workflows becoming the cornerstone of modern data engineering. In this episode, we are joined by Samyak Jain, Senior Software Engineer - Big Data at 99acres.com. Samyak shares insights from his journey with Apache Airflow, exploring how his team built a self-service platform that enables non-technical teams to launch data pipelines and marketing campaigns seamlessly.

Key Takeaways:

(02:02) Starting a career in data engineering by troubleshooting Airflow pipelines.

(04:27) Building self-service portals with Airflow as the backend engine.

(05:34) Utilizing API endpoints to trigger dynamic DAGs with parameterized templates.

(09:31) Managing a dynamic environment with over 1,400 active DAGs.

(11:14) Implementing fault tolerance by segmenting data workflows into distinct layers.

(14:15) Tracking and optimizing query costs in AWS Athena to save $7K monthly.

(16:22) Automating cost monitoring with real-time alerts for high-cost queries.

(17:15) Streamlining Airflow metadata cleanup to prevent performance bottlenecks.

(21:30) Efficiently handling one-time and recurring marketing campaigns using Airflow.

(24:18) Advocating for Airflow features that improve resource management and ownership tracking.

Resources Mentioned:

Samyak Jain - https://www.linkedin.com/in/samyak-jain-ab5830169/

99acres.com - https://www.linkedin.com/company/99acres/

Apache Airflow - https://airflow.apache.org/

AWS Athena - https://aws.amazon.com/athena/

Kafka - https://kafka.apache.org/

Thanks for listening to “The Data Flowcast: Mastering Airflow for Data Engineering & AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

#AI #Automation #Airflow #MachineLearning
