
Postgres vs. Elasticsearch: The Unexpected Winner in High-Stakes Search for Instacart with Ankit Mittal
In this episode of The Data Engineering Show, Benjamin Wagner sits down with Ankit Mittal, former Senior Engineer at Instacart, to explore how they revolutionized their search infrastructure by transitioning from Elasticsearch to PostgreSQL. Learn how Instacart tackled the unique challenges of fast-moving grocery inventory, achieved high-performance search capabilities, and leveraged PostgreSQL extensions for complex retrieval operations. Whether you're scaling search functionality or optimizing database performance, this deep dive offers valuable insights into building robust, production-ready search systems using PostgreSQL.
- Discover why Instacart moved from Elasticsearch to PostgreSQL for retailer search
- Learn about handling real-time inventory updates and search optimization
- Explore PostgreSQL extensions, sharding strategies, and data flow architecture
- Understand the trade-offs between different search infrastructure approaches
What You'll Learn:
- How Instacart managed fast-moving grocery inventory data by consolidating search, ranking, and filtering into a single PostgreSQL cluster
- Why pushing compute closer to the data layer can significantly improve search performance and reduce network calls
- The architecture decisions behind using PostgreSQL extensions like pgvector and custom solutions for search functionality
- How to implement efficient data ingestion through S3-based pipelines and bulk writes instead of real-time updates
- Why table maintenance operations like pg_repack are crucial for optimizing read throughput in production environments
- The trade-offs between traditional search engines and relational databases for complex search implementations
- The challenges of maintaining self-hosted PostgreSQL in a predominantly cloud-managed environment
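As a rough illustration of the first two points above (consolidating retrieval and filtering in one database, and pushing compute closer to the data), the sketch below contrasts a retrieve-then-filter flow with a single joined query. It uses Python's built-in `sqlite3` module purely as a stand-in for PostgreSQL. The table names, columns, sample rows, and function names are hypothetical and are not Instacart's schema or code.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE availability (
        product_id INTEGER, store_id INTEGER, in_stock INTEGER, price REAL
    );
    INSERT INTO products VALUES
        (1, 'organic milk'), (2, 'oat milk'), (3, 'milk chocolate'), (4, 'bread');
    INSERT INTO availability VALUES
        (1, 10, 1, 4.99), (2, 10, 0, 3.49), (3, 10, 0, 2.25),
        (3, 20, 1, 2.25), (4, 10, 1, 3.00);
""")


def search_two_step(term, store_id):
    """Retrieve text matches first, then filter by availability in extra calls."""
    candidates = conn.execute(
        "SELECT id, name FROM products WHERE name LIKE ?", (f"%{term}%",)
    ).fetchall()
    results = []
    for product_id, name in candidates:
        # One additional round trip per candidate; most may be discarded.
        row = conn.execute(
            "SELECT price FROM availability "
            "WHERE product_id = ? AND store_id = ? AND in_stock = 1",
            (product_id, store_id),
        ).fetchone()
        if row:
            results.append((name, row[0]))
    return sorted(results), len(candidates)


def search_single_query(term, store_id):
    """Let the database join, filter, and order in one call."""
    return conn.execute(
        """
        SELECT p.name, a.price
        FROM products AS p
        JOIN availability AS a ON a.product_id = p.id
        WHERE p.name LIKE ? AND a.store_id = ? AND a.in_stock = 1
        ORDER BY p.name
        """,
        (f"%{term}%", store_id),
    ).fetchall()


two_step_results, candidates_fetched = search_two_step("milk", 10)
single_results = search_single_query("milk", 10)
print(candidates_fetched, two_step_results, single_results)
```

Both functions return the same rows, but the two-step version fetches every text match before discarding the unavailable ones, while the single query does that filtering where the data lives.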
About the Guest(s)
Ankit is a Software Engineer at ParadeDB and former Senior Engineer at Instacart, where he specialized in PostgreSQL infrastructure and search systems. With extensive experience in database optimization and search architecture, he played a key role in modernizing Instacart's search infrastructure by transitioning from Elasticsearch to a custom PostgreSQL solution. In this episode, Ankit shares deep insights into building and scaling high-performance search systems for e-commerce, particularly focusing on the unique challenges of grocery retail's fast-moving inventory. His work at Instacart revolutionized their single-retailer search functionality, demonstrating how traditional relational databases can be adapted for complex search operations. His expertise in database systems and their practical applications in high-scale environments makes this conversation particularly valuable for engineers interested in modern search architecture and database optimization.
Quotes
"Think about it. If there's a lot of things that you can get the database to do, then the applications become simpler." - Ankit
"My non-Instacart experience has largely been in pre-PMF startups where the approach of abuse your database to its absolute limits works wonders." - Ankit
"Almost everything that we got retrieved had to be filtered out. So we go back to Elasticsearch again." - Ankit
"We traded off the quality of retrieval, hardcore core retrieval, with the whole system reducing the network calls." - Ankit
"It's a place to go to find what item is available, in what store, what item is available, at what price, including full product taxonomy graph and product and ontology." - Ankit
"The grand theme here is that we wanted more control over the cluster, how to spin it off, what kind of disks it would have." - Ankit
"We tell teams who want to have their data in this cluster, create an s3 home, create either a bucket or a home, whatever they want to do, and tell us that we would sync ourselves." - Ankit
"What we found is that the read throughput, we can throw more data if the tables are repacked nicely." - Ankit
"Most engineers who want to work on search, they are more used to the Elasticsearch shape of the query." - Ankit
"The relevance is better because they could join more things in the database. They also saw the cost of the normalized data reduced." - Ankit
Resources
Company Websites:
- Instacart - Grocery delivery platform
- ParadeDB - Database technology company
- Firebolt - Cloud data warehouse (firebolt.io)
Tools & Technologies:
- PostgreSQL - Database system
- Elasticsearch - Search engine
- PgCat/PgDog - PostgreSQL proxy tools
- pgvector - PostgreSQL vector extension
- pg_repack - PostgreSQL table repacking tool
- ClickHouse - Column-oriented DBMS
- Tantivy - Rust-based search engine library
Articles:
- Instacart Search Modernization Blog Posts (Series on hybrid retrieval)
- Target's AlloyDB Migration Blog Post
For Feedback & Discussions on Firebolt Core:
- Join Firebolt Discord Community
- Join Firebolt GitHub Discussions
- Firebolt Core GitHub Repository
- Benjamin@Firebolt.io
Primary Speakers:
- Ankit Mittal
- Benjamin Wagner
Previous guests include: Joseph Machado of LinkedIn, Matthew Weingarten of Disney, Joe Reis and Matt Housley, authors of The Fundamentals of Data Engineering, Zach Wilson of Eczachly Inc, Megan Lieu of Deepnote, Erik Heintare of Bolt, Lior Solomon of Vimeo, Krishna Naidu of Canva, Mike Cohen of Substack, Jens Larsson of Ark, Gunnar Tangring of Klarna, Yoav Shmaria of Similarweb and Xiaoxu Gao of Adyen.
Check out our three most downloaded episodes:
- Zach Wilson on What Makes a Great Data Engineer
- Joe Reis and Matt Housley on The Fundamentals of Data Engineering
- Bill Inmon, The Godfather of Data Warehousing
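Information
- Show: The Data Engineering Show
- Frequency: Updated every two months
- Published: September 17, 2025, 16:32 UTC
- Length: 22 minutes
- Episode: 46
- Rating: Clean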