TopicPartition

2minutestreaming

Technology

An in-depth engineering podcast about Apache Kafka

Jun 15

in 2026, we can't afford schemaless

Introducing Omnigraph - an open source, schema-first, lakehouse-native context graph engine built on Rust, Lance, Apache Arrow and DataFusion. It is S3-native infrastructure built to solve the problem of agent coordination.

Omnigraph caught my attention, so I brought its two authors onto the podcast: Andrew Altshuler and Ragnor Comerford. It was a densely packed conversation that covered much of modern-day AI engineering, including:
- why schema-first is a must for agents
- the importance of guardrails in autonomous workloads (schemas, tests, linters, policies, and type systems)
- what Omnigraph is, and how it's being built/dogfooded at Modern Relay / how Slack + Linear start to break down with many parallel agents
- how Lance, DataFusion, Arrow, and object storage fit perfectly into the open AI data stack
- why proper agent coordination looks more like decentralized decision making than central planning
- the importance of good taste (when agents can implement almost anything)
- why AI massively compresses business timelines
- and a lot more

0:00 Why build a graph database for agents?
5:43 Why not Postgres? (or any relational database)
17:03 The composable "company brain" substrate for agents
20:51 Need for guardrails for agents (e.g. type safety)
27:00 Importance of Schemas
33:48 NoSQL vs SQL
42:46 Lance, DataFusion, and Arrow as the open stack
51:00 What Modern Relay and Omnigraph are
52:13 Branches: GitHub for agent-written data
1:00:59 Slack Agents, the Dependency Graph and decoupling for parallelization
1:12:32 Why Graphs are great and a 2yr prediction
1:17:32 Centralization vs decentralization for long-horizon coordination

------------------------------------
MODERN RELAY
https://modernrelay.com/

ANDREW
• GitHub: https://github.com/aaltshuler
• X: https://x.com/1eo/
• LinkedIn: https://www.linkedin.com/in/aaltshuler/

RAGNOR
• GitHub: https://github.com/ragnorc
• X: https://x.com/ragnorco

------------------------------------
TRANSCRIPT
Paste into your favorite LLM:
- https://gist.github.com/stanislavkozlovski/70d5663466f19a247b336437ea2a968d
- https://gist.githubusercontent.com/stanislavkozlovski/70d5663466f19a247b336437ea2a968d/raw/003ae11c2e70fba76ccaad182e17c9b9389b3db9/modern_relay.md

------------------------------------
If you found anything useful from this episode, please consider supporting our growth (so we can continue delivering valuable content). You can do this by simply liking the video. It takes 2 seconds to do, and recording/producing this takes us 8hrs+.

1h 31m
May 23

exe.dev shows agents need VMs, not containers (with David Crawshaw)

Introducing exe - a brand-new cloud provider (literally 6 months old).

In this interview I talk to David Crawshaw, CEO & co-founder of exe.dev. David has a ton of systems programming experience, given he co-founded Tailscale and was its CTO. Before that, he made impactful contributions during his 8-year stint at Google, where he worked on Golang and petabyte-scale log processing and implemented TCP/IP in Fuchsia (an OS by Google).

Now he's building exe - a novel, incredibly dev-friendly cloud that lets you spin up VMs instantly at no extra cost. Users purchase resources (CPU, memory, disk) and share them seamlessly across VMs. In this interview, we discuss how the game has changed with AI, what current clouds lack, and why VMs (not Docker) are the right abstraction for sandboxing, experimentation and toy projects.

What surprised me most when I tried exe.dev was how polished it looked for its age. It went public less than 6 months ago.

--------------------------------------------------------------------
TIMELINE
0:00 why build a new cloud?
2:07 why Docker isn't enough for agents
12:28 AI-friendly is developer-friendly
20:32 why VMs are the right abstraction
28:30 the exorbitant price of IOPS in the cloud
32:21 Cloud Discounts
33:40 rise of Self hosting
41:25 Shelly and AI ops agents
48:10 the hard problem with AI SREs
53:00 Parting Thoughts (and early EC2’s noisy neighbor shenanigans)

-----------------------------------------
DAVID
You can find David on:
- LinkedIn: https://www.linkedin.com/in/crawshaw/
- X: https://x.com/davidcrawshaw
- GitHub: https://github.com/crawshaw

--------------------------------------------------------------------
EXE.DEV
You can try the product here for 7 days, no credit card required: http://exe.dev/
I recommend it, the UX is great.
Here's an example of how I created a Kafka cluster: https://x.com/kozlovski/status/2050943790580457646/video/1 or https://www.linkedin.com/feed/update/urn:li:activity:7456699132351582208/

--------------------------------------------------------------------
TRANSCRIPT
Feed this into your favorite AI for summarization, or to ask it specific questions:
https://gist.githubusercontent.com/stanislavkozlovski/265dcf06d9f51721e7fd2b38a39593ca/raw/642b9c528fac5d1f09446e1412827801fd98cbd5/foo.md

--------------------------------------------------------------------
OTHER PLATFORMS
Watch on YouTube here: https://youtu.be/1GX18UGoJRw
Apple Podcasts: https://podcasts.apple.com/us/podcast/topicpartition/id1814926834
General RSS: https://anchor.fm/s/104fd76e0/podcast/rss

57 min
Apr 9

postgres can be your data lake (with pg_lake)

This is an engineering conversation around pg_lake - a new OSS Postgres extension that lets you query and manage Iceberg tables directly from Postgres. Marco Slot, who has EXTENSIVE experience, shares with us various engineering internals, like:
• how pg_lake makes analytics (literally) 100x faster
• why Postgres is architecturally terrible at analytical queries (and how vectorized execution fixes this)
• how (and why) pg_lake intercepts query plans and delegates parts of the query tree to DuckDB
• Marco's hard-won experience through a decade+ career in Postgres
• versatility as the real moat of Postgres
• the practical differences in engineering b/w OLTP and OLAP
• and a lot more

--------------------------------------------------------------------
*TIMELINE*
0:02 What is pg_lake?
2:23 Postgres' 100x-slower problem and the columnar storage experiments made to speed up Postgres for analytics
6:00 practical examples and internals
16:20 perf internals - vectorized execution & CPU Optimization
23:00 pg_lake architecture (why DuckDB isn't embedded) and the connection-per-process issue
29:16 how pg_lake intercepts the query plan tree and delegates parts to DuckDB
41:09 Iceberg catalogs
48:24 postgres to iceberg ingestion patterns (and pg_incremental)
53:40 Marco's (long) career: early AWS, Citus, Microsoft, Crunchy Data & Snowflake
1:04:20 Marco's observations around the merging of OLTP and OLAP (and the subtle dev differences there)
1:15:30 reverse ETL
1:33:08 Iceberg as the TCP/IP for tables
1:35:00 Marco's thoughts on the "Just Use Postgres" fever

-----------------------------------------
*MARCO*
You can find Marco on:
- LinkedIn: https://www.linkedin.com/in/marcoslot/
- X: https://x.com/marcoslot
- GitHub: https://github.com/marcoslot

-----------------------------------------
*pg_lake*
You can find the project on GitHub:
- https://github.com/snowflake-labs/pg_lake

-----------------------------------------
*TRANSCRIPT*
Feed this into your favorite AI for summarization, or to ask it specific questions:
https://gist.githubusercontent.com/stanislavkozlovski/65c037a8963e49d8121b25003ec94715/raw/4f51f5dcd562b42e8d511b8bc58f0fff6ad5302e/foo.md
(or just send Gemini this video link and ask it)

------------------------------------------
*OTHER PLATFORMS*
Watch on YouTube here: https://youtu.be/Jd0DcX2fO_k
Apple Podcasts: https://podcasts.apple.com/us/podcast/topicpartition/id1814926834
General RSS: https://anchor.fm/s/104fd76e0/podcast/rss

-----------------------------------------
If you found anything useful from this episode, please consider supporting our growth (so we can continue delivering valuable content). You can do this by simply sending it to a friend. It takes 2 seconds to do, and recording/producing this takes us 8hrs+.

1h 40m
Mar 30

why IBM bought Confluent

An insightful, casual conversation I had with a past Confluent colleague of mine - Justin Manchester.

Justin was employee number 47 at Confluent, where he held many roles (as one does in startups), most recently as a Technical Support Engineer diving into customers' problems with Kafka. In this episode, he shares his thoughts about the IBM acquisition through a lens that I haven't seen explored yet - Mainframes and Data Gravity (lock-in).

---------------------------------------
TIMELINE:
0:00 - Mainframe Opening
3:24 - Justin's Background
10:49 - Data Gravity (why mainframe migration fails)
14:37 - Joining Confluent as Employee #47
18:58 - Kafka as a forward cache (RBC bank example)
28:47 - Why is IBM interested in Confluent?
45:09 - How IBM may monetize & pay back its acquisition (bundling)
1:03:53 - AI Washing and the ZIRP Hangover in the tech space
1:18:27 - Confluent's failed bets
1:26:04 - What happens to Confluent post-IBM: Flink, WarpStream, etc
1:34:23 - Why every data company is chasing the next 10x (to their detriment)

---------------------------------------
TRANSCRIPT:
Feed this to your chatbot of choice to get an interactive summary:
https://gist.github.com/stanislavkozlovski/3cd41999066e3316819885f20f7963ae

---------------------------------------
WHERE TO FIND JUSTIN:
LinkedIn: https://www.linkedin.com/in/justin-manchester-48400858/

1h 40m
Jan 23

why Just Use Postgres w/ Denis Magda

**JUST USE POSTGRES**
Denis is the author of the brand new book "Just Use Postgres!". True to its name, it argues that most people do NOT need a collection of specialty databases or systems. Instead, it focuses on how you can fulfill the majority of these use cases with plain old Postgres (and a few extensions).

Over time, PostgreSQL has grown its network effect to become the most powerful general-purpose database in the world, largely due to its rich extension support and active community.

In this super fun conversation, we talk through:
• what MySQL got wrong w.r.t. community and why Postgres succeeded
• the recent (commercial) explosion of Postgres development in the world (Supabase, Neon, Crunchy, ClickHouse)
• when to use Postgres instead of Kafka for messaging
• the significance of the meme “just use postgres”
• Postgres for data analytics

-----------------------------------------
**TIMELINE**
00:00 - intro & Denis' history
04:26 - The Just Use Postgres movement
08:36 - The Small Data trend and its Implications
15:49 - The timely shift from distributed systems to simplicity
17:15 - Updating your priors re: PG capabilities
20:28 - MySQL's OSS community split & our experiences with such things
39:48 - Postgres for analytics
55:49 - PG in the lake house
59:27 - today's AI marketing slop
01:02:19 - How you'd build a brand new app on Postgres
01:05:24 - Speedrun through PG extensions
01:11:56 - PG as a Message Queue

-----------------------------------------
**TRANSCRIPT**
Feed this into your favorite AI for summarization, or to ask it specific questions:
https://gist.github.com/stanislavkozlovski/a8a38f9557486086bae2f3b81a1ad835

-----------------------------------------
**BUY THE BOOK**
Listeners to this show have a special discount of ~40%!
Enter code "2minstrMagda" at checkout to make use of it.
Use it before it expires on March 1st 2026:
✅ https://www.manning.com/books/just-use-postgres

-----------------------------------------
**DENIS**
You can find Denis on:
- LinkedIn: https://www.linkedin.com/in/dmagda/
- X/Twitter: https://x.com/denismagda

1h 31m
Jan 9

tansu - pluggable stateless kafka (written in rust)

Tansu is an open-source (Apache-licensed), stateless Kafka broker that supports pluggable storage engines (PostgreSQL, SQLite, S3 and in-memory). Similar to WarpStream, it is a modern diskless Kafka implementation: it does not store data on its own disks but outsources durability and replication to another storage backend.

Tansu is the only implementation (that I'm aware of) which supports pluggable storage backends, including SQL databases like Postgres and SQLite. Critically, unlike the other alternatives, it does not depend on a coordinator service. The combination of these two properties unlocks a truly stateless Kafka, which allows for some interesting use cases, like:
- instant auto-scalability
- incredibly flexible deployment models (per VPC, AZ, region, etc.)
- embedded Kafka
- dead-easy CI/CD integration
- and more

It is also the only truly open source diskless Kafka implementation. Tansu is made for developers. It is fully Kafka API compatible, but makes certain ergonomic decisions that help it boast better UX than regular Kafka.

3h 30m
05/18/2025

The Ins and Outs of KIP-1150: Diskless Topics in Apache Kafka

An interview with Greg Harris and Ivan Yurchenko, two of the authors of the new KIP-1150 that introduces diskless topics to Apache Kafka. Diskless topics are a special kind of topic that writes directly to an object store. This outsources the data replication and durability guarantees to the store (e.g., S3) and avoids the costly inter-zone data transfer fees that clouds charge you for. The result is an 80%+ reduced Kafka bill and a simpler architecture where brokers are stateless (or have less state) – making them significantly easier to maintain operationally. In this podcast, we dive deep into the technical details and tradeoffs in this complex KIP.

⏱️ TIMELINE
04:00 – Why Diskless?
09:17 – Architecture Walkthrough: How Does It Work
18:50 – The Batch Coordinator
22:00 – The Different Batch Coordinator Implementations
28:00 – The Inkless Fork (And Why)
39:40 – Topic-Based Batch Coordinator
45:40 – Bottlenecks in the Batch Coordinator?
47:48 – The Read Path
54:40 – Caching
01:03:00 – Shared Log Segment Merging (AKA Object Compaction)
01:19:06 – Latency
01:26:30 – Practical Real-World Latency Requirements
01:39:00 – The Power of Open Source
01:47:30 – Cost Breakdown
01:58:00 – Ops and Engineering Time Savings
02:06:30 – Client Bootstrapping, Leaderless Partitions, One Broker Connection per Client
02:11:20 – Multi-Tenant Kafka Future
02:14:05 – Iceberg, Parquet, and First-Class Schema Support
02:23:10 – S3 Express

If you found this episode interesting, please give it a like to signal to the algorithm that it's good. It takes 2 seconds to do, and it took us 5 hours to produce.

💬 HOW TO GET INVOLVED
We recommend participating in the Apache Kafka mailing list. Beyond that, you can find Greg and Ivan here:
Ivan Yurchenko – X: @ivan0yu
Greg Harris – X: @gharris1727

⚠️ DISCLAIMERS
All views expressed in this podcast are personal opinions of the participants. They do not represent the views or positions of Aiven, the employer of Ivan and Greg, or any other entity.
Apache®, Apache Kafka®, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

2h 27m

7 Episodes

An in-depth engineering podcast about Apache Kafka

Creator: 2minutestreaming
Years Active: 2025 - 2026
Episodes: 7
Rating: Clean
Copyright: © 2minutestreaming
Show Website: TopicPartition