Vector Podcast

Dmitry Kan

Vector Podcast is here to bring you the depth and breadth of Search Engine Technology, Product, Marketing and Business. In the podcast we talk with engineers, entrepreneurs, thinkers and tinkerers who put their soul into search. Depending on your interest, you should find a matching topic -- whether it is the deep algorithmic aspects of search engines and the information retrieval field, or examples of products offering deep tech to their users. "Vector" -- because it aims to cover the emerging field of vector similarity search, giving you the ability to search content beyond text: audio, video, images and more. "Vector" also because it is all about the vector in your profession, product, marketing and business. Podcast website: https://www.vectorpodcast.com/ Dmitry blogs at https://dmitry-kan.medium.com/

  1. Trey Grainger - Wormhole Vectors

    11/07/2025

    This lightning session introduces a new idea in vector search: wormhole vectors. The idea has deep roots in physics and makes it possible to traverse between spaces of any nature: sparse, vector and behaviour (but could theoretically be any N-dimensional space).

    Blog post on Medium: https://dmitry-kan.medium.com/novel-idea-in-vector-search-wormhole-vectors-6093910593b8
    Session page on maven: https://maven.com/p/8c7de9/beyond-hybrid-search-with-wormhole-vectors?utm_campaign=NzI2NzIx&utm_medium=ll_share_link&utm_source=instructor
    To try the managed OpenSearch (multi-cloud, automatic backups, disaster recovery, vector search and more), go here: https://console.aiven.io/signup?utm_source=youtube&utm_medium&&utm_content=vectorpodcast
    Get credits to use Aiven's products (PG, Kafka, Valkey, OpenSearch, ClickHouse): https://aiven.io/startups

    Timecodes:
    00:00 Intro by Dmitry
    01:48 Trey's presentation
    03:05 Walk to the AI-Powered Search course by Trey and Doug
    07:07 Intro to vector spaces and embeddings
    19:03 Disjoint vector spaces and the need for hybrid search
    23:11 Different modes of search
    24:49 Wormhole vectors
    47:49 Q&A

    What you'll learn:
    - What are "Wormhole Vectors"? Learn how wormhole vectors work & how to use them to traverse between disparate vector spaces for better hybrid search.
    - Building a behavioral vector space from click stream data: learn to generate behavioral embeddings to be integrated with dense/semantic and sparse/lexical vector queries.
    - Traverse lexical, semantic, & behavioral vector spaces: jump back and forth between multiple dense and sparse vector spaces in the same query.
    - Advanced hybrid search techniques (beyond fusion algorithms): hybrid search is more than mixing lexical + semantic search. See advanced techniques and where wormhole vectors fit in.

    YouTube: https://www.youtube.com/watch?v=fvDC7nK-_C0

    1h 19m
  2. Economical way of serving vector search workloads with Simon Eskildsen, CEO Turbopuffer

    09/19/2025

    The Turbopuffer search engine supports products such as Cursor, Notion, Linear, Superhuman and Readwise.

    This episode on YouTube: https://youtu.be/I8Ztqajighg
    If you are on the Lucene / OpenSearch stack, you can go managed by signing up here: https://console.aiven.io/signup?utm_source=youtube&utm_medium=&&utm_content=vectorpodcast

    Time codes:
    00:00 Intro
    00:15 Napkin Problem 4: Throughput of Redis
    01:35 Episode intro
    02:45 Simon's background, including implementation of Turbopuffer
    09:23 How Cursor became an early client
    11:25 How to test pre-launch
    14:38 Why a new vector DB deserves to exist?
    20:39 Latency aspect
    26:27 Implementation language for Turbopuffer
    28:11 Impact of LLM coding tools on programmer craft
    30:02 Engineer 2 CEO transition
    35:10 Architecture of Turbopuffer
    43:25 Disk vs S3 latency, NVMe disks, DRAM
    48:27 Multitenancy
    50:29 Recall@N benchmarking
    59:38 Filtered ANN and Big-ANN Benchmarks
    1:00:54 What users care about more (than Recall@N benchmarking)
    1:01:28 Spicy question about benchmarking in competition
    1:06:01 Interesting challenges ahead to tackle
    1:10:13 Simon's announcement

    Show notes:
    - Turbopuffer in Cursor: https://www.youtube.com/watch?v=oFfVt3S51T4&t=5223s (transcript: https://lexfridman.com/cursor-team-transcript)
    - https://turbopuffer.com/
    - Napkin Math: https://sirupsen.com/napkin
    - Follow Simon on X: https://x.com/Sirupsen
    - Not All Vector Databases Are Made Equal: https://towardsdatascience.com/milvus-pinecone-vespa-weaviate-vald-gsi-what-unites-these-buzz-words-and-what-makes-each-9c65a3bd0696/

    1h 15m
  3. Adding ML layer to Search: Hybrid Search Optimizer with Daniel Wrigley and Eric Pugh

    03/21/2025

    Vector Podcast website: https://vectorpodcast.com
    Haystack US 2025: https://haystackconf.com/2025/

    Federated search, Keyword & Neural Search, ML Optimisation, Pros and Cons of Hybrid search

    It is fascinating and funny how things develop, but also turn around. In 2022-23 everyone was buzzing about hybrid search. In 2024 the conversation shifted to RAG, RAG, RAG. And now we are in 2025 and back to hybrid search, on a different level: finally there are strides and contributions towards making hybrid search parameters learnt with ML. How cool is that?

    Design: Saurabh Rai, https://www.linkedin.com/in/srbhr/
    The design of this episode is inspired by a scene in Blade Runner 2049. There's a clear path leading to where people want to go, yet they're searching for something.

    00:00 Intro
    00:54 Eric's intro and Daniel's background
    02:50 Importance of Hybrid search: Daniel's take
    07:26 Eric's take
    10:57 Dmitry's take
    11:41 Eric's predictions
    13:47 Doug's blog on RRF is not enough
    16:18 How to not fall short of the blind picking in RRF: score normalization, combinations and weights
    25:03 The role of query understanding: feature groups
    35:11 Lesson 1 from Daniel: Simple models might be all you need
    36:30 Lesson 2: query features might be all you need
    38:30 Reasoning capabilities in search
    40:02 Question from Eric: how is this different from Learning To Rank?
    42:46 Carrying the past in Learning To Rank / any rank
    44:21 Demo!
    51:52 How to consume this in OpenSearch
    55:15 What's next
    58:44 Haystack US 2025

    YouTube: https://www.youtube.com/watch?v=quY769om1EY

    1h 3m
  4. Code search, Copilot, LLM prompting with empathy and Artifacts with John Berryman

    02/10/2025

    Vector Podcast website: https://vectorpodcast.com
    Get your copy of John's new book "Prompt Engineering for LLMs: The Art and Science of Building Large Language Model–Based Applications": https://amzn.to/4fMj2Ef

    John Berryman is the founder and principal consultant of Arcturus Labs, where he specializes in AI application development (Agency and RAG). As an early engineer on GitHub Copilot, John contributed to the development of its completions and chat functionalities, working at the forefront of AI-assisted coding tools. John is coauthor of "Prompt Engineering for LLMs" (O'Reilly).

    Before his work on Copilot, John's focus was search technology. His diverse experience includes helping to develop a next-generation search system for the US Patent Office, building search and recommendations for Eventbrite, and contributing to GitHub's code search infrastructure. John is also coauthor of "Relevant Search" (Manning), a book that distills his expertise in the field.

    John's unique background, spanning both cutting-edge AI applications and foundational search technologies, positions him at the forefront of innovation in LLM applications and information retrieval.

    00:00 Intro
    02:19 John's background and story in search and ML
    06:03 Is RAG just a prompt engineering technique?
    10:15 John's progression from a search engineer to ML researcher
    13:40 LLM predictability vs more traditional programming
    22:31 Code assist with GitHub Copilot
    29:44 Role of keyword search for code at GitHub
    35:01 GenAI: existential risk or pure magic? AI Natives
    39:40 What are Artifacts
    46:59 Demo!
    55:13 Typed artifacts, tools, accordion artifacts
    56:21 From Web 2.0 to Idea exchange
    57:51 Spam will transform into Slop
    58:56 John's new book and Arcturus Labs intro

    Show notes:
    - John Berryman on X: https://x.com/JnBrymn
    - Arcturus Labs: https://arcturus-labs.com/
    - John's blog on Artifacts (see demo in the episode): https://arcturus-labs.com/blog/2024/11/11/cut-the-chit-chat-with-artifacts/

    YouTube: https://youtu.be/60HAtHVBYj8

    1h 7m

Ratings & Reviews

5 out of 5 (2 Ratings)
