Weaviate Podcast

Weaviate

Join Connor Shorten as he interviews machine learning experts and explores Weaviate use cases from users and customers.

  1. 13H AGO

    Search Agents with Nandan Thakur - Weaviate Podcast #137!

    Dr. Nandan Thakur returns to the Weaviate Podcast fresh off defending his dissertation to discuss the evolution from neural retrieval to agentic search and his new work on Orbit, a synthetic training data pipeline for search agents. The conversation opens with reflections on his PhD journey, tracing the field's shift from ColBERT-style models and sparse retrievers through RAG and into today's agentic search paradigm, where LLMs iteratively search, reason, and refine.

    The discussion dives deep into how Orbit generates multi-hop, riddle-style training queries using DeepSeek's API on a personal laptop over four to six months, making high-quality search agent training data accessible without massive compute budgets. Thakur draws a sharp distinction between deep research (broad, multi-tool report generation) and search agents (focused on search and browse tools to answer specific questions), then connects Orbit's multi-hop queries to BrowseComp's filter-style riddles, where each clue narrows the answer space like a funnel. The conversation explores the design of deep research harnesses, chunking strategies, Anthropic's contextual retrieval for entity disambiguation, context compaction to manage bloated agent contexts, and memory services like Weaviate's Engram for compressing search results between reasoning rounds.

    From there, the episode tackles sequential versus parallel search trajectories, the pass@K approach to rollouts in GRPO training, and whether isolated trajectories should share progress through message passing. Thakur makes a compelling case for training search agents to produce keyword-focused queries optimized for BM25 versus semantic queries for dense retrieval: the idea that one query does not fit all search engines. The conversation closes on future directions: efficiency-focused Pareto frontiers for search agents, long-form report generation evaluation through TREC RAG, and the coming wave of multilingual and multimodal search benchmarks.

    1h 1m
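The pass@K rollout idea mentioned in the episode can be made concrete with the standard unbiased pass@k estimator: draw n rollouts per query, count the c correct ones, and estimate the probability that at least one of k sampled rollouts succeeds. A minimal sketch (the function name and the numbers are illustrative, not from the episode):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn (without replacement) from n rollouts, c of which
    are correct, is a correct one."""
    if n - c < k:
        # Fewer than k incorrect rollouts exist, so every k-sample
        # must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 16 rollouts per query, 4 of them correct, evaluated at k=4
print(pass_at_k(16, 4, 4))
```

The complement trick (probability that all k samples are incorrect) avoids the numerical issues of summing over hypergeometric terms directly.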
  2. APR 27

    AgentIR with Zijian Chen and Xueguang Ma - Weaviate Podcast #136!

    Zijian Chen and Xueguang Ma from the University of Waterloo join the Weaviate Podcast to discuss AgentIR and why retrieval systems need to be redesigned from the ground up for AI agents. The conversation opens with a striking reframe: agents have become the primary consumers of search, inserting themselves as middleware between humans and information. Humans used to query search engines directly; now they delegate to ChatGPT, which searches on their behalf. This means retrieval algorithms are no longer optimized for their actual users.

    The discussion distinguishes reasoning-intensive retrieval from reasoning-aware retrieval. Reasoning-intensive tasks like BRIGHT involve single-hop queries where the connection between query and document is obscure but still one step. AgentIR tackles a fundamentally different problem: extremely multi-hop queries from benchmarks like BrowseComp-Plus, where each hop strictly depends on the previous one. The key insight behind AgentIR is that agents reveal their entire reasoning process in their reasoning traces, unlike humans, who never write out their thought process. Existing retrievers discard this rich signal entirely. AgentIR jointly embeds the query and reasoning trace, training a retriever from scratch to exploit this agent-specific context.

    From there, the conversation covers BrowseComp-Plus, which extends OpenAI's BrowseComp with a fixed corpus to enable disentangled evaluation of agents and retrievers separately, something impossible when both the web and the search provider are black boxes. Building the corpus required over 400 hours of human annotation to ensure every hop in every reasoning chain had its supporting documents present. The discussion then moves into agent context management, contrasting compaction approaches with just-in-time memory retrieval from paged memory, referencing InfoFlow and the AgentFold paper.

    Xueguang shares a provocative take that neither single-vector nor multi-vector representations are optimal, arguing the field needs embeddings at the right granularity based on information density. The episode closes with Steven introducing AICI, Agent-Computer Interaction, as the successor to HCI, and Xueguang framing the open question of scaling search along two dimensions: deeper (more turns) versus wider (more parallel queries).

    1h 3m
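The episode's central idea, that agents expose their full reasoning trace and retrievers should exploit it, can be illustrated at the input-formatting level. AgentIR trains a retriever that jointly embeds query and trace; the exact input template is not specified in the episode, so the sketch below is a hypothetical illustration of how a trace-conditioned retriever input might be assembled before embedding:

```python
def build_retriever_input(query: str, reasoning_trace: list[str],
                          max_trace_chars: int = 2000) -> str:
    """Join the agent's reasoning steps with the current query into a
    single retriever input. When the trace must be truncated, keep the
    most recent reasoning, since it is most relevant to the next hop."""
    trace = "\n".join(reasoning_trace)
    if len(trace) > max_trace_chars:
        trace = trace[-max_trace_chars:]
    return f"Reasoning so far:\n{trace}\n\nQuery: {query}"

print(build_retriever_input(
    "Which city hosted the event?",
    ["The athlete was born in Hungary.",
     "She competed at the 1956 Summer Olympics."]))
```

A standard query-only retriever would see just the final question; feeding the trace as well is what lets each hop depend on what the agent has already established.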
  3. APR 6

    Data Agents with Shreya Shankar - Weaviate Podcast #135!

    Shreya Shankar from UC Berkeley joins the Weaviate Podcast to discuss data agents, the Data Agent Benchmark, and DocETL. The conversation opens by defining what a data agent actually is: not just text-to-SQL over a single table, but an AI system that can reason across dozens of heterogeneous databases, flat files, and knowledge repositories to answer complex organizational questions. Shreya explains why this multi-database reality makes existing benchmarks insufficient, motivating the Data Agent Benchmark, where the best-performing agent achieves only 34–37% pass@1 accuracy.

    From there, the discussion dives into where agents fail. They don't explore data properly, they generate broken regex patterns, they struggle with different SQL dialects, and they give up when datasets get large. Interestingly, agents tend to pull data into Pandas rather than use database operators directly, likely because LLMs are more fluent in Python than in the nuances of each SQL dialect. The conversation moves into semantic operators, natural-language variants of the relational algebra operations (filter, map, join, aggregation), where predicates like "Is this a sports article?" replace handwritten regex, with implementations ranging from per-row LLM calls to synthesized code.

    Shreya then presents DocETL, a declarative system for processing unstructured data that uses LLM agents to propose query rewrite strategies like chunking, splitting, and map-then-reduce decompositions, optimizing for both accuracy and cost on long documents. This leads into a broader discussion of declarative versus imperative agent design and the tradeoff between letting agents write arbitrary Python and constraining them within frameworks that handle optimization and caching. The conversation also explores tribal knowledge, structuring learned facts about data quality into retrievable tables so agents can reuse discoveries across queries, and connects to recent work on using LLMs to discover new database query rewrite rules.
    The episode closes with a reflection on how classical database principles like query optimization and cardinality estimation are finding new life in the age of LLM-powered data systems.

    Chapters
    0:05 What are Data Agents?
    2:10 Multi-Database Systems
    9:44 Semantic Operators
    13:18 Querying Databases with Python
    17:05 DocETL
    24:34 Advanced Text-to-SQL
    29:30 Claude Code and Databases
    34:34 Self-Driving Databases
    42:00 Agent Memory for Querying Databases
    53:48 Exciting Directions for AI

    57 min
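The per-row semantic filter operator discussed in the episode, a natural-language predicate replacing a handwritten regex, can be sketched in a few lines. The LLM call is stubbed out here so the example is self-contained; in practice it would be a real model call, and systems like DocETL layer cost/accuracy optimization on top of this naive per-row strategy:

```python
from typing import Callable

def semantic_filter(rows: list[dict], predicate: str,
                    llm: Callable[[str], str],
                    text_key: str = "text") -> list[dict]:
    """Semantic filter operator: keep rows for which the LLM answers
    'yes' to a natural-language predicate (one LLM call per row, the
    simplest implementation mentioned in the episode)."""
    kept = []
    for row in rows:
        prompt = f"{predicate}\n\nText: {row[text_key]}\nAnswer yes or no."
        if llm(prompt).strip().lower().startswith("yes"):
            kept.append(row)
    return kept

# Stub "LLM" so the sketch runs without an API key (purely illustrative)
def stub_llm(prompt: str) -> str:
    return "yes" if "goal" in prompt.lower() else "no"

articles = [{"text": "Late goal wins the derby"},
            {"text": "Rates held steady"}]
print(semantic_filter(articles, "Is this a sports article?", stub_llm))
```

The alternative implementation the episode mentions, synthesizing code once instead of calling the model per row, trades the stub above for a generated classifier function.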
  4. MAR 23

    Multi-Vector Search with Amélie Chatelain and Antoine Chaffin - Weaviate Podcast #134!

    Amélie Chatelain and Antoine Chaffin from LightOn are leading the next generation of search powered by multi-vector representations and Late Interaction. The podcast begins with what motivates them to work on multi-vector search, then moves into particulars such as how it combines lexical and semantic search, pairing bi-encoder speed with cross-encoder accuracy. The discussion continues with insights on training multi-vector models and how they differ from their single-vector predecessors.

    The conversation then turns to particular successes of Late Interaction, such as code, reasoning-intensive, and multimodal retrieval. Agents are great at searching with grep, but they are even better with ColGrep! Reasoning-intensive retrieval is a step change in how we think about search systems, beautifully enabled by both Late Interaction models and agentic search. Multimodal search, such as matching text with videos, is also seeing massive benefits from multi-vector representations.

    The podcast then dives into the cost of MaxSim and how efficient methods such as MUVERA and PLAID can help, and concludes with a presentation of their recent work on ColBERT-Zero: pre-training with Late Interaction instead of as a single-vector dense embedding model. LightOn are also the developers of PyLate, the world's leading open-source library for training these kinds of models.

    Chapters
    0:00 An Introduction to Multi-Vector Search
    6:00 Multi- vs. Single-Vector
    8:55 Comparison with Cross Encoders
    15:55 ColGrep for Coding Agents
    30:34 Reasoning-Intensive Retrieval
    42:02 Multimodal Multi-Vector
    48:34 The Cost of Multi-Vector
    53:26 MUVERA and PLAID
    1:06:18 ColBERT-Zero and PyLate

    1h 21m
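The MaxSim operation whose cost the episode examines is simple to state: for each query token embedding, take the maximum similarity over all document token embeddings, then sum over query tokens. A dependency-free sketch with toy 2-d vectors (real ColBERT-style models use ~128-d normalized token embeddings, which is exactly why MUVERA and PLAID exist):

```python
def maxsim_score(query_vecs: list[list[float]],
                 doc_vecs: list[list[float]]) -> float:
    """ColBERT-style late interaction: for each query token vector,
    take its maximum dot product over all document token vectors,
    then sum across query tokens."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

q = [[1.0, 0.0], [0.0, 1.0]]               # two query token embeddings
d = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]   # three document token embeddings
print(maxsim_score(q, d))  # 0.9 + 0.8, up to float rounding
```

The cost is the full query-tokens × document-tokens similarity matrix per candidate document, which is what makes single-vector scoring so much cheaper at retrieval time.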
  5. MAR 1

    AI-Powered Search with Doug Turnbull and Trey Grainger [#133]

    Doug Turnbull and Trey Grainger join the Weaviate Podcast to discuss all things AI-Powered Search! The conversation kicks off with designing search experiences: not all search queries are the same! Sometimes the user knows exactly what they want (a product ID, a specific file), other times they're exploring a broad category, and other times they need to compare and contrast options. AI is now making it possible to dynamically construct UIs around search results, moving toward what Trey describes as a "Minority Report"-style future where visualizations adapt on the fly to the query and the data.

    From there, the discussion dives into query understanding and domain modeling. Doug and Trey break down how LLMs can classify queries against existing taxonomies (like NAICS codes or Google's product taxonomy), while Trey explains a multi-tier RAG approach, using the index itself as grounding for query interpretation before executing the final retrieval. The conversation moves into agentic search, exploring whether iterative LLM-driven search loops reduce the need for ever-better embedding models, or whether simple tools like BM25 and grep are sufficient when paired with strong reasoning.

    Trey introduces wormhole vectors, a technique for traversing between sparse (lexical) and dense (semantic) vector spaces by treating query results as document sets with shared meaning, enabling exploration across vector spaces rather than treating them as orthogonal.
    The discussion also covers reflected intelligence, the idea of making search systems self-learning by mining user behavioral signals (clicks, purchases, skipped results) to continuously improve relevance through techniques like signals boosting, collaborative filtering, and learning to rank. The episode wraps with a conversation about how coding agents are changing the way Doug and Trey work, and Trey's philosophy of designing intentional agentic workflows with atomic agents rather than just handing an LLM a bag of tools.

    AI-Powered Search (discount code "weaviate"): https://aipoweredsearch.com/live-course?promoCode=weaviate

    53 min
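Of the reflected-intelligence techniques mentioned, signals boosting is the easiest to sketch: aggregate historical clicks per (query, document) pair and turn frequently clicked documents into boost weights applied when the same query recurs. Names, data, and the click threshold below are illustrative, not from the book or the episode:

```python
from collections import Counter

def signals_boosts(click_log: list[tuple[str, str]],
                   min_clicks: int = 2) -> dict[str, dict[str, int]]:
    """Signals boosting: count historical clicks per (query, doc) pair
    and keep documents clicked at least min_clicks times as boost
    weights for future instances of the same (normalized) query."""
    counts = Counter((query.lower(), doc) for query, doc in click_log)
    boosts: dict[str, dict[str, int]] = {}
    for (query, doc), n in counts.items():
        if n >= min_clicks:
            boosts.setdefault(query, {})[doc] = n
    return boosts

log = [("iphone case", "sku42"),
       ("iPhone case", "sku42"),
       ("iphone case", "sku7")]
print(signals_boosts(log))  # {'iphone case': {'sku42': 2}}
```

In a production system these weights would feed a boost clause in the search engine query (and decay over time); the aggregation step itself is this simple.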
