Neural Search Talks — Zeta Alpha
A monthly podcast where we discuss recent research and developments in the world of Neural Search, LLMs, RAG, and Natural Language Processing with our co-hosts Jakub Zavrel (AI veteran and founder of Zeta Alpha) and Dinos Papakostas (AI Researcher at Zeta Alpha).
Baking the Future of Information Retrieval Models
In this episode of Neural Search Talks, we're chatting with Aamir Shakir from Mixed Bread AI, who shares his insights on starting a company that aims to make search smarter with AI. He details their approach to overcoming challenges in embedding models, touching on the significance of data diversity, novel loss functions, and the future of multilingual and multimodal capabilities, and walks us through their journey, the ups and downs, and what they're excited about for the future.
Timestamps:
0:00 Introduction
0:25 How did mixedbread.ai start?
2:16 The story behind the company name and its "bakers"
4:25 What makes Berlin a great talent pool for AI
6:12 Building as a GPU-poor team
7:05 The recipe behind mxbai-embed-large-v1
9:56 The AnglE objective for embedding models
15:00 Going beyond Matryoshka with mxbai-embed-2d-large-v1
17:45 Supporting binary embeddings & quantization
19:07 Collecting large-scale data is key for robust embedding models
21:50 The importance of multilingual and multimodal models for IR
24:07 Where will mixedbread.ai be in 12 months?
26:46 Outro
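For listeners curious about the binary embeddings and quantization discussed at 17:45, here is a minimal sketch of the general idea: keep only the sign of each embedding dimension (1 bit instead of 32), then compare vectors by counting agreeing bits. This is illustrative code, not mixedbread.ai's actual implementation, and the function names are ours.

```python
import numpy as np

def binarize(embeddings):
    # Quantize float embeddings to 1 bit per dimension by keeping only the sign.
    # This shrinks storage ~32x at a modest cost in retrieval quality.
    return (np.asarray(embeddings) > 0).astype(np.uint8)

def hamming_similarity(a, b):
    # Fraction of bit positions that agree; a cheap stand-in for cosine
    # similarity that needs only XOR/popcount-style operations.
    return float(np.mean(np.asarray(a) == np.asarray(b)))

# Example: two nearby float vectors binarize to identical bit patterns
q = binarize([0.3, -0.7, 0.1, -0.2])   # -> [1, 0, 1, 0]
d = binarize([0.5, -0.1, 0.9, -0.8])   # -> [1, 0, 1, 0]
score = hamming_similarity(q, d)        # -> 1.0
```

In practice the binarized vectors are packed 8 dimensions per byte and compared with hardware popcount, which is what makes binary search indexes so fast.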
Hacking JIT Assembly to Build Exascale AI Infrastructure
Ash Vardanian shares his journey from software development to pioneering work in AI infrastructure with Unum. He discusses Unum's focus on unleashing the full potential of modern computers for AI, search, and database applications through efficient data processing and infrastructure. Highlighting Unum's technical achievements, including its use of SIMD instructions and just-in-time compilation, Ash also touches on the future of computing and his vision for Unum to contribute to advances in personalized medicine and extending human productivity.
Timestamps:
0:00 Introduction
0:44 How did Unum start and what is it about?
6:12 Differentiating from the competition in vector search
17:45 Supporting modern features like large dimensions & binary embeddings
27:49 Upcoming model releases from Unum
30:00 The future of hardware for AI
34:56 The impact of AI in society
37:35 Outro
The Promise of Language Models for Search: Generative Information Retrieval
In this episode of Neural Search Talks, Andrew Yates (Assistant Professor at the University of Amsterdam), Sergi Castella (Analyst at Zeta Alpha), and Gabriel Bénédict (PhD student at the University of Amsterdam) discuss the prospect of using GPT-like models as a replacement for conventional search engines.
Generative Information Retrieval (Gen IR) SIGIR Workshop
Workshop organized by Gabriel Bénédict, Ruqing Zhang, and Donald Metzler https://coda.io/@sigir/gen-ir
Resources on Gen IR: https://github.com/gabriben/awesome-generative-information-retrieval
References
Rethinking Search: https://arxiv.org/abs/2105.02274
Survey on Augmented Language Models: https://arxiv.org/abs/2302.07842
Differentiable Search Index: https://arxiv.org/abs/2202.06991
Recommender Systems with Generative Retrieval: https://shashankrajput.github.io/Generative.pdf
Timestamps:
00:00 Introduction, ChatGPT Plugins
02:01 ChatGPT plugins, LangChain
04:37 What even is Information Retrieval?
06:14 Index-centric vs. model-centric Retrieval
12:22 Generative Information Retrieval (Gen IR)
21:34 Gen IR emerging applications
24:19 How Retrieval Augmented LMs incorporate external knowledge
29:19 What is hallucination?
35:04 Factuality and Faithfulness
41:04 Evaluating generation of Language Models
47:44 Do we even need to "measure" performance?
54:07 How would you evaluate Bing's Sydney?
57:22 Will language models take over commercial search?
1:01:44 NLP academic research in the times of GPT-4
1:06:59 Outro
Task-aware Retrieval with Instructions
Andrew Yates (Assistant Professor at the University of Amsterdam) and Sergi Castella (Analyst at Zeta Alpha) discuss the paper "Task-aware Retrieval with Instructions" by Akari Asai et al. The paper proposes augmenting a conglomerate of existing retrieval and NLP datasets with natural language instructions (BERRI, Bank of Explicit RetRieval Instructions) and using it to train TART (Multi-task Instructed Retriever).
📄 Paper: https://arxiv.org/abs/2211.09260
🍻 BEIR benchmark: https://arxiv.org/abs/2104.08663
📈 LOTTE (Long-Tail Topic-stratified Evaluation, introduced in ColBERT v2): https://arxiv.org/abs/2112.01488
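The core mechanism of retrieval with instructions is simple: a task description in natural language is prepended to the query before encoding, so a single retriever can serve many tasks. A rough sketch of the idea (the separator and instruction wording here are our assumptions, not the paper's exact format):

```python
def build_instructed_query(instruction: str, query: str) -> str:
    # TART-style input: the task instruction is prepended to the query text,
    # and the concatenation is what gets encoded by the retriever.
    # The "[SEP]" separator is illustrative, not the paper's exact template.
    return f"{instruction} [SEP] {query}"

# The same query expresses different retrieval intents under different instructions:
q1 = build_instructed_query(
    "Retrieve a Wikipedia paragraph that answers the question.",
    "When was YouTube founded?")
q2 = build_instructed_query(
    "Retrieve a news headline about the topic.",
    "When was YouTube founded?")
```

Because the instruction changes the encoded representation, the two variants above can retrieve different documents for an identical user query.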
Timestamps:
00:00 Intro: "Task-aware Retrieval with Instructions"
02:20 BERRI, TART, X^2 evaluation
04:00 Background: recent works in domain adaptation
06:50 Instruction Tuning
08:50 Retrieval with descriptions
11:30 Retrieval with instructions
17:28 BERRI, Bank of Explicit RetRieval Instructions
21:48 Repurposing NLP tasks as retrieval tasks
23:53 Negative document selection
27:47 TART, Multi-task Instructed Retriever
31:50 Evaluation: Zero-shot and X^2 evaluation
39:20 Results on Table 3 (BEIR, LOTTE)
50:30 Results on Table 4 (X^2-Retrieval)
55:50 Ablations
57:17 Discussion: user modeling, future work, scale
Generating Training Data with Large Language Models w/ Special Guest Marzieh Fadaee
Marzieh Fadaee (NLP Research Lead at Zeta Alpha) joins Andrew Yates and Sergi Castella to chat about her work using large language models like GPT-3 to generate domain-specific training data for retrieval models with little-to-no human input. The two papers discussed are "InPars: Data Augmentation for Information Retrieval using Large Language Models" and "Promptagator: Few-shot Dense Retrieval From 8 Examples".
InPars: https://arxiv.org/abs/2202.05144
Promptagator: https://arxiv.org/abs/2209.11755
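Both papers follow the same basic recipe: show an LLM a few (document, query) examples, append a new document, and treat the generated query as a synthetic positive for training a retriever. A minimal sketch of that prompt construction (the wording is illustrative, not either paper's exact template, and `generate` stands in for any LLM completion call):

```python
# Few-shot examples pairing a document with a relevant query, as in
# InPars/Promptagator-style prompting (example content is ours).
FEW_SHOT_EXAMPLES = [
    ("The Manhattan Project was a WWII research effort that produced "
     "the first nuclear weapons.",
     "what was the manhattan project"),
]

def build_prompt(document: str) -> str:
    # Concatenate the few-shot pairs, then the new document with an
    # open-ended "Relevant query:" slot for the LLM to complete.
    lines = []
    for doc, query in FEW_SHOT_EXAMPLES:
        lines.append(f"Document: {doc}\nRelevant query: {query}\n")
    lines.append(f"Document: {document}\nRelevant query:")
    return "\n".join(lines)

def make_training_pair(document: str, generate) -> tuple[str, str]:
    # `generate` is any callable wrapping an LLM completion endpoint.
    # The generated query and the source document form a positive pair;
    # the papers additionally filter low-quality generations before training.
    query = generate(build_prompt(document)).strip()
    return (query, document)
```

The papers differ mainly in what happens after this step: InPars filters generations by LLM score and trains a reranker, while Promptagator round-trip-filters with the retriever itself, a detail the episode digs into from 48:40.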
Timestamps:
00:00 Introduction
02:00 Background and journey of Marzieh Fadaee
03:10 Challenges of leveraging Large LMs in Information Retrieval
05:20 InPars, motivation and method
14:30 Vanilla vs GBQ prompting
24:40 Evaluation and Benchmark
26:30 Baselines
27:40 Main results and takeaways (Table 1, InPars)
35:40 Ablations: prompting, in-domain vs. MSMARCO input documents
40:40 Promptagator overview and main differences with InPars
48:40 Retriever training and filtering in Promptagator
54:37 Main Results (Table 2, Promptagator)
1:02:30 Ablations on consistency filtering (Figure 2, Promptagator)
1:07:39 Is this the magic black-box pipeline for neural retrieval on any documents?
1:11:14 Limitations of using LMs for synthetic data
1:13:00 Future directions for this line of research
ColBERT + ColBERTv2: late interaction at a reasonable inference cost
Andrew Yates (Assistant Professor at the University of Amsterdam) and Sergi Castella (Analyst at Zeta Alpha) discuss the two influential papers introducing ColBERT (from 2020) and ColBERTv2 (from 2022), which propose a fast late-interaction operation that achieves performance close to full cross-encoders at a much more manageable inference cost, along with many other optimizations.
📄 ColBERT: "ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT" by Omar Khattab and Matei Zaharia. https://arxiv.org/abs/2004.12832
📄 ColBERTv2: "ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction" by Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia. https://arxiv.org/abs/2112.01488
📄 PLAID: "An Efficient Engine for Late Interaction Retrieval" by Keshav Santhanam, Omar Khattab, Christopher Potts, and Matei Zaharia. https://arxiv.org/abs/2205.09707
📄 CEDR: "CEDR: Contextualized Embeddings for Document Ranking" by Sean MacAvaney, Andrew Yates, Arman Cohan, and Nazli Goharian. https://arxiv.org/abs/1904.07094
🪃 Feedback form: https://scastella.typeform.com/to/rg7a5GfJ
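The late-interaction scoring at the heart of ColBERT can be sketched in a few lines of NumPy. This is a simplification for intuition only: the real model encodes tokens with BERT, pads queries with [MASK] tokens, and serves results from an optimized index.

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """ColBERT's MaxSim late interaction: for each query token embedding,
    take its maximum cosine similarity over all document token embeddings,
    then sum these maxima into a single relevance score."""
    # L2-normalize token embeddings so dot products are cosine similarities
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T                        # (n_query_tokens, n_doc_tokens)
    return float(sim.max(axis=1).sum())  # best match per query token, summed

# Toy example: two query tokens, each perfectly matched by some doc token
query = np.eye(2)                                         # 2 tokens, dim 2
doc = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])      # 3 tokens, dim 2
score = maxsim_score(query, doc)                          # -> 2.0
```

Because documents are encoded offline and only this cheap max-then-sum runs at query time, ColBERT sits between single-vector bi-encoders and full cross-encoders in both cost and quality, which is the trade-off the episode unpacks from 03:34.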
Timestamps:
00:00 Introduction
00:42 Why ColBERT?
03:34 Retrieval paradigms recap
08:04 ColBERT query formulation and architecture
09:04 Using ColBERT as a reranker or as an end-to-end retriever
11:28 Space Footprint vs. MRR on MS MARCO
12:24 Methodology: datasets and negative sampling
14:37 Terminology for cross encoders, interaction-based models, etc.
16:12 Results (ColBERT v1) on MS MARCO
18:41 Ablations on model components
20:34 Max pooling vs. mean pooling
22:54 Why did ColBERT have a big impact?
26:31 ColBERTv2: knowledge distillation
29:34 ColBERTv2: indexing improvements
33:59 Effects of clustering compression in performance
35:19 Results (ColBERT v2): MS MARCO
38:54 Results (ColBERT v2): BEIR
41:27 Takeaway: especially strong in out-of-domain evaluation
43:59 Qualitatively, what do ColBERT scores look like?
46:21 What's the most promising of all current neural IR paradigms?
49:34 How come there's still so much interest in Dense retrieval?
51:09 Many-to-many similarity at different granularities
53:44 What would ColBERT v3 include?
56:39 PLAID: An Efficient Engine for Late Interaction Retrieval
Contact: castella@zeta-alpha.com