13 episodes

A monthly podcast where we discuss recent research and developments in the world of Neural Search, LLMs, RAG and Natural Language Processing with our co-hosts Jakub Zavrel (AI veteran and founder at Zeta Alpha) and Dinos Papakostas (AI Researcher at Zeta Alpha).

Neural Search Talks — Zeta Alpha

    • Technology

    Baking the Future of Information Retrieval Models

    In this episode of Neural Search Talks, we're chatting with Aamir Shakir from Mixed Bread AI, who shares his insights on starting a company that aims to make search smarter with AI. He details their approach to overcoming challenges in embedding models, touching on the significance of data diversity, novel loss functions, and the future of multilingual and multimodal capabilities. We also hear about their journey, its ups and downs, and what they're excited about for the future.



    Timestamps:
    0:00 Introduction
    0:25 How did mixedbread.ai start?
    2:16 The story behind the company name and its "bakers"
    4:25 What makes Berlin a great pool for AI talent
    6:12 Building as a GPU-poor team
    7:05 The recipe behind mxbai-embed-large-v1
    9:56 The AnglE objective for embedding models
    15:00 Going beyond Matryoshka with mxbai-embed-2d-large-v1
    17:45 Supporting binary embeddings & quantization
    19:07 Collecting large-scale data is key for robust embedding models
    21:50 The importance of multilingual and multimodal models for IR
    24:07 Where will mixedbread.ai be in 12 months?
    26:46 Outro
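
    For listeners curious about the binary embeddings and quantization discussed at 17:45, here is a minimal illustrative sketch (not Mixed Bread's actual pipeline; the vectors below are random stand-ins for model outputs) of binarizing a float embedding and scoring with Hamming distance:

```python
import numpy as np

def binarize(embedding: np.ndarray) -> np.ndarray:
    """Quantize a float embedding to packed bits: 1 where a value is positive, else 0."""
    return np.packbits(embedding > 0)

def hamming_similarity(a_bits: np.ndarray, b_bits: np.ndarray, dim: int) -> float:
    """Fraction of matching bits between two packed binary embeddings."""
    distance = np.unpackbits(np.bitwise_xor(a_bits, b_bits)).sum()
    return 1.0 - distance / dim

# Toy example: random 1024-dim vectors standing in for real model outputs.
rng = np.random.default_rng(0)
query, doc = rng.standard_normal(1024), rng.standard_normal(1024)
print(hamming_similarity(binarize(query), binarize(doc), dim=1024))
```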

    • 27 min.
    Hacking JIT Assembly to Build Exascale AI Infrastructure

    Ash shares his journey from software development to pioneering in the AI infrastructure space with Unum. He discusses Unum's focus on unleashing the full potential of modern computers for AI, search, and database applications through efficient data processing and infrastructure. Highlighting Unum's technical achievements, including SIMD instructions and just-in-time compilation, Ash also touches on the future of computing and his vision for Unum to contribute to advances in personalized medicine and extending human productivity.



    Timestamps:
    0:00 Introduction
    0:44 How did Unum start and what is it about?
    6:12 Differentiating from the competition in vector search
    17:45 Supporting modern features like large dimensions & binary embeddings
    27:49 Upcoming model releases from Unum
    30:00 The future of hardware for AI
    34:56 The impact of AI in society
    37:35 Outro

    • 38 min.
    The Promise of Language Models for Search: Generative Information Retrieval

    In this episode of Neural Search Talks, Andrew Yates (Assistant Prof at the University of Amsterdam), Sergi Castella (Analyst at Zeta Alpha), and Gabriel Bénédict (PhD student at the University of Amsterdam) discuss the prospect of using GPT-like models as a replacement for conventional search engines.

    Generative Information Retrieval (Gen IR) SIGIR Workshop


    Workshop organized by Gabriel Bénédict, Ruqing Zhang, and Donald Metzler https://coda.io/@sigir/gen-ir
    Resources on Gen IR: https://github.com/gabriben/awesome-generative-information-retrieval


    References


    Rethinking Search: https://arxiv.org/abs/2105.02274
    Survey on Augmented Language Models: https://arxiv.org/abs/2302.07842
    Differentiable Search Index: https://arxiv.org/abs/2202.06991
    Recommender Systems with Generative Retrieval: https://shashankrajput.github.io/Generative.pdf



    Timestamps:
    00:00 Introduction, ChatGPT Plugins
    02:01 ChatGPT plugins, LangChain
    04:37 What even is Information Retrieval?
    06:14 Index-centric vs. model-centric Retrieval
    12:22 Generative Information Retrieval (Gen IR)
    21:34 Gen IR emerging applications
    24:19 How Retrieval Augmented LMs incorporate external knowledge
    29:19 What is hallucination?
    35:04 Factuality and Faithfulness
    41:04 Evaluating generation of Language Models
    47:44 Do we even need to "measure" performance?
    54:07 How would you evaluate Bing's Sydney?
    57:22 Will language models take over commercial search?
    1:01:44 NLP academic research in the times of GPT-4
    1:06:59 Outro

    • 1 hr 7 min.
    Task-aware Retrieval with Instructions

    Andrew Yates (Assistant Prof at University of Amsterdam) and Sergi Castella (Analyst at Zeta Alpha) discuss the paper "Task-aware Retrieval with Instructions" by Akari Asai et al., which proposes augmenting a collection of existing retrieval and NLP datasets with natural language instructions (BERRI, Bank of Explicit RetRieval Instructions) and using it to train TART (Multi-task Instructed Retriever).

    📄 Paper: https://arxiv.org/abs/2211.09260

    🍻 BEIR benchmark: https://arxiv.org/abs/2104.08663

    📈 LOTTE (Long-Tail Topic-stratified Evaluation, introduced in ColBERT v2): https://arxiv.org/abs/2112.01488
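
As a rough illustration of retrieval with instructions, the core idea is to prepend a natural-language description of the task to the query before encoding it, so a single retriever can adapt to different intents. The sketch below uses an off-the-shelf bi-encoder as a stand-in (it is not TART, and the separator and model choice are assumptions):

```python
from sentence_transformers import SentenceTransformer, util

# Off-the-shelf bi-encoder used purely as a stand-in; TART itself is instruction-tuned.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Core idea: prepend a natural-language instruction describing the task/intent
# to the query, so the same retriever can behave differently per task.
instruction = "Retrieve a scientific passage that answers this question"
query = "what causes the aurora borealis"
docs = [
    "The aurora is produced when charged solar particles collide with gases in the atmosphere.",
    "Aurora Borealis is a 2006 drama film set in Minneapolis.",
]

q_emb = model.encode(f"{instruction} [SEP] {query}", convert_to_tensor=True)
d_embs = model.encode(docs, convert_to_tensor=True)
print(util.cos_sim(q_emb, d_embs))  # the scientific passage should score higher
```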

    Timestamps: 

    00:00 Intro: "Task-aware Retrieval with Instructions"

    02:20 BERRI, TART, X^2 evaluation

    04:00 Background: recent works in domain adaptation

    06:50 Instruction Tuning

    08:50 Retrieval with descriptions

    11:30 Retrieval with instructions

    17:28 BERRI, Bank of Explicit RetRieval Instructions

    21:48 Repurposing NLP tasks as retrieval tasks

    23:53 Negative document selection

    27:47 TART, Multi-task Instructed Retriever

    31:50 Evaluation: Zero-shot and X^2 evaluation

    39:20 Results on Table 3 (BEIR, LOTTE)

    50:30 Results on Table 4 (X^2-Retrieval)

    55:50 Ablations

    57:17 Discussion: user modeling, future work, scale

    • 1 hr 11 min.
    Generating Training Data with Large Language Models w/ Special Guest Marzieh Fadaee

    Marzieh Fadaee (NLP Research Lead at Zeta Alpha) joins Andrew Yates and Sergi Castella to chat about her work on using Large Language Models like GPT-3 to generate domain-specific training data for retrieval models with little to no human input. The two papers discussed are "InPars: Data Augmentation for Information Retrieval using Large Language Models" and "Promptagator: Few-shot Dense Retrieval From 8 Examples".

    InPars: https://arxiv.org/abs/2202.05144

    Promptagator: https://arxiv.org/abs/2209.11755
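
The recipe the two papers share can be sketched in a few lines: show an LLM a handful of document-query examples, ask it to write a query for a new document, and use the resulting pairs to train a retriever. The snippet below is a hypothetical illustration (the prompt wording and the generate callable are assumptions, not the papers' exact setup):

```python
# Few-shot prompt in the spirit of InPars/Promptagator: show an LLM a couple of
# document -> query examples, then ask it to write a query for a new document.
FEW_SHOT = """Document: The Eiffel Tower was completed in 1889 for the World's Fair.
Relevant query: when was the eiffel tower built

Document: Photosynthesis converts light energy into chemical energy in plants.
Relevant query: how do plants make energy from sunlight
"""

def build_prompt(document: str) -> str:
    return f"{FEW_SHOT}\nDocument: {document}\nRelevant query:"

def synthesize_pairs(documents, generate):
    """`generate` is any callable wrapping an LLM (prompt string -> completion string).
    Returns (synthetic query, document) pairs for training a retriever."""
    return [(generate(build_prompt(doc)).strip(), doc) for doc in documents]

# Dummy generator for demonstration; swap in a real LLM call in practice.
pairs = synthesize_pairs(
    ["ColBERT performs late interaction over BERT token embeddings."],
    generate=lambda prompt: "how does colbert late interaction work",
)
print(pairs)
```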



    Timestamps:

    00:00 Introduction

    02:00 Background and journey of Marzieh Fadaee

    03:10 Challenges of leveraging Large LMs in Information Retrieval

    05:20 InPars, motivation and method

    14:30 Vanilla vs GBQ prompting

    24:40 Evaluation and Benchmark

    26:30 Baselines

    27:40 Main results and takeaways (Table 1, InPars)

    35:40 Ablations: prompting, in-domain vs. MSMARCO input documents

    40:40 Promptagator overview and main differences with InPars

    48:40 Retriever training and filtering in Promptagator

    54:37 Main Results (Table 2, Promptagator)

    1:02:30 Ablations on consistency filtering (Figure 2, Promptagator)

    1:07:39 Is this the magic black-box pipeline for neural retrieval on any documents?

    1:11:14 Limitations of using LMs for synthetic data

    1:13:00 Future directions for this line of research

    • 1 hr 16 min.
    ColBERT + ColBERTv2: late interaction at a reasonable inference cost

    Andrew Yates (Assistant Professor at the University of Amsterdam) and Sergi Castella (Analyst at Zeta Alpha) discuss the two influential papers introducing ColBERT (from 2020) and ColBERTv2 (from 2022), which propose a fast late interaction operation to achieve performance close to that of full cross-encoders at a far more manageable inference cost, along with many other optimizations.



    📄 ColBERT: "ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT" by Omar Khattab and Matei Zaharia. https://arxiv.org/abs/2004.12832

    📄 ColBERTv2: "ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction" by Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia. https://arxiv.org/abs/2112.01488

    📄 PLAID: "An Efficient Engine for Late Interaction Retrieval" by Keshav Santhanam, Omar Khattab, Christopher Potts, and Matei Zaharia. https://arxiv.org/abs/2205.09707

    📄 CEDR: "CEDR: Contextualized Embeddings for Document Ranking" by Sean MacAvaney, Andrew Yates, Arman Cohan, and Nazli Goharian. https://arxiv.org/abs/1904.07094
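
For reference, the late interaction ("MaxSim") scoring at the heart of both papers can be sketched in NumPy as follows (an illustrative version, not the official implementation; the token embeddings here are random stand-ins for per-token BERT outputs):

```python
import numpy as np

def maxsim_score(query_tokens: np.ndarray, doc_tokens: np.ndarray) -> float:
    """ColBERT-style late interaction: for each query token embedding, take its
    maximum cosine similarity over all document token embeddings, then sum."""
    q = query_tokens / np.linalg.norm(query_tokens, axis=1, keepdims=True)
    d = doc_tokens / np.linalg.norm(doc_tokens, axis=1, keepdims=True)
    sim = q @ d.T                        # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())  # MaxSim per query token, then sum

# Random stand-ins: 8 query tokens and 120 document tokens, 128 dims each.
rng = np.random.default_rng(0)
print(maxsim_score(rng.standard_normal((8, 128)), rng.standard_normal((120, 128))))
```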



    🪃 Feedback form: https://scastella.typeform.com/to/rg7a5GfJ



    Timestamps:

    00:00 Introduction

    00:42 Why ColBERT?

    03:34 Retrieval paradigms recap

    08:04 ColBERT query formulation and architecture

    09:04 Using ColBERT as a reranker or as an end-to-end retriever

    11:28 Space Footprint vs. MRR on MS MARCO

    12:24 Methodology: datasets and negative sampling

    14:37 Terminology for cross encoders, interaction-based models, etc.

    16:12 Results (ColBERT v1) on MS MARCO

    18:41 Ablations on model components

    20:34 Max pooling vs. mean pooling

    22:54 Why did ColBERT have a big impact?

    26:31 ColBERTv2: knowledge distillation

    29:34 ColBERTv2: indexing improvements

    33:59 Effects of clustering compression in performance

    35:19 Results (ColBERT v2): MS MARCO

    38:54 Results (ColBERT v2): BEIR

    41:27 Takeaway: especially strong in out-of-domain evaluation

    43:59 Qualitatively, what do ColBERT scores look like?

    46:21 What's the most promising of all current neural IR paradigms?

    49:34 How come there's still so much interest in Dense retrieval?

    51:09 Many-to-many similarity at different granularities

    53:44 What would ColBERT v3 include?

    56:39 PLAID: An Efficient Engine for Late Interaction Retrieval



    Contact: castella@zeta-alpha.com

    • 57 min.
