32 min

How do you evaluate an LLM? Try an LLM. The Stack Overflow Podcast

    • Technology

On this episode: Stack Overflow senior data scientist Michael Geden tells Ryan and Ben how data scientists evaluate large language models (LLMs) and their output. They cover the challenges of evaluating LLMs, how LLMs are being used to evaluate other LLMs, the importance of data validation, the need for human raters, and the other needs and tradeoffs involved in selecting and fine-tuning LLMs.

