Women in AI Research (WiAIR)

Women in AI Research (WiAIR) is a podcast dedicated to celebrating the remarkable contributions of female AI researchers from around the globe. Our mission is to challenge the prevailing perception that AI research is predominantly male-driven. Our goal is to empower early-career researchers, especially women, to pursue their passion for AI and make an impact in this rapidly growing field. You will learn from women at different career stages, stay updated on the latest research and advancements, and hear powerful stories of overcoming obstacles and breaking stereotypes.

  1. Apr 13 ·  Bonus

    EACL 2026: LLMs Can Hear… But Can They Reason? A New Benchmark for Audio Intelligence

    What does it actually mean for a model to understand audio?

    Paper: https://arxiv.org/abs/2601.19673

    In this episode, I talk with Iwona Christop, a PhD student at Adam Mickiewicz University, about her recent EACL paper introducing ART (Audio Reasoning Tasks), a new benchmark designed to evaluate whether multimodal LLMs can truly reason over audio, not just transcribe or classify it. Most existing benchmarks test audio skills in isolation (like ASR or classification). But real-world intelligence requires something deeper: combining signals, comparing sounds, tracking context, and making decisions.

    This work takes a different approach:

    - No text-only shortcuts: tasks can’t be solved via transcription alone
    - Reasoning-first design: models must combine multiple audio cues
    - No expert knowledge required: anyone can verify correctness

    We also dive into the diverse task design, including:

    - Audio arithmetic (counting and comparing sounds)
    - Cross-recording speaker & language identification
    - Sound-based reasoning (e.g., inferring properties from audio)
    - Speech feature comparison (accents, variations)
    - Multimodal reasoning across text and sound

    The dataset includes 9 tasks, 9,000 samples, and 30+ hours of audio, all generated in a scalable way using templates and TTS.

    👉 If you care about multimodal reasoning, evaluation, or the limits of current LLM capabilities, this conversation is for you.

    Iwona Christop: https://www.linkedin.com/in/iwona-christop/

    👍 Like & subscribe for more deep dives into cutting-edge AI research
    🔔 New episodes from EACL 2026 coming soon
    #WiAIR #EACL2026

    18 min
  2. Apr 10 ·  Bonus

    EACL 2026: Reasoning Can Hurt LLM Safety?! Rethinking Accuracy in AI Systems

    In this episode of #WiAIRpodcast, we dive into a subtle but critical question: Does adding reasoning actually make LLMs safer and more reliable?

    Paper: https://arxiv.org/abs/2510.21049

    Atoosa Chegini (University of Maryland, Apple) presents Reasoning's Razor (EACL 2026), where she and her collaborators examine how reasoning impacts high-stakes binary classification tasks, including safety filtering and hallucination detection. Their findings highlight an important nuance: while reasoning can improve overall accuracy, it may degrade performance at low false positive rates, exactly where real-world systems need to operate.

    This conversation covers:

    - Why accuracy is a misleading metric for safety-critical LLM applications
    - The importance of evaluating models at fixed false positive rates (FPR)
    - How two models with identical accuracy can behave completely differently in deployment
    - The impact of "think-on" (with reasoning) vs "think-off" (no reasoning) settings
    - Practical implications for RLHF, SFT, and post-training pipelines

    If you're working on any of the following, this discussion offers a perspective that is both technically grounded and immediately actionable:

    - LLM evaluation & reliability
    - AI safety or hallucination detection
    - Production deployment of language models

    Atoosa:
    https://www.linkedin.com/in/atoosa-chegini-6713741a3/
    https://scholar.google.com/citations?user=5nY9tagAAAAJ&hl=en&oi=ao

    👍 Like & subscribe for more deep dives into cutting-edge AI research
    🔔 New episodes from EACL 2026 coming soon

    22 min
