AI by AI

Theseus Research

AI-generated podcasts based on recent research. Primarily made for my own consumption, but I'm happy to share them.

  1. 22/06/2025

    AI Breakthroughs: Reasoning, Reliability, and Recommendations

    In this episode, we dissect five cutting-edge research papers that are reshaping the AI landscape. We kick things off with a minimal-yet-powerful tweak to reinforcement learning that helps language models overcome performance plateaus and achieve deeper mathematical reasoning. Next, we journey to the manufacturing floor to explore ARKNESS, a hybrid framework that grounds LLMs in verifiable knowledge, eliminating hallucinations in high-stakes CNC machining. Then, we tackle the critical issue of AI safety with the Alignment Quality Index (AQI), a novel metric that evaluates a model's true alignment by analyzing its internal geometry, not just its outputs. We also explore how Visual Grounded Reasoning (VGR) is enabling multimodal models to 'see' and reason about fine-grained visual details more effectively. Finally, we look at RecFound, a new foundation model designed to unify and supercharge the world of recommendation systems. Join us for a deep dive into the innovations making AI more capable, reliable, and secure.

    References:
    Reasoning with Exploration: An Entropy Perspective: https://arxiv.org/pdf/2506.14758.pdf
    Knowledge Graph Fusion with Large Language Models for Accurate, Explainable Manufacturing Process Planning: https://arxiv.org/pdf/2506.13026.pdf
    Alignment Quality Index (AQI): Beyond Refusals: AQI as an Intrinsic Alignment Diagnostic via Latent Geometry, Cluster Divergence, and Layer wise Pooled Representations: https://arxiv.org/pdf/2506.13901.pdf
    VGR: Visual Grounded Reasoning: https://arxiv.org/pdf/2506.11991.pdf
    Generative Representational Learning of Foundation Models for Recommendation: https://arxiv.org/pdf/2506.11999.pdf

    35 min
  2. 13/06/2025

    From Database Dialects to Diagnostic Deep Reasoning

    Welcome back to AI by AI, where we dissect the latest and greatest in machine learning research. In this episode, we're diving into five groundbreaking papers that are pushing the boundaries of what's possible. We'll explore Rel-LLM, a new framework that finally lets Large Language Models understand complex databases without losing the plot. Then, we'll tackle the challenge of learning from messy, corrupted data and see how a new 'perturbation-quantization' framework provides robust learning guarantees. We also look at SpectRe, a novel method making Graph Neural Networks more expressive by blending topology with spectral data. Shifting to healthcare, we'll discuss a study evaluating how AI can detect cognitive impairment from speech, with surprising results about what matters more: the words you say or how you say them. Finally, we'll cover Reasoning-Aware Reinforcement Learning (RARL), a technique for building powerful and efficient medical AI models that can reason about their diagnoses, even on a single GPU. Tune in for a deep dive into the tech that's shaping our future!

    References:
    Large Language Models are Good Relational Learners: https://arxiv.org/pdf/2506.05725.pdf
    Graph Persistence goes Spectral: https://arxiv.org/pdf/2506.06571.pdf
    Robust Learnability of Sample-Compressible Distributions under Noisy or Adversarial Perturbations: https://arxiv.org/pdf/2506.06613.pdf
    CAtCh: Cognitive Assessment through Cookie Thief: https://arxiv.org/pdf/2506.06603.pdf
    RARL: Improving Medical VLM Reasoning and Generalization with Reinforcement Learning and LoRA under Data and Hardware Constraints: https://arxiv.org/pdf/2506.06600.pdf

    38 min
  3. 29/05/2025

    REARANK to EMLoC: Five AI Upgrades You Can’t Ignore

    This episode dives into five cutting-edge research papers. First, we explore REARANK, a reinforcement learning-based agent that re-ranks information using explicit reasoning, improving both performance and interpretability. Next, we examine a new task and dataset for detecting inconsistencies in political statements, benchmarking LLMs against human annotations. We then discuss v1, a lightweight extension for MLLMs that enables selective visual revisitation during inference, enhancing multimodal reasoning. Following that, we delve into CoT Monitor+, a framework designed to mitigate deceptive alignment in LLMs by integrating a self-monitoring mechanism into the chain-of-thought reasoning process. Finally, we cover Efficient Multi-Modal Long Context Learning (EMLoC), a training-free method for adapting MLLMs to new tasks by embedding demonstration examples directly into the model input, while efficiently managing long contexts.

    References:
    REARANK: Reasoning Re-ranking Agent via Reinforcement Learning: https://arxiv.org/pdf/2505.20046.pdf
    Misleading through Inconsistency: A Benchmark for Political Inconsistencies Detection: https://arxiv.org/pdf/2505.19191.pdf
    Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation: https://arxiv.org/pdf/2505.18842.pdf
    Mitigating Deceptive Alignment via Self-Monitoring: https://arxiv.org/pdf/2505.18807.pdf
    Efficient Multi-modal Long Context Learning for Training-free Adaptation: https://arxiv.org/pdf/2505.19812.pdf

    45 min
