The 15 papers in this episode are:
[00:16] ✨ On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
[00:41] 🌱 R-Zero: Self-Evolving Reasoning LLM from Zero Data
[01:00] 🤖 Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation
[01:27] 🤔 DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
[01:49] 📊 Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity
[02:12] 🤔 Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?
[02:40] 🔍 Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability
[03:08] 💡 Are Today's LLMs Ready to Explain Well-Being Concepts?
[03:30] 🚀 CoAct-1: Computer-using Agents with Coding as Actions
[03:57] 🚀 InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities
[04:18] 💬 Evaluating, Synthesizing, and Enhancing for Customer Support Conversation
[04:41] 💡 Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models
[05:02] 🤯 MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes
[05:22] 🎤 Marco-Voice Technical Report
[05:47] 🎨 StrandDesigner: Towards Practical Strand Generation with Sketch Guidance
[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递
Information
- Show
- Frequency: Updated Daily
- Published: August 9, 2025 at 12:00 AM UTC
- Length: 7 min
- Rating: Clean