
2025.08.08 | Dynamic fine-tuning optimizes reasoning; zero-data self-evolution strengthens reasoning

The 15 papers in this episode are:

[00:16] ✨ On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

[00:41] 🌱 R-Zero: Self-Evolving Reasoning LLM from Zero Data

[01:00] 🤖 Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation

[01:27] 🤔 DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning

[01:49] 📊 Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity

[02:12] 🤔 Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?

[02:40] 🔍 Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability

[03:08] 💡 Are Today's LLMs Ready to Explain Well-Being Concepts?

[03:30] 🚀 CoAct-1: Computer-using Agents with Coding as Actions

[03:57] 🚀 InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities

[04:18] 💬 Evaluating, Synthesizing, and Enhancing for Customer Support Conversation

[04:41] 💡 Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models

[05:02] 🤯 MOSEv2: A More Challenging Dataset for Video Object Segmentation in Complex Scenes

[05:22] 🎤 Marco-Voice Technical Report

[05:47] 🎨 StrandDesigner: Towards Practical Strand Generation with Sketch Guidance

[Follow us]

You can also find us on the following platform for more information beyond the podcast content.

Xiaohongshu: AI速递


Show: HuggingFace 每日AI论文速递
Frequency: Updated Daily
Published: August 9, 2025 at 12:00 AM UTC
Length: 7 min
Rating: Clean
