HuggingFace Daily AI Paper Digest

2025.09.01 | R-4B improves thinking efficiency; EO-1 advances robot control

The 15 papers in this episode are:

[00:24] 🧠 R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

[00:59] 🤖 EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

[01:29] 🔒 A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

[01:57] 🎥 Droplet3D: Commonsense Priors from Videos Facilitate 3D Generation

[02:26] 🗣 TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis

[02:58] 🤖 A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers

[03:28] 🤖 UItron: Foundational GUI Agent with Advanced Perception and Planning

[03:50] 🎮 Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models

[04:20] 🔄 TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training

[04:45] 💻 Efficient Code Embeddings from Code Generation Models

[05:10] ⏸ Morae: Proactively Pausing UI Agents for User Choices

[05:37] 🔍 AHELM: A Holistic Evaluation of Audio-Language Models

[06:05] 🤖 HERMES: Human-to-Robot Embodied Learning from Multi-Source Motion Data for Mobile Dexterous Manipulation

[06:34] 🔄 Model-Task Alignment Drives Distinct RL Outcomes

[07:08] 👁 Mimicking the Physicist's Eye: A VLM-centric Approach for Physics Formula Discovery

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递