6 min

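2025.08.15 | A Math Reasoning Handbook Boosts Model Capabilities; Image Generation Models with Continuous Tokens

The 12 papers in this episode are as follows: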

[00:23] 📚 We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

[00:50] 🚀 NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

[01:17] 🎨 ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

[01:43] 🤔 PRELUDE: A Benchmark Designed to Require Global Comprehension and Reasoning over Long Contexts

[02:14] 🚀 UI-Venus Technical Report: Building High-performance UI Agents with RFT

[02:42] 🚀 STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

[03:11] ⚖ Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

[03:37] 🤔 HumanSense: From Multimodal Perception to Empathetic Context-Aware Responses through Reasoning MLLMs

[04:08] 📚 A Survey on Diffusion Language Models

[04:39] 💡 From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms

[05:03] 📸 Processing and acquisition traces in visual encoders: What does CLIP know about your camera?

[05:30] ⚖ When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content:

Xiaohongshu (小红书): AI速递

Show: HuggingFace 每日AI论文速递
Frequency: Updated daily
Published: August 16, 2025 at 12:00 AM UTC
Rating: Clean