HuggingFace 每日AI论文速递 (HuggingFace Daily AI Paper Digest)

duan

Ten minutes a day to catch up on the day's trending AI papers on HuggingFace. Updated every weekday; subscriptions welcome. 📢 Find the podcast by searching for 【HuggingFace 每日AI论文速递】 on Xiaoyuzhou (小宇宙) and Apple Podcasts. 🖼 A companion text-and-image edition is also available: search for and follow 【AI速递】 on Xiaohongshu (小红书).

  1. 1 DAY AGO

2025.02.21 | A new framework for evaluating AI agents; LLMs show marked performance differences across disciplines.

    This episode covers the following 20 papers:

    [00:26] 🧠 MLGym: A New Framework and Benchmark for Advancing AI Research Agents
    [01:18] 📚 SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
    [02:04] 🌐 SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
    [02:52] 🧠 How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
    [03:49] 🚀 S*: Test Time Scaling for Code Generation
    [04:35] ⏳ Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information
    [05:28] 📄 LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models
    [06:17] 🧠 Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
    [07:13] 🖥 PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC
    [08:07] 🧠 S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
    [09:01] 🧠 Discovering highly efficient low-weight quantum error-correcting codes with reinforcement learning
    [09:55] 🎥 Dynamic Concepts Personalization from Single Videos
    [10:38] 🖼 Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
    [11:23] 🌍 NAVIG: Natural Language-guided Analysis with Vision Language Models for Image Geo-localization
    [12:13] 🧠 AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO
    [13:06] 🌍 How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild
    [13:52] 🌍 Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework
    [14:55] 🌐 RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers
    [15:54] 🧠 Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data
    [16:41] 🤖 LLM-based User Profile Management for Recommender System

    【Follow us】You can also find us on the platform below for more than just the podcast. Xiaohongshu (小红书): AI速递

    18 min
  2. 2 DAYS AGO

2025.02.20 | Improving visual perception; strengthening autonomous-driving safety.

    This episode covers the following 20 papers:

    [00:24] 🌐 Qwen2.5-VL Technical Report
    [01:10] 🚗 RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
    [01:50] 🎶 SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
    [02:38] 🧠 MoM: Linear Sequence Modeling with Mixture-of-Memories
    [03:15] 🌐 Craw4LLM: Efficient Web Crawling for LLM Pretraining
    [04:05] 🧠 LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
    [04:45] 🤔 Small Models Struggle to Learn from Strong Reasoners
    [05:27] ⚙ Autellix: An Efficient Serving Engine for LLM Agents as General Programs
    [06:08] 🌍 Presumed Cultural Identity: How Names Shape LLM Responses
    [06:53] 🚨 Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region
    [07:38] 🩺 SearchRAG: Can Search Engines Be Helpful for LLM-based Medical Question Answering?
    [08:21] 🧠 Thinking Preference Optimization
    [08:59] 🧠 Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering
    [09:40] 🧠 AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence
    [10:21] 🧬 NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
    [11:02] 🧩 ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation
    [11:44] 🧠 Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
    [12:33] 🌍 GIMMICK -- Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking
    [13:19] 🤖 InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
    [14:06] 🔊 Noise May Contain Transferable Knowledge: Understanding Semi-supervised Heterogeneous Domain Adaptation from an Empirical Perspective

    【Follow us】You can also find us on the platform below for more than just the podcast. Xiaohongshu (小红书): AI速递

    15 min
  3. 3 DAYS AGO

2025.02.19 | Data-efficient speech processing; innovations in embedding-space compression.

    This episode covers the following 20 papers:

    [00:25] 🎙 Soundwave: Less is More for Speech-Text Alignment in LLMs
    [01:05] 🔍 Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
    [01:48] 🌊 Continuous Diffusion Model for Language Modeling
    [02:30] 🎥 Phantom: Subject-consistent video generation via cross-modal alignment
    [03:12] 🧠 Rethinking Diverse Human Preference Learning through Principal Component Analysis
    [04:00] 🤖 SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
    [04:36] 🛡 SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
    [05:25] 🐍 Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation
    [06:08] 📚 You Do Not Fully Utilize Transformer's Representation Capacity
    [06:50] 🤖 Magma: A Foundation Model for Multimodal AI Agents
    [07:23] 💹 FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
    [08:08] 📄 RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm
    [08:49] 🧠 PAFT: Prompt-Agnostic Fine-Tuning
    [09:27] 🛠 OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
    [10:13] 📊 Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?
    [11:00] 🔄 MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections
    [11:37] 🩺 HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation
    [12:12] 🧠 HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
    [12:51] 🌍 Text2World: Benchmarking Large Language Models for Symbolic World Model Generation
    [13:32] 🧠 Atom of Thoughts for Markov LLM Test-Time Scaling

    【Follow us】You can also find us on the platform below for more than just the podcast. Xiaohongshu (小红书): AI速递

    15 min
  4. 4 DAYS AGO

2025.02.18 | Sparse attention improves efficiency; optimized get-up policies for humanoid robots.

    This episode covers the following 29 papers:

    [00:23] ⚡ Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
    [01:10] 🤖 Learning Getting-Up Policies for Real-World Humanoid Robots
    [01:55] 🧠 ReLearn: Unlearning via Learning for Large Language Models
    [02:35] 💻 SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
    [03:21] 🌐 HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
    [03:58] 🧠 How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
    [04:33] 🤖 SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors
    [05:12] 🔧 Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
    [05:55] 🧠 I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
    [06:38] 🔧 SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
    [07:25] 🧠 CRANE: Reasoning with constrained LLM generation
    [08:07] 🧠 Intuitive physics understanding emerges from self-supervised pretraining on natural videos
    [08:46] 🐦 Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest
    [09:22] 🧠 Dyve: Thinking Fast and Slow for Dynamic Process Verification
    [10:06] 🧠 PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning
    [10:53] 🤖 System Message Generation for User Preferences using Open-Source Models
    [11:38] 🎥 video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
    [12:33] 🧠 Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity
    [13:11] 🤖 Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
    [13:52] 🤖 MagicArticulate: Make Your 3D Models Articulation-Ready
    [14:37] 🤖 Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems
    [15:21] 🧠 One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
    [16:03] 🤖 Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model
    [16:40] 🚀 Better Embeddings with Coupled Adam
    [17:18] 🧐 Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking
    [17:56] 🧪 Towards Data-Efficient Pretraining for Atomic Property Prediction
    [18:46] 🌀 The Mirage of Model Editing: Revisiting Evaluation in the Wild
    [19:31] 🧮 Large Language Models and Mathematical Reasoning Failures
    [20:11] 📊 Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance

    【Follow us】You can also find us on the platform below for more than just the podcast. Xiaohongshu (小红书): AI速递

    21 min
  5. 5 DAYS AGO

2025.02.17 | RAS accelerates diffusion transformers; video generation quality improves.

    This episode covers the following 21 papers:

    [00:22] 🌐 Region-Adaptive Sampling for Diffusion Transformers
    [01:05] 🎥 Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
    [01:48] 🌊 Large Language Diffusion Models
    [02:31] 🧠 ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models
    [03:15] 🌟 MM-RLHF: The Next Step Forward in Multimodal LLM Alignment
    [03:58] 🖼 Precise Parameter Localization for Textual Generation in Diffusion Models
    [04:40] 🧠 Diverse Inference and Verification for Advanced Reasoning
    [05:22] 🧬 DarwinLM: Evolutionary Structured Pruning of Large Language Models
    [06:02] 📈 AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting
    [06:40] 🖼 ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation
    [07:23] 🤖 We Can't Understand AI Using our Existing Vocabulary
    [08:03] 📊 FoNE: Precise Single-Token Number Embeddings via Fourier Features
    [08:53] 🌍 Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages
    [09:41] 🔓 Jailbreaking to Jailbreak
    [10:23] 🤖 STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning
    [11:05] 📊 Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
    [11:41] ⚡ MRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers
    [12:26] 🚗 V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models
    [13:06] 🎵 CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages
    [13:49] 🧩 Cluster and Predict Latent Patches for Improved Masked Image Modeling
    [14:31] 🧬 Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model

    【Follow us】You can also find us on the platform below for more than just the podcast. Xiaohongshu (小红书): AI速递

    16 min
  6. FEB 14

2025.02.14 | Language-model context scaled to 3 million tokens on a single GPU; memory-efficient text-encoder strategies.

    This episode covers the following 18 papers:

    [00:21] 🚀 InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
    [01:07] 🖼 Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation
    [01:49] 🧠 An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging
    [02:31] 📚 SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
    [03:14] 🐕 Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights
    [03:56] 🌐 Exploring the Potential of Encoder-free Architectures in 3D LMMs
    [04:39] 🎭 CoSER: Coordinating LLM-Based Persona Simulation of Established Roles
    [05:26] 🌐 TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
    [06:09] 🤖 EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents
    [07:00] 🌪 Typhoon T1: An Open Thai Reasoning Model
    [07:54] 🤖 Logical Reasoning in Large Language Models: A Survey
    [08:36] 🧠 MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
    [09:23] 🧠 CoT-Valve: Length-Compressible Chain-of-Thought Tuning
    [10:11] 🤖 SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models
    [10:52] 🌐 mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data
    [11:36] 🦜 The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
    [12:18] 🤖 DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References
    [13:00] 🔍 3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection

    【Follow us】You can also find us on the platform below for more than just the podcast. Xiaohongshu (小红书): AI速递

    14 min


