HuggingFace Daily AI Paper Digest

[Month-End Special] The hottest AI papers of June | LLMs improve themselves through self-reflection; MiniMax-M1 scales test-time compute efficiently.

The 10 papers in this episode:

[00:37] TOP1 (🔥258) | 💡 Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

[02:51] TOP2 (🔥249) | 💡 MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

[05:24] TOP3 (🔥240) | 🤖 Reinforcement Pre-Training

[07:54] TOP4 (🔥165) | 🧠 Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

[09:53] TOP5 (🔥134) | 🕰 Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA

[12:24] TOP6 (🔥132) | 🧠 ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

[14:50] TOP7 (🔥126) | 🧠 Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

[16:36] TOP8 (🔥116) | 🧲 Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

[18:34] TOP9 (🔥108) | 🤖 SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

[21:05] TOP10 (🔥107) | 🩺 Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

[Follow Us]

You can also find us on the following platform for more content beyond the podcast:

Xiaohongshu: AI速递