The 14 papers in this episode:
[00:22] 🧠 Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
[00:50] 🔍 Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search
[01:15] 👁 Visual Representation Alignment for Multimodal Large Language Models
[01:54] 🔄 Reconstruction Alignment Improves Unified Multimodal Models
[02:19] 🔄 UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward
[02:46] 🧠 Curia: A Multi-Modal Foundation Model for Radiology
[03:06] 🔮 F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
[03:33] 🧠 Staying in the Sweet Spot: Responsive Reasoning Evolution via Capability-Adaptive Hint Scaffolding
[03:56] 🔄 Language Self-Play For Data-Free Training
[04:22] 🔍 Causal Attention with Lookahead Keys
[04:43] 🎨 Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
[05:07] ✅ SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge
[05:30] 🚀 Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling
[06:01] 📈 $ΔL$ Normalization: Rethink Loss Aggregation in RLVR
【Follow Us】
You can also find us on the following platform for more content beyond the podcast:
Xiaohongshu: AI速递