AI可可AI生活

[For Everyone] Decision-Making Wisdom: From "Piecing Together Answers" to "Shoring Up Weak Spots"

00:00:26 Experts at work: how do you piece together the one correct solution from a pile of answers?

00:04:59 Why does AI "fill in the blanks" with made-up details too?

00:10:17 The "lopsided student" trap in AI problem drilling: how do you make a top student more creative?

00:15:19 An AI learned to solve math problems. What can we take away from it?

00:21:20 With too many objectives, how do you find the "shortest path"?

Papers covered in this episode:

[LG] The Majority is not always right: RL training for solution aggregation  

[FAIR at Meta & CMU]  

https://arxiv.org/abs/2509.06870  

---

[LG] From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers  

[Mila - Quebec AI Institute]  

https://arxiv.org/abs/2509.06938  

---

[LG] Outcome-based Exploration for LLM Reasoning  

[FAIR at Meta]  

https://arxiv.org/abs/2509.06941  

---

[LG] Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers  

[ByteDance Seed]  

https://arxiv.org/abs/2509.06493  

---

[LG] Simple Optimizers for Convex Aligned Multi-Objective Optimization  

[Meta AI & Technion]  

https://arxiv.org/abs/2509.05811