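00:00:26 Masters at work: how do you piece together the one correct answer from a pile of candidates?
00:04:59 Why does AI "fill in the blanks" too?
00:10:17 The "lopsided student" trap of AI drilling on practice problems: how do you make a top student more creative?
00:15:19 An AI learned to solve math problems. What does that teach us?
00:21:20 Too many objectives: how do you find the "shortest path"?
Papers covered in this episode: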
[LG] The Majority is not always right: RL training for solution aggregation
[FAIR at Meta & CMU]
https://arxiv.org/abs/2509.06870
---
[LG] From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers
[Mila - Quebec AI Institute]
https://arxiv.org/abs/2509.06938
---
[LG] Outcome-based Exploration for LLM Reasoning
[FAIR at Meta]
https://arxiv.org/abs/2509.06941
---
[LG] Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers
[ByteDance Seed]
https://arxiv.org/abs/2509.06493
---
[LG] Simple Optimizers for Convex Aligned Multi-Objective Optimization
[Meta AI & Technion]
https://arxiv.org/abs/2509.05811
Information
- Frequency: Daily
- Published: September 10, 2025, 12:05 a.m. UTC
- Duration: 26 min
- Rating: Clean