AI可可AI生活

[For Everyone] From the Fog of War and Self-Evolution to Embroidering at a Gallop

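Today we won't just talk about what AI can do; we'll go inside its "mind" to see how it thinks and learns. We'll explore how AI makes superhuman decisions in the "fog of war" of incomplete information, and how it evolves itself like a rolling snowball to solve very long problems. We'll also see how AI is "forced" to build a genuine inner world, and how its training process can even be explained with the ideal gas law from high-school physics! Ready? Let's lift the veil on these clever strategies and see how AI makes the leap from "rote memorization" to "true understanding", and from "slow, careful work" to "embroidering at a gallop".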

00:00:39 With incomplete information, how do you make "superhuman" decisions?

00:06:35 How does AI learn to generalize from one example to many?

00:12:54 AI's "epiphany": how to make machines not just memorize, but understand the world

00:17:08 The ideal gas inside AI's alchemy furnace

00:22:22 AI training's "have it both ways": how can a galloping horse also embroider?

Papers covered in this episode:

[LG] Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search

[CMU & NYU Tandon School of Engineering & Stanford University]

https://arxiv.org/abs/2511.07312

---

[LG] Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization

[University of Pennsylvania & CMU]

https://arxiv.org/abs/2511.07378

---

[LG] Next-Latent Prediction Transformers Learn Compact World Models

[Microsoft Research]

https://arxiv.org/abs/2511.05963

---

[LG] Can Training Dynamics of Scale-Invariant Neural Networks Be Explained by the Thermodynamics of an Ideal Gas?

[Constructor University & Mila]

https://arxiv.org/abs/2511.07308

---

[LG] TNT: Improving Chunkwise Training for Test-Time Memorization

[Google Research & University of Southern California]

https://arxiv.org/abs/2511.07343