[For Everyone] From "Wasted Effort" to the "Learning Fast Lane"

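00:00:26 Surpassing the Master: How Can a Machine Outperform Its Teacher?

00:05:39 AI "Going Bad," On the Record: From Small Tricks to Big Hazards

00:09:35 "Wasted Effort" Inside Large AI Models: How Do We Wake the Dormant Intelligence?

00:14:48 We Gave an AI a "Brain CT." What Did We Find?

00:18:25 The Fast Lane for AI Learning: How to Keep "Averaging" from Erasing "Individuality"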

Papers covered in this episode:

[LG] A Taxonomy of Transcendence

[Harvard University]

https://arxiv.org/abs/2508.17669

---

[LG] School of Reward Hacks: Hacking harmless tasks generalizes to misaligned behavior in LLMs

[Center on Long-term Risk & Truthful AI]

https://arxiv.org/abs/2508.17511

---

[LG] Attention Layers Add Into Low-Dimensional Residual Subspaces

[Shanghai Innovation Institute & Fudan University]

https://arxiv.org/abs/2508.16929

---

[LG] Unraveling the cognitive patterns of Large Language Models through module communities

[Rensselaer Polytechnic Institute & IBM Research]

https://arxiv.org/abs/2508.18192

---

[LG] Fisher-Orthogonal Projection Methods for Natural Gradient Descent with Large Batches

[University of Oxford]

https://arxiv.org/abs/2508.13898
