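In this episode we tackle a core question: we tend to think of AI as a mysterious black box, but recent research is trying to pry it open, much like running a “brain scan.” We'll see how a simple “full marks or zero” rule can teach AI to be honest, and how an “AI detective” can be sent in to root out malicious models lying in wait. Then we'll dig into AI's “thinking process” to see which matters more, a smart “brain” or a smart “search engine”; how AI can evolve its way to the correct answer by “making mistakes”; and even how its complex reasoning can be broken down into remotely steerable “building blocks of thought.” Ready? Let's dive into the inner world of AI.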
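00:00:41 Training AI not to lie: full marks or zero
00:05:29 AI's “Infernal Affairs”: how do you catch a wolf in sheep's clothing?
00:10:39 A smart brain or a smart search engine: which matters more?
00:16:14 Mistakes are fine, as long as your odds of “fixing it right” are slightly higher than of “fixing it wrong”
00:21:22 Taking apart the AI brain: what is it thinking when it thinks?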
Papers covered in this episode:
[CL] Train for Truth, Keep the Skills: Binary Retrieval-Augmented Reward Mitigates Hallucinations
[University of Washington & Allen Institute for AI (Ai2)]
https://arxiv.org/abs/2510.17733
---
[LG] Detecting Adversarial Fine-tuning with Auditing Agents
[Anthropic]
https://arxiv.org/abs/2510.16255
---
[LG] Prior Makes It Possible: From Sublinear Graph Algorithms to LLM Test-Time Methods
[Toyota Technological Institute at Chicago & Columbia University & Google Research]
https://arxiv.org/abs/2510.16609
---
[CL] Deep Self-Evolving Reasoning
[Microsoft Research Asia & Peking University]
https://arxiv.org/abs/2510.17498
---
[LG] Algorithmic Primitives and Compositional Geometry of Reasoning in Language Models
[Columbia University & University of California Los Angeles & Harvey Mudd College]
https://arxiv.org/abs/2510.15987
Information
- Podcast
- Frequency: Daily
- Published: October 21, 2025 at 23:44 UTC
- Duration: 27 min
- Rating: Clean