[For Everyone] How Do We Build an AI We Can Truly Trust?

00:00:27 An AI's "inner intentions": can we fit it with an honesty switch?

00:05:08 More important than being correct: thinking correctly

00:10:18 The art of saving money: how do you get an "intern" to do an "expert's" job well?

00:15:44 AI's "one-click delete": does it really delete anything?

00:21:08 "Endgame thinking" in AI training: telling you the answer directly

Papers covered in this episode:

[LG] Can LLMs Lie? Investigation beyond Hallucination

[Carnegie Mellon University (CMU)]

https://arxiv.org/abs/2509.03518

---

[LG] Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

[Amazon]

https://arxiv.org/abs/2509.03403

---

[LG] Cut Costs, Not Accuracy: LLM-Powered Data Processing with Guarantees

[University of California, Berkeley]

https://arxiv.org/abs/2509.02896

---

[LG] Unlearning That Lasts: Utility-Preserving, Robust, and Almost Irreversible Forgetting in LLMs

[University of Tübingen & EPFL]

https://arxiv.org/abs/2509.02820

---

[LG] Imitate Optimal Policy: Prevail and Induce Action Collapse in Policy Gradient

[University of Sydney & King Abdullah University of Science and Technology]

https://arxiv.org/abs/2509.02737
