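Want to know how a "team competition" rule can cure an AI of its habit of excelling in only one subject? Or how, without spending a cent, a classic "knapsack problem" alone can boost AI training efficiency by 40%? We will even see how an AI that was born "blind" beats humans on a test of imagination. In this episode, we start from several recent papers to show how AI learns teamwork, spends its budget wisely, and even masters the arts of "blindfolded thinking" and "decluttering."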
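00:00:32 AI training guide: how do you keep your "star student" from becoming lopsided?
00:05:50 AI training's "free lunch": how do you make a model smarter without spending a cent?
00:11:57 From a "one-finger move" to "combination punches": the upgrade path of AI learning
00:16:54 AI's "mental blind spot": it cannot see, yet it thinks more clearly?
00:21:19 AI's memory problem: how to "declutter" gracefully?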
Papers covered in this episode:
[LG] Polychromic Objectives for Reinforcement Learning
[Stanford University]
https://arxiv.org/abs/2509.25424
---
[LG] Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation
[ByteDance Seed & The Chinese University of Hong Kong]
https://arxiv.org/abs/2509.25849
---
[LG] Learning to Reason as Action Abstractions with Scalable Mid-Training RL
[Apple]
https://arxiv.org/abs/2509.25810
---
[LG] Artificial Phantasia: Evidence for Propositional Reasoning-Based Mental Imagery in Large Language Models
[Northeastern University]
https://arxiv.org/abs/2509.23108
---
[LG] Expected Attention: KV Cache Compression by Estimating Attention from Future Queries Distribution
[NVIDIA & Sapienza University of Rome]
https://arxiv.org/abs/2510.00636
Information
- Frequency: Daily
- Published: October 5, 2025, 11:40 p.m. UTC
- Duration: 26 min
- Rating: Clean