This episode explores Moonshot AI's Kimi K2.5, a 2026 multimodal model that aims to unify text reasoning, visual understanding, and agent-style task execution within a single open system. It explains the paper's two main bets: training text and vision jointly from the early stages rather than bolting vision on later, and using an external "agent swarm" orchestration layer to split wide-search tasks across parallel sub-agents. The discussion compares these ideas to earlier vision-language systems and multi-agent frameworks, while questioning whether the reported gains in quality, latency, and cross-modal robustness are fully supported by the evidence. Listeners will find it interesting for its clear breakdown of where the real novelty lies: not a new transformer architecture, but a systems-design argument about how future AI models may combine multimodal learning with distributed task coordination.

Sources:

1. Kimi K2.5: Visual Agentic Intelligence — Kimi Team (Tongtong Bai, Yifan Bai, Yiping Bao, et al.), 2026 http://arxiv.org/abs/2602.02276
2. Large Language Model Based Multi-agents: A Survey of Progress and Challenges — Taicheng Guo, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, Xiangliang Zhang, 2024 https://scholar.google.com/scholar?q=Large+Language+Model+Based+Multi-agents:+A+Survey+of+Progress+and+Challenges
3. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework — Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Shaokun Zhang, Erkang Zhu, Beibin Li, Li Jiang, Xiaoyun Zhang, Chi Wang, et al., 2023 https://scholar.google.com/scholar?q=AutoGen:+Enabling+Next-Gen+LLM+Applications+via+Multi-Agent+Conversation+Framework
4. MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework — Sirui Hong, Xiawu Zheng, Jiaqi Chen, Yuheng Cheng, Ceyao Zhang, Jinlin Wang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, et al., 2023 https://scholar.google.com/scholar?q=MetaGPT:+Meta+Programming+for+A+Multi-Agent+Collaborative+Framework
5. Kimi K2.5: Visual Agentic Intelligence — Kimi Team (including Tongtong Bai, Yifan Bai, Yiping Bao, et al.), 2026 https://scholar.google.com/scholar?q=Kimi+K2.5:+Visual+Agentic+Intelligence
6. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning — Richard S. Sutton, Doina Precup, Satinder Singh, 1999 https://scholar.google.com/scholar?q=Between+MDPs+and+Semi-MDPs:+A+Framework+for+Temporal+Abstraction+in+Reinforcement+Learning
7. The Option-Critic Architecture — Pierre-Luc Bacon, Jean Harb, Doina Precup, 2017 https://scholar.google.com/scholar?q=The+Option-Critic+Architecture
8. ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL — Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar, 2024 https://scholar.google.com/scholar?q=ArCHer:+Training+Language+Model+Agents+via+Hierarchical+Multi-Turn+RL
9. BrowseComp: a Simple Yet Challenging Benchmark for Browsing Agents — Jason Wei, Zhiqing Sun, Spencer Papay, Scott McKinney, et al., 2025 https://scholar.google.com/scholar?q=BrowseComp:+a+Simple+Yet+Challenging+Benchmark+for+Browsing+Agents
10. WideSearch: Benchmarking Agentic Broad Info-Seeking — Runjing Wong, Jiawei Wang, Jiahui Zhao, et al., 2025 https://scholar.google.com/scholar?q=WideSearch:+Benchmarking+Agentic+Broad+Info-Seeking
11. ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization — Xiaokang Wu, Kaixuan Li, Yuxin Zhao, et al., 2025 https://scholar.google.com/scholar?q=ReSum:+Unlocking+Long-Horizon+Search+Intelligence+via+Context+Summarization
12. OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments — Tianbao Xie, Ding Zhang, Junnan Chen, et al., 2024 https://scholar.google.com/scholar?q=OSWorld:+Benchmarking+Multimodal+Agents+for+Open-Ended+Tasks+in+Real+Computer+Environments
13. Qwen3-VL Technical Report — Sheng Bai, Yicong Cai, Ruijie Chen, et al., 2025 https://scholar.google.com/scholar?q=Qwen3-VL+Technical+Report
14. Thinking with images for multimodal reasoning: Foundations, methods, and future frontiers — authors and year unconfirmed (recent) https://scholar.google.com/scholar?q=Thinking+with+images+for+multimodal+reasoning:+Foundations,+methods,+and+future+frontiers
15. Vision-deepresearch: Incentivizing deepresearch capability in multimodal large language models — authors and year unconfirmed (recent) https://scholar.google.com/scholar?q=Vision-deepresearch:+Incentivizing+deepresearch+capability+in+multimodal+large+language+models
16. Momentor: Advancing video large language model with fine-grained temporal reasoning — authors and year unconfirmed (recent) https://scholar.google.com/scholar?q=Momentor:+Advancing+video+large+language+model+with+fine-grained+temporal+reasoning
17. Temporal reasoning transfer from text to video — authors and year unconfirmed (recent) https://scholar.google.com/scholar?q=Temporal+reasoning+transfer+from+text+to+video
18. Videoinsta: Zero-shot long video understanding via informative spatial-temporal reasoning with LLMs — authors and year unconfirmed (recent) https://scholar.google.com/scholar?q=Videoinsta:+Zero-shot+long+video+understanding+via+informative+spatial-temporal+reasoning+with+LLMs
19. AI Post Transformers: Test-time Scaling for Multi-Agent Collaborative Reasoning — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-22-test-time-scaling-for-multi-agent-collab-082570.mp3
20. AI Post Transformers: Experimental Comparison of Agentic and Enhanced RAG — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-14-experimental-comparison-of-agentic-and-e-37d8bc.mp3
21. A
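For listeners curious what the "agent swarm" bet looks like in practice, here is a minimal sketch of the fan-out/merge pattern the episode describes: an orchestrator shards a wide-search task across parallel sub-agents and merges their partial results. This is an illustrative toy, not Kimi K2.5's actual implementation; `run_subagent`, `orchestrate`, and the naive sharding are all hypothetical stand-ins for real model and tool calls.

```python
import asyncio

async def run_subagent(sub_query: str) -> list[str]:
    """Pretend sub-agent: returns 'findings' for one slice of the search."""
    await asyncio.sleep(0)  # stands in for model inference / web browsing
    return [f"result for {sub_query!r}"]

async def orchestrate(task: str, num_subagents: int = 4) -> list[str]:
    # Split the wide task into narrower sub-queries (naive sharding here;
    # a real orchestrator would decompose the task semantically).
    sub_queries = [f"{task} (shard {i})" for i in range(num_subagents)]
    # Fan out: each sub-agent works its shard concurrently.
    partials = await asyncio.gather(*(run_subagent(q) for q in sub_queries))
    # Merge: flatten and deduplicate the partial result lists.
    merged: list[str] = []
    for part in partials:
        merged.extend(r for r in part if r not in merged)
    return merged

if __name__ == "__main__":
    results = asyncio.run(orchestrate("list 2025 browsing-agent benchmarks"))
    print(len(results))  # → 4
```

The design point the episode highlights lives in `orchestrate`: the parallelism is an external systems layer around the model, not a change to the transformer itself.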