The 15 papers in this episode are:
[00:20] ✨ Ovis2.5 Technical Report
[00:51] 🧠 ComoRAG: A Cognitive-Inspired Memory-Organized RAG for Stateful Long Narrative Reasoning
[01:14] 🎥 4DNeX: Feed-Forward 4D Generative Modeling Made Easy
[01:38] ✨ Next Visual Granularity Generation
[01:57] ⚡ Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
[02:30] 🤔 Has GPT-5 Achieved Spatial Intelligence? An Empirical Study
[03:00] 🎮 HeroBench: A Benchmark for Long-Horizon Planning and Structured Reasoning in Virtual Worlds
[03:26] ❗ When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs
[03:56] 🎮 Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model
[04:21] 💡 Lumen: Consistent Video Relighting and Harmonious Background Replacement with Video Generative Models
[04:47] 🌐 G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration
[05:15] ✨ S^2-Guidance: Stochastic Self Guidance for Training-Free Enhancement of Diffusion Models
[05:49] 👂 Representing Speech Through Autoregressive Prediction of Cochlear Tokens
[06:09] 💡 Inverse-LLaVA: Eliminating Alignment Pre-training Through Text-to-Vision Mapping
[06:40] 🎬 Precise Action-to-Video Generation Through Visual Action Prompts
[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu (小红书): AI速递
Information
- Show
- Frequency: Updated daily
- Published: 20 August 2025, 0:00 UTC
- Length: 8 minutes
- Rating: Clean