HuggingFace Daily AI Paper Digest

2025.10.09 | Ming-UniVision unifies the visual vocabulary; direct KV-Cache links let LLMs talk to each other instantly

The 15 papers in this episode:

[00:21] 🔄 Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer

[00:59] 🧠 Cache-to-Cache: Direct Semantic Communication Between Large Language Models

[01:32] 🌀 Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding

[02:07] 🧠 SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models

[03:06] 🤖 RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training

[04:02] 🎬 MATRIX: Mask Track Alignment for Interaction-aware Video Generation

[04:51] 🎯 Vibe Checker: Aligning Code Evaluation with Human Preference

[05:44] 🤖 Multi-Agent Tool-Integrated Policy Optimization

[06:24] 🧠 CALM Before the STORM: Unlocking Native Reasoning for Optimization Modeling

[06:59] ✂ OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot

[07:52] 🧠 Artificial Hippocampus Networks for Efficient Long-Context Modeling

[08:30] 🔍 Revisiting Long-context Modeling from Context Denoising Perspective

[09:11] 🧠 Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

[09:51] 💥 Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention

[10:37] ⚡ Native Hybrid Attention for Efficient Sequence Modeling

[Follow Us]

You can also find us on the platform below for more content beyond the podcast.

Xiaohongshu: AI速递