2025.06.17 | MiniMax-M1 improves reasoning performance; a new cognitive test for multimodal models.

HuggingFace Daily AI Paper Express

The 15 papers in this episode are:

[00:22] 💡 MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

[01:00] 🔬 Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning

[01:47] 🧐 DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

[02:28] 🧠 DoTA-RAG: Dynamic of Thought Aggregation RAG (a retrieval-augmented generation system for large-scale web knowledge indexing)

[03:08] 🧠 Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning

[03:52] 💡 Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

[04:28] 🤖 TaskCraft: Automated Generation of Agentic Tasks

[05:04] 🤯 Discrete Diffusion in Large Language and Multimodal Models: A Survey

[05:42] 🪞 Test3R: Learning to Reconstruct 3D at Test Time

[06:25] 🖼 VGR: Visual Grounded Reasoning

[07:06] 🤖 PersonaFeedback: A Large-scale Human-annotated Benchmark For Personalization

[07:50] 🤖 From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding

[08:32] 🤖 BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models

[09:11] 🧠 Language Surgery in Multilingual Large Language Models

[09:44] 🤖 AI Agent Behavioral Science

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content.

Xiaohongshu: AI速递
