[Weekend Special] Hottest AI Papers of the 4th Week of June | Efficiently Scaling Reasoning; A Multimodal Financial Evaluation Benchmark

HuggingFace Daily AI Paper Digest

The 5 papers in this episode are:

[00:36] TOP1 (🔥216) | 💡 MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

[02:44] TOP2 (🔥82) | 📊 MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation

[05:32] TOP3 (🔥64) | 🔬 Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning

[07:53] TOP4 (🔥53) | 🧐 DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

[09:39] TOP5 (🔥52) | 🤖 Scaling Test-time Compute for LLM Agents

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递
