2025.07.08 | MemOS improves memory-management efficiency; combining MLM and CLM improves encoder training.

HuggingFace Daily AI Paper Digest

The 15 papers in this episode are:

[00:21] 🧠 MemOS: A Memory OS for AI System

[01:07] 🤔 Should We Still Pretrain Encoders with Masked Language Modeling?

[01:43] 🎥 4DSloMo: 4D Reconstruction for High Speed Scene with Asynchronous Capture

[02:22] 🤖 DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge

[03:02] 🤖 Pre-Trained Policy Discriminators are General Reward Models

[03:38] 🧠 BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset

[04:23] 🤖 RoboBrain 2.0 Technical Report

[05:04] 🧩 Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents

[05:42] ✨ RefineX: Learning to Refine Pre-training Data at Scale from Expert-Guided Programs

[06:21] 🎬 StreamDiT: Real-Time Streaming Text-to-Video Generation

[07:04] 📜 Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration

[07:49] 💡 OmniDraft: A Cross-vocabulary, Online Adaptive Drafter for On-device Speculative Decoding

[08:35] 🎨 ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation

[09:16] 📊 On the rankability of visual embeddings

[09:59] 🖼 VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu (小红书): AI速递
