本期的 15 篇论文如下:
[00:21] 🧠 MemOS: A Memory OS for AI System(MemOS:面向人工智能系统的内存操作系统)
[01:07] 🤔 Should We Still Pretrain Encoders with Masked Language Modeling?(我们是否还应该使用掩码语言模型预训练编码器?)
[01:43] 🎥 4DSloMo: 4D Reconstruction for High Speed Scene with Asynchronous Capture(4DSloMo:基于异步捕获的高速场景4D重建)
[02:22] 🤖 DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge(DreamVLA:一个基于综合世界知识构想的视觉-语言-动作模型)
[03:02] 🤖 Pre-Trained Policy Discriminators are General Reward Models(预训练策略判别器是通用奖励模型)
[03:38] 🧠 BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset(BMMR:一个大规模双语多模态多学科推理数据集)
[04:23] 🤖 RoboBrain 2.0 Technical Report(RoboBrain 2.0 技术报告)
[05:04] 🧩 Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents(Easy Dataset:一个从非结构化文档中合成LLM微调数据的统一且可扩展的框架)
[05:42] ✨ RefineX: Learning to Refine Pre-training Data at Scale from Expert-Guided Programs(RefineX:通过专家指导的程序学习大规模优化预训练数据)
[06:21] 🎬 StreamDiT: Real-Time Streaming Text-to-Video Generation(StreamDiT:实时流式文本到视频生成)
[07:04] 📜 Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration(复兴文化遗产:一种全面的历史文献修复新方法)
[07:49] 💡 OmniDraft: A Cross-vocabulary, Online Adaptive Drafter for On-device Speculative Decoding(OmniDraft:一种用于端侧推测解码的跨词汇、在线自适应 Drafter)
[08:35] 🎨 ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation(ArtifactsBench:弥合LLM代码生成评估中的视觉交互鸿沟)
[09:16] 📊 On the rankability of visual embeddings(论视觉嵌入的可排序性)
[09:59] 🖼 VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents(VLM2Vec-V2:推进视频、图像和视觉文档的多模态嵌入)
【关注我们】
您还可以在以下平台找到我们,获得播客内容以外更多信息
小红书: AI速递
정보
- 프로그램
- 주기매일 업데이트
- 발행일2025년 7월 8일 오후 11:00 UTC
- 길이11분
- 등급전체 연령 사용가