The 15 papers in this episode are:
[00:22] 🌍 Sekai: A Video Dataset towards World Exploration
[01:02] 💡 ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs
[01:43] 💡 GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
[02:24] 🗣 BUT System for the MLC-SLM Challenge
[03:10] 🤖 Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence
[03:57] 💡 Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation
[04:43] 🔬 SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification
[05:26] 🚀 Truncated Proximal Policy Optimization
[06:04] 🖼 PictSure: Pretraining Embeddings Matters for In-Context Learning Image Classifiers
[06:37] 🖼 CoMemo: LVLMs Need Image Context with Image Memory
[07:21] 🤖 SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence
[08:01] 🧠 MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models
[08:45] 🛡 OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
[09:34] 🏞 ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies
[10:09] 🤝 FedNano: Toward Lightweight Federated Tuning for Pretrained Multimodal Large Language Models
[Follow us]
You can also find us on the following platforms for more information beyond the podcast itself.
Xiaohongshu (小红书): AI速递
Information
- Show
- Frequency: Updated daily
- Published: 19 June 2025 at 23:00 UTC
- Length: 11 min
- Rating: Clean