The 14 papers in this episode are:
[00:24] 🌍 OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
[01:12] 🤖 UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning
[01:51] 🏠 InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts
[02:27] 🖱 LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence
[02:58] 📊 Locality in Image Diffusion Models Emerges from Data Statistics
[03:29] 🤔 Measuring Epistemic Humility in Multimodal Large Language Models
[03:57] 🤖 Nav-R1: Reasoning and Navigation in Embodied Scenes
[04:25] 🔍 Lost in Embeddings: Information Loss in Vision-Language Models
[04:54] 🌐 CognitiveSky: Scalable Sentiment and Narrative Analysis for Decentralized Social Media
[05:19] 🔍 Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models
[05:57] 🧠 EthicsMH: A Pilot Benchmark for Ethical Reasoning in Mental Health AI
[06:30] ⚖ Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting
[07:16] 🧠 PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits
[07:52] 🔍 GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings
【Follow Us】
You can also find us on the following platform for more information beyond the podcast content:
Xiaohongshu (小红书): AI速递
Information
- Show
- Frequency: Updated daily
- Published: 16 September 2025 at 23:00 UTC
- Length: 9 min
- Rating: Clean