HuggingFace Daily AI Paper Digest

2025.09.16 | OmniWorld builds a 4D data foundation; UI-S1 trains GUI agents with semi-online RL

This episode covers the following 14 papers:

[00:24] 🌍 OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

[01:12] 🤖 UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

[01:51] 🏠 InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts

[02:27] 🖱 LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence

[02:58] 📊 Locality in Image Diffusion Models Emerges from Data Statistics

[03:29] 🤔 Measuring Epistemic Humility in Multimodal Large Language Models

[03:57] 🤖 Nav-R1: Reasoning and Navigation in Embodied Scenes

[04:25] 🔍 Lost in Embeddings: Information Loss in Vision-Language Models

[04:54] 🌐 CognitiveSky: Scalable Sentiment and Narrative Analysis for Decentralized Social Media

[05:19] 🔍 Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models

[05:57] 🧠 EthicsMH: A Pilot Benchmark for Ethical Reasoning in Mental Health AI

[06:30] ⚖ Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting

[07:16] 🧠 PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits

[07:52] 🔍 GAPrune: Gradient-Alignment Pruning for Domain-Aware Embeddings

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递