HuggingFace Daily AI Paper Digest

2025.07.30 | HunyuanWorld generates immersive 3D worlds from words or pixels; X-Omni uses reinforcement learning to improve image generation quality.

This episode covers the following 8 papers:

[00:23] 🌍 HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels

[00:56] ✨ X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again

[01:59] 🚀 CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning

[02:43] ✨ MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge

[03:32] 🐾 AnimalClue: Recognizing Animals by their Traces

[04:04] 🏃 MOVE: Motion-Guided Few-Shot Video Object Segmentation

[04:31] 🤥 MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions

[04:59] 🐘 Evaluating Deep Learning Models for African Wildlife Image Classification: From DenseNet to Vision Transformers

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递