HuggingFace Daily AI Paper Digest

2025.08.20 | Chain-of-Agents boosts efficiency; improved 3D reconstruction for long videos

The 15 papers in this episode:

[00:23] 🤖 Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

[00:52] 🎥 LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos

[01:13] 🛠 Prompt Orchestration Markup Language

[01:33] 🎨 MultiRef: Controllable Image Generation with Multiple Visual References

[02:00] 🤖 Evaluating Podcast Recommendations with Profile-Aware LLM-as-a-Judge

[02:29] 🦾 Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation

[02:59] ✅ Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation

[03:22] 🎨 Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer

[03:45] 🪄 OmniTry: Virtual Try-On Anything without Masks

[04:08] ⏰ A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models

[04:32] 👂 Advances in Speech Separation: Techniques, Challenges, and Future Trends

[05:04] 😥 Leveraging Large Language Models for Predictive Analysis of Human Misery

[05:27] ⏳ TempFlow-GRPO: When Timing Matters for GRPO in Flow Models

[05:58] 🗺 CAMAR: Continuous Actions Multi-Agent Routing

[06:25] 🔒 Copyright Protection for Large Language Models: A Survey of Methods, Challenges, and Trends

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast.

Xiaohongshu: AI速递