6 AUG
7 MIN

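2025.08.05 | Innovations in image text rendering and editing; contextual retrieval improves story comprehension

This episode covers the following 15 papers: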

[00:18] 🎨 Qwen-Image Technical Report

[00:39] 🔍 SitEmb-v1.5: Improved Context-Aware Dense Retrieval for Semantic Association and Long Story Comprehension

[01:08] 🧬 CellForge: Agentic Design of Virtual Cell Models

[01:36] 🧠 Beyond the Trade-off: Self-Supervised Reinforcement Learning for Reasoning Models' Instruction Following

[02:05] 🛡 Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report

[02:32] 🤖 InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation

[03:04] 🚀 VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

[03:31] ✂ A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models

[03:57] 🔒 Personalized Safety Alignment for Text-to-Image Diffusion Models

[04:16] 🌐 Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe

[04:46] 🧠 RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong Learning in Physical Embodied Systems

[05:10] 🎨 Artificial Intelligence and Misinformation in Art: Can Vision Language Models Judge the Hand or the Machine Behind the Canvas?

[05:47] 🔄 Exploitation Is All You Need... for Exploration

[06:15] 🔒 Cyber-Zero: Training Cybersecurity Agents without Runtime

[06:41] 🧠 AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content.

Xiaohongshu: AI速递

Show: HuggingFace 每日AI论文速递
Frequency: Updated daily
Published: 6 August 2025 at 00:00 UTC
Length: 7 min
Rating: Clean
