Daily Paper Cast

Jingwen Liang, Gengyu Wang

0.0 (0)
SCIENCE
UPDATED DAILY

We update every weekday to discuss highest-voted papers from Huggingface Daily Paper (https://huggingface.co/papers). Both the podcast scripts and audio are generated by AI. Feedback and suggestions are welcome! Email us: dailypapercast.ai@gmail.com Creator: Jingwen Liang, 3D ML, https://www.linkedin.com/in/jingwen-liang/ Gengyu Wang, LLM ML, http://wanggengyu.com Listen on: Spotify: https://open.spotify.com/show/21nrhmdaA8qoBiH8q03NXL Apple Podcast: https://podcasts.apple.com/us/podcast/daily-paper-cast/id1777620236 Cover Image by Kawen Kuang https://kawen.art

See All (1.8k)

Creator

Jingwen Liang, Gengyu Wang
Years Active

2024 - 2026
Episodes

1.8k
Rating

Clean
Show Website

Daily Paper Cast

Daily Paper Cast

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

SketchVLM: Vision language models can annotate images to explain thoughts and guide users

Video Analysis and Generation via a Semantic Progress Function

About

Information

Daily Paper Cast

Episodes

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

SketchVLM: Vision language models can annotate images to explain thoughts and guide users

Video Analysis and Generation via a Semantic Progress Function

About

Information