Daily Paper Cast

Jingwen Liang, Gengyu Wang

SCIENCE
UPDATED DAILY

We update every weekday to discuss the highest-voted papers from Hugging Face Daily Papers (https://huggingface.co/papers). Both the podcast scripts and the audio are generated by AI. Feedback and suggestions are welcome!

Email us: dailypapercast.ai@gmail.com

Creators:
Jingwen Liang, 3D ML, https://www.linkedin.com/in/jingwen-liang/
Gengyu Wang, LLM ML, http://wanggengyu.com

Listen on:
Spotify: https://open.spotify.com/show/21nrhmdaA8qoBiH8q03NXL
Apple Podcast: https://podcasts.apple.com/us/podcast/daily-paper-cast/id1777620236

Cover Image by Kawen Kuang, https://kawen.art


Creators: Jingwen Liang, Gengyu Wang
Years Active: 2024 - 2025
Episodes: 1.4K
Rating: All audiences
Show Website: Daily Paper Cast

Episodes

Don't Blind Your VLA: Aligning Visual Representations for OOD Generalization

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought

Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation

Generalizing Test-time Compute-optimal Scaling as an Optimizable Graph

The Underappreciated Power of Vision Models for Graph Structural Understanding

UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback

ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation
