Hugging Face Trending Papers

Code Coin Cognition LLC

TECHNOLOGY
DAILY

Stay ahead in AI with Hugging Face Trending Papers, your daily digest of trending AI research. Hosts break down the most talked-about papers in machine learning, LLMs, generative AI, and robotics in just a few minutes. Clear, conversational insights on problems, methods, benchmarks, and real-world impact, with no jargon overload. Perfect for researchers, engineers, students, and AI enthusiasts.

2 DAYS AGO

Episode 15: Real-Time AI: Video, Proactive LLMs & Text Structure

This episode explores groundbreaking AI research, featuring Helios, a real-time long video generation model; Proact-VL, a proactive VideoLLM for real-time AI companions; and T2S-Bench & Structure-of-Thought, a new benchmark and prompting technique for text-to-structure reasoning.

### Featured Papers

* **Helios: Real Real-Time Long Video Generation Model**
  * **Key Insight:** Helios is the first 14B video generation model capable of real-time (19.5 FPS) minute-scale video generation on a single H100 GPU, achieving high quality by addressing long-video drifting and optimizing for efficiency.
  * **Paper Link:** [https://arxiv.org/pdf/2603.04379.pdf](https://arxiv.org/pdf/2603.04379.pdf)
* **Proact-VL: A Proactive VideoLLM for Real-Time AI Companions**
  * **Key Insight:** Proact-VL introduces a framework for creating proactive, real-time interactive AI companions, particularly for gaming scenarios like commentators and guides, by enabling low-latency inference and autonomous decision-making.
  * **Paper Link:** [https://arxiv.org/pdf/2603.03447.pdf](https://arxiv.org/pdf/2603.03447.pdf)
* **T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning**
  * **Key Insight:** This work introduces Structure-of-Thought, a prompting technique that guides models to construct intermediate text structures, and T2S-Bench, the first benchmark designed to evaluate and improve models' text-to-structure reasoning capabilities.
  * **Paper Link:** [https://arxiv.org/pdf/2603.03790.pdf](https://arxiv.org/pdf/2603.03790.pdf)

10 min
2 DAYS AGO

Episode 14: Revolutionizing Deep Learning: The Rise of CUDA Agent and Agentic RL

# Hugging Face Trending Papers Episode Summary

In this episode, we discuss two trending papers, "Large-Scale Agentic RL for High-Performance CUDA Kernel Generation" and "Language-Agnostic SWE Task Collection at Scale". The first paper presents CUDA Agent, a large-scale reinforcement learning system that generates high-performance CUDA kernels for deep learning, and the second introduces SWE-rebench V2, a language-agnostic, automated pipeline for collecting real-world software engineering tasks for training software engineering agents.

## Papers Discussed

- "Large-Scale Agentic RL for High-Performance CUDA Kernel Generation" introduces CUDA Agent, a system that improves GPU kernel optimization for deep learning using scalable data synthesis, skill-augmented CUDA development, and reinforcement learning techniques. The system achieves state-of-the-art results on KernelBench. [Read the paper](https://arxiv.org/pdf/2602.24286)
- "Language-Agnostic SWE Task Collection at Scale" presents SWE-rebench V2, an automated pipeline for collecting real-world software engineering tasks and constructing reinforcement learning training environments at scale. The pipeline has constructed a dataset of 32,000+ tasks spanning 20 languages and 3,600+ repositories. [Read the paper](https://arxiv.org/pdf/2602.23866)

## Additional Links

- Project page for CUDA Agent: [https://cuda-agent.github.io/](https://cuda-agent.github.io/)

Remember to follow or subscribe for the latest in AI research, and stay curious!

4 min
21/11/2025

Episode 13: Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

**Source:** huggingface_daily

**URL:** https://huggingface.co/papers/2511.14993

**Key Points:**

- Problem: The research addresses the challenges in high-resolution image and video generation, particularly the scalability and computational complexity associa...
- Method: The authors introduce Kandinsky 5.0, a family of foundation models comprising three core variants: Kandinsky 5.0 Image Lite, Kandinsky 5.0 Video Lite,...
- Results: Kandinsky 5.0 achieves state-of-the-art performance in high-resolution image and 10-second video synthesis, demonstrating superior generation quality ...
- Implications: Kandinsky 5.0 has significant implications for the research community by providing an open-source framework that advances the accessibility and develo...

2 min
19/11/2025

Episode 12: Exploring Next-Gen AI: Interactive Scaling & Video-Based Reasoning

# Episode Summary

In this episode of Hugging Face Trending Papers, we delve into the latest AI research with three top trending papers from arXiv. We explore MiroThinker's interaction scaling for open-source research agents, the new paradigm of "Thinking with Video" for multimodal reasoning, and Lumine's approach to building generalist AI agents for 3D open-world environments.

# Mentioned Papers

1. ["MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling"](https://arxiv.org/pdf/2511.11793) - This paper presents MiroThinker, an open-source research agent that improves tool-augmented reasoning and information-seeking capabilities by focusing on efficient interaction scaling.
2. ["Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm"](https://arxiv.org/pdf/2511.04570) - The authors propose "Thinking with Video," a new paradigm that uses video generation models to bridge visual and textual reasoning, overcoming limitations of current "Thinking with Text" and "Thinking with Images" paradigms.
3. ["Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds"](https://arxiv.org/pdf/2511.08892) - Lumine introduces a recipe for developing AI agents capable of completing complex missions in 3D open-world environments, demonstrating strong zero-shot cross-game generalization.

4 min
02/11/2025

Episode 11: Unlocking AI Reasoning: Breakthroughs in Looped Language Models

Papers discussed:

1. [Scaling Latent Reasoning via Looped Language Models](https://arxiv.org/pdf/2510.25741): This paper introduces Ouro, a family of pre-trained looped language models that improve reasoning capabilities by integrating reasoning into the pre-training phase. The paper attributes the models' strong performance to enhanced knowledge manipulation capabilities.
2. [Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations](https://arxiv.org/pdf/2510.23607): The Concerto model combines 2D and 3D learning for improved spatial cognition in AI. This integration, which pairs 3D intra-modal self-distillation with 2D-3D cross-modal joint embedding, has yielded promising results in 3D scene perception and set new benchmarks in scene understanding.
3. [RECODE: Unify Plan and Action for Universal Granularity Control](https://arxiv.org/pdf/2510.23564): RECODE is a new paradigm that unifies planning and action within a single code representation, facilitating dynamic control of decision granularity. This approach has proven effective in enhancing inference performance and training data efficiency.

5 min
22/10/2025

Episode 10: AI's New Brain: LLM Reasoning, Memory, Agents

**Episode Summary:**

This episode dives into cutting-edge advancements for Large Language Models, covering new methods to enhance reasoning reliability and efficiency, and introducing lightweight memory systems for more effective long-term interaction.

**Featured Papers:**

* **A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning**
  * *Key Insight:* Introduces RPC, a novel method that theoretically and empirically improves LLM reasoning by combining self-consistency and perplexity, achieving exponential error convergence and reducing sampling costs by 50%.
  * *Link:* https://arxiv.org/pdf/2510.15444
* **LightMem: Lightweight and Efficient Memory-Augmented Generation**
  * *Key Insight:* Presents LightMem, a human-memory-inspired system that enables LLMs to leverage historical interactions efficiently, significantly reducing token usage, API calls, and runtime while boosting accuracy.
  * *Link:* https://arxiv.org/pdf/2510.18866
* **DeepAnalyze: Agentic Large Language Models for Autonomous Data Science**
  * *Key Insight:* Introduces an agentic LLM framework for autonomous data science, automating the entire process from raw data to analyst-grade research reports using multi-agent collaboration and feedback reasoning.
  * *Link:* https://arxiv.org/pdf/2510.16872
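
For listeners unfamiliar with the two signals the RPC summary names, the sketch below contrasts plain self-consistency (a majority vote over sampled answers) with a vote weighted by each sample's model probability (the inverse of its perplexity). It is a generic illustration only, not the RPC algorithm from the paper; the function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def majority_vote(answers):
    """Plain self-consistency: return the most frequent final answer."""
    counts = defaultdict(int)
    for answer in answers:
        counts[answer] += 1
    return max(counts, key=counts.get)


def probability_weighted_vote(samples):
    """Weight each vote by exp(mean token log-prob), i.e. 1 / perplexity.

    `samples` is a list of (answer, mean_token_logprob) pairs.
    Illustrative assumption: this is NOT the paper's RPC method.
    """
    weights = defaultdict(float)
    for answer, mean_logprob in samples:
        weights[answer] += math.exp(mean_logprob)
    return max(weights, key=weights.get)


# Hypothetical samples: three low-confidence votes for "41",
# two high-confidence votes for "42".
samples = [("42", -0.2), ("41", -1.5), ("42", -0.3), ("41", -1.2), ("41", -1.4)]

by_count = majority_vote([answer for answer, _ in samples])
by_probability = probability_weighted_vote(samples)
```

With these made-up samples the count-based vote and the probability-weighted vote disagree, which is the kind of case where combining both signals could matter.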

4 min
10/10/2025

Episode 9: Boosting AI Problem Solving: Tiny Networks and Early Experience Learning

In this episode of Hugging Face Trending Papers, we discuss three exciting AI research papers: "Less is More: Recursive Reasoning with Tiny Networks", "Agent Learning via Early Experience", and "Paper2Video: Automatic Video Generation from Scientific Papers".

## Papers Discussed

1. **[Less is More: Recursive Reasoning with Tiny Networks](https://arxiv.org/pdf/2510.04871)**: This paper introduces a Tiny Recursive Model that significantly improves accuracy on hard question-answer problems, using a simpler recursive reasoning approach and beating Large Language Models on complex tasks.
2. **[Agent Learning via Early Experience](https://arxiv.org/pdf/2510.08558)**: This research paper presents a new paradigm called "early experience", where AI agents learn from their own actions. The approach improved effectiveness and out-of-domain generalization in diverse environments.
3. **[Paper2Video: Automatic Video Generation from Scientific Papers](https://arxiv.org/pdf/2510.05096)**: This paper presents Paper2Video, a multi-agent framework designed to automate the labor-intensive process of generating academic presentation videos from scientific papers.

## Episode Links

- [Paper 1: Less is More: Recursive Reasoning with Tiny Networks](https://arxiv.org/pdf/2510.04871)
- [Paper 2: Agent Learning via Early Experience](https://arxiv.org/pdf/2510.08558)
- [Paper 3: Paper2Video: Automatic Video Generation from Scientific Papers](https://arxiv.org/pdf/2510.05096)

5 min
03/10/2025

Episode 8: Boosting AI Efficiency: Code Compression, Video Generation, and Experience-based Reasoning

In this episode, we discuss three trending AI research papers. We delve into the challenges and solutions related to code language models, video generation, and reinforcement learning.

### Key Points Discussed

#### LongCodeZip: Compress Long Context for Code Language Models

- LongCodeZip is a novel framework for compressing code for Large Language Models (LLMs)
- It addresses the issue of high API costs and generation latency associated with processing long inputs in codebases
- The framework uses a dual-stage compression strategy, enabling it to preserve essential information while reducing context size
- Evaluations show that LongCodeZip consistently outperforms baseline methods
- This research could improve the efficiency and capability of code intelligence applications

#### Self-Forcing++: Towards Minute-Scale High-Quality Video Generation

- The paper addresses the computational cost of generating long videos with diffusion models
- It proposes an approach that uses teacher models to guide student models through sampled segments from self-generated long videos
- This method allows for video length scaling up to 20× beyond the teacher's capability
- The authors manage to generate videos up to 4 minutes and 15 seconds long, substantially outperforming baseline methods

#### EXGRPO: Learning to Reason from Experience

- The paper investigates what makes a reasoning experience valuable in the context of Reinforcement Learning from Verifiable Rewards (RLVR)
- The authors propose a framework that organizes and prioritizes valuable experiences
- The approach aims to balance exploration with experience exploitation for efficient and scalable RLVR

### Links to Papers

- [LongCodeZip: Compress Long Context for Code Language Models](https://arxiv.org/pdf/2510.00446)
- [Self-Forcing++: Towards Minute-Scale High-Quality Video Generation](https://arxiv.org/pdf/2510.02283)
- [EXGRPO: Learning to Reason from Experience](https://arxiv.org/pdf/2510.02245)

4 min


Creator

Code Coin Cognition LLC
Years active

2025 - 2026
Episodes

15
Rating

Clean