AI: post transformers

mcgrof

The transformer architecture revolutionized the world of neural networks and was a springboard for what we know today as modern artificial intelligence. This podcast reviews state-of-the-art research papers, starting from the transformer and moving forward.

  1. Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning

    3 days ago


    The 2021 Google Research, Brain Team paper "Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning" introduces Policy Similarity Embeddings (PSEs), a novel framework designed to help reinforcement learning (RL) agents apply their skills to unfamiliar tasks. Traditional methods often struggle with **generalization**, failing when minor visual changes occur in semantically identical environments. To address this, the researchers developed the **Policy Similarity Metric (PSM)**, which identifies states as equivalent if they require the same **optimal actions** both now and in the future. Using **contrastive metric embeddings**, the system trains neural networks to group these behaviorally similar states together in a shared representation space. Experimental results on **jumping tasks** and complex control suites demonstrate that this approach significantly outperforms standard **data augmentation** and regularization techniques. Ultimately, the work shows that focusing on **sequential behavioral patterns** rather than raw visual data allows agents to adapt much more effectively to new challenges. Source: September 29, 2021, "Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning," Google Research, Brain Team; Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare. https://arxiv.org/pdf/2101.05265

    13 minutes
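The contrastive grouping the episode describes can be sketched with a simplified loss. This is a toy illustration, not the paper's exact objective: the function names, the `gamma` similarity matrix, and the `tau` temperature are assumptions for the sketch.

```python
import numpy as np

def cosine_sim(a, b):
    # Pairwise cosine similarity between rows of a and rows of b.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def contrastive_metric_loss(z_x, z_y, gamma, tau=0.1):
    """Simplified contrastive embedding loss over two environments.

    z_x, z_y : (n, d) state embeddings from two environments.
    gamma    : (n, n) behavioral-similarity matrix in [0, 1]; the
               positive partner for state i is its most similar state j.
    """
    sim = cosine_sim(z_x, z_y) / tau
    pos = gamma.argmax(axis=1)                  # best-matched partner state
    log_denom = np.log(np.exp(sim).sum(axis=1)) # contrast against all states
    return float(np.mean(log_denom - sim[np.arange(len(z_x)), pos]))
```

In the paper's framing, `gamma` would come from the Policy Similarity Metric: states whose optimal action sequences match get values near 1, so the loss pulls their embeddings together while pushing behaviorally dissimilar states apart.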
  2. Experiential Reinforcement Learning: Internalizing Reflection for Better Policy Training

    3 days ago


    The research, published on February 15, 2026 as a joint collaboration between the University of Southern California, Microsoft, and the University of Pennsylvania, introduces **Experiential Reinforcement Learning (ERL)**, a novel training framework designed to help language models learn from their own interactions more effectively than standard reinforcement learning. Unlike traditional methods that rely solely on numerical rewards, ERL enables agents to **verbally reflect** on their failures and successes within each training episode. This process involves a **cycle of experience, reflection, and consolidation**, in which the model uses a cross-episode memory to store effective corrective patterns. To ensure these improvements persist without requiring reflection during actual use, the system uses **selective distillation** to internalize successful behaviors directly into the base policy. Experimental results across **agentic reasoning tasks** such as Sokoban and FrozenLake show that ERL significantly boosts learning efficiency and final performance. Ultimately, the framework demonstrates that **structured self-critique** transforms sparse environment feedback into durable, high-quality behavioral changes. Source: February 2026, "Experiential Reinforcement Learning," University of Southern California, Microsoft, University of Pennsylvania; Taiwei Shi, Sihao Chen, Bowen Jiang, Linxin Song, Longqi Yang, Jieyu Zhao. https://arxiv.org/pdf/2602.13949

    14 minutes
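The experience-reflection-consolidation cycle can be illustrated with a toy sketch. Every name below is hypothetical, not the paper's API; real ERL reflects verbally via a language model and distills via fine-tuning, whereas this sketch uses a counter and a dict update to show the control flow only.

```python
from collections import Counter

class ERLAgentSketch:
    """Toy sketch of the experience -> reflection -> consolidation cycle.

    The 'policy' is a plain dict from state to action; 'reflect' turns a
    failure into a corrective (state, better_action) pattern held in
    cross-episode memory; 'consolidate' distills patterns that recur at
    least `threshold` times into the base policy, so inference no longer
    needs reflection.
    """
    def __init__(self, threshold=2):
        self.policy = {}            # base policy: state -> action
        self.memory = Counter()     # cross-episode memory of corrections
        self.threshold = threshold

    def act(self, state, default="noop"):
        return self.policy.get(state, default)

    def reflect(self, state, failed_action, better_action):
        # Reflection: record a corrective pattern for this failure.
        self.memory[(state, better_action)] += 1

    def consolidate(self):
        # Selective distillation: internalize only recurring corrections.
        for (state, action), count in self.memory.items():
            if count >= self.threshold:
                self.policy[state] = action
```

After the same correction recurs twice, `consolidate()` bakes it into the policy, mirroring how selective distillation makes improvements persist without reflection at deployment time.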
  3. Agentic Plan Caching: Fast and Cost-Efficient LLM Memory

    5 days ago


    Agentic Plan Caching (APC), described in the paper published by Stanford researchers on January 26, 2026, lets AI agents reuse structured plan templates from prior executions instead of re-invoking expensive LLMs for every new task, achieving a 76% cost reduction on benchmarks. Drawing on several industry sources, we project growth with a simple model: Plans/year = ActiveAgents x PlansPerDay x 365, and Storage = Plans x BytesPerPlan. MarketsandMarkets forecasts the AI agent market growing from $7.8B to $52.6B by 2030 at a 46% CAGR. IDC projects 1.3 billion deployed AI agents by 2028. Gartner says 33% of enterprise software will be agentic by 2028, up from under 1% in 2024, with 15% of daily work decisions made autonomously. Together these forecasts imply that agent-driven plan generation will scale explosively: at just 5-20 plans per agent per day, 1.3 billion agents means 2.4 to 9.5 trillion plans per year across the ecosystem by 2028. We use all this data to evaluate a possible storage-offloading tipping point. Raw plan text is cheap at 2-10 KB each, but production systems also store retrieval embeddings, keyword indexes, tool-call traces, and trajectory logs, inflating effective bytes per plan by 10-100x. That is the silent killer. Under conservative assumptions (1M agents, 30% YoY growth, lean plans), everything fits in RAM for years. Under aggressive assumptions (100M agents, 80% YoY, rich metadata), SSD offload becomes structurally inevitable in year one: you simply cannot fit petabytes of cached plans in RAM. The paradox is that APC's own success makes hoarding worse: every cached plan that saves a 50-cent LLM call is a plan you never want to delete. The better caching works, the faster storage pressure grows, and NVMe/SSD tiers stop being optional and start being load-bearing infrastructure. RAM is not a trash can with a power button.
    Sources:
    1. October 2025, "AI Agents: Technologies, Applications and Global Markets," BCC Research, Austin Samuel. https://www.bccresearch.com/market-research/artificial-intelligence-technology/ai-agent-market.html
    2. March 26, 2025 (updated November 6, 2025), "26 AI Agent Statistics (Adoption Trends and Business Impact)," Datagrid, Datagrid Team.
    3. January 26, 2026, "Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents," Stanford University; Qizheng Zhang, Michael Wornow, Gerry Wan, Kunle Olukotun. https://arxiv.org/abs/2506.14852
    4. 2026, "AI Agents Market Size And Share | Industry Report, 2033," Grand View Research.
    5. April 2025, "AI Agents Market Size, Share & Trends | Growth Analysis, Forecast," MarketsandMarkets.
    6. August 26, 2025 (updated September 5, 2025), "Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026, Up from Less Than 5% in 2025," Gartner.
    7. June 25, 2025, "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027," Gartner.
    8. May 30, 2025, "Getting to one billion agents," Perspectives on Power Platform, Jukka Niiranen.
    9. May 20, 2025, "Microsoft expects 1.3 billion AI agents to be in operation by 2028 – here's how it plans to get them working together," IT Pro, Bobby Hellard.
    10. 2026, "Top Strategic Technology Trends for 2026," Gartner, Gene Alvarez, Tori Paulman.
    11. October 28, 2025, "What 1.3 billion AI Agents by 2028 Means for Business Leaders," Lantern.

    12 minutes
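The episode's growth model is simple enough to check directly. The 4 KB plan size and 100x metadata inflation below are illustrative picks from the stated 2-10 KB and 10-100x ranges, not figures from the paper.

```python
def plans_per_year(active_agents, plans_per_day):
    # Plans/year = ActiveAgents x PlansPerDay x 365
    return active_agents * plans_per_day * 365

def storage_bytes(plans, bytes_per_plan):
    # Storage = Plans x BytesPerPlan
    return plans * bytes_per_plan

# The projected 1.3 billion agents at 5-20 plans per agent per day:
low  = plans_per_year(1.3e9, 5)    # ~2.4e12 plans/year
high = plans_per_year(1.3e9, 20)   # ~9.5e12 plans/year

# Storage at an assumed 4 KB per plan, then with 100x inflation from
# embeddings, indexes, tool-call traces, and trajectory logs:
lean = storage_bytes(low, 4 * 1024)          # petabyte scale, raw text only
rich = storage_bytes(high, 4 * 1024 * 100)   # exabyte scale with metadata
```

Even the lean case lands in petabytes per year across the ecosystem, which is why the episode argues SSD/NVMe tiers become load-bearing rather than optional under the aggressive assumptions.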
