AI Post Transformers

mcgrof

AI-generated podcast where hosts Hal Turing and Dr. Ada Shannon discuss the latest research papers and reports in machine learning, AI systems, and optimization. Featuring honest critical analysis, proper citations, and nerdy humor.

  1. 1d ago

    Can LLMs Enable Mainstream Formal Verification?

    This episode explores whether large language models can help mainstream developers write code that is not just plausible, but formally verified in systems like Dafny, Nagini, and Verus. It explains the core ideas behind machine-checked correctness, including contracts, SMT solvers, loop invariants, and why verification is far stricter than passing tests or matching statistical behavior. The discussion highlights the paper’s main argument that the real bottleneck is not only generating implementations, but also generating the formal scaffolding and annotations that proofs require, especially around loops. Listeners get a clear view of how the authors evaluate LLMs with verifier feedback and extra validation to catch weakened specifications, making the episode interesting for anyone curious about whether AI can close the trust gap in code generation. Sources: 1. Can LLMs Enable Verification in Mainstream Programming? — Aleksandr Shefer, Igor Engel, Stanislav Alekseev, Daniil Berezun, Ekaterina Verbitskaia, Anton Podkopaev, 2025 http://arxiv.org/abs/2503.14183 2. Inferring Loop Invariants using Postconditions — Carlo A. Furia, Bertrand Meyer, 2009 https://scholar.google.com/scholar?q=Inferring+Loop+Invariants+using+Postconditions 3. Inferring Loop Invariants by Mutation, Dynamic Analysis, and Static Checking — Juan P. Galeotti, Carlo A. Furia, Eva May, Gordon Fraser, Andreas Zeller, 2014 https://scholar.google.com/scholar?q=Inferring+Loop+Invariants+by+Mutation,+Dynamic+Analysis,+and+Static+Checking 4. LoopInvGen: A Loop Invariant Generator based on Precondition Inference — Saswat Padhi, Rahul Sharma, Todd Millstein, 2017 https://scholar.google.com/scholar?q=LoopInvGen:+A+Loop+Invariant+Generator+based+on+Precondition+Inference 5. On Scaling Data-Driven Loop Invariant Inference — Sahil Bhatia, Saswat Padhi, Nagarajan Natarajan, Rahul Sharma, Prateek Jain, 2019 https://scholar.google.com/scholar?q=On+Scaling+Data-Driven+Loop+Invariant+Inference 6. 
Dafny: An Automatic Program Verifier for Functional Correctness — K. Rustan M. Leino, 2010 https://scholar.google.com/scholar?q=Dafny:+An+Automatic+Program+Verifier+for+Functional+Correctness 7. Nagini: A Static Verifier for Python — Marco Eilers, Peter Muller, 2018 https://scholar.google.com/scholar?q=Nagini:+A+Static+Verifier+for+Python 8. Verus: Verifying Rust Programs using Linear Ghost Types (extended version) — Andrea Lattuada, Travis Hance, Chanhee Cho, Matthias Brun, Isitha Subasinghe, Yi Zhou, Jon Howell, Bryan Parno, Chris Hawblitzel, 2023 https://scholar.google.com/scholar?q=Verus:+Verifying+Rust+Programs+using+Linear+Ghost+Types+(extended+version) 9. Can LLMs Enable Verification in Mainstream Programming? — Aleksandr Shefer, Igor Engel, Stanislav Alekseev, Daniil Berezun, Ekaterina Verbitskaia, Anton Podkopaev, 2025 https://scholar.google.com/scholar?q=Can+LLMs+Enable+Verification+in+Mainstream+Programming? 10. Clover: Closed-loop verifiable code generation — Chuyue Sun, Ying Sheng, Oded Padon, Clark Barrett, 2024 https://scholar.google.com/scholar?q=Clover:+Closed-loop+verifiable+code+generation 11. Alphaverus: Bootstrapping formally verified code generation through self-improving translation and treefinement — Pranjal Aggarwal, Bryan Parno, Sean Welleck, 2024 https://scholar.google.com/scholar?q=Alphaverus:+Bootstrapping+formally+verified+code+generation+through+self-improving+translation+and+treefinement 12. Can large language models transform natural language intent into formal method postconditions? — Madeline Endres, Sarah Fakhoury, Saikat Chakraborty, Shuvendu K. Lahiri, 2024 https://scholar.google.com/scholar?q=Can+large+language+models+transform+natural+language+intent+into+formal+method+postconditions? 13. 
Laurel: Generating Dafny assertions using large language models — Eric Mugnier, Emmanuel Anaya Gonzalez, Ranjit Jhala, Nadia Polikarpova, Yuanyuan Zhou, 2024 https://scholar.google.com/scholar?q=Laurel:+Generating+Dafny+assertions+using+large+language+models 14. Finding Inductive Loop Invariants using Large Language Models — Adharsh Kamath, Aditya Senthilnathan, Saikat Chakraborty, Pantazis Deligiannis, Shuvendu K. Lahiri, Akash Lal, Aseem Rastogi, Subhajit Roy, Rahul Sharma, 2023 https://scholar.google.com/scholar?q=Finding+Inductive+Loop+Invariants+using+Large+Language+Models 15. Towards Formal Verification of LLM-Generated Code from Natural Language Prompts — Aaron Councilman et al., 2025 https://arxiv.org/abs/2507.13290 16. Enchanting Program Specification Synthesis by Large Language Models using Static Analysis and Program Verification — Cheng Wen et al., 2024 https://arxiv.org/abs/2404.00762 17. SpecGen: Automated Generation of Formal Program Specifications via Large Language Models — Lezhi Ma et al., 2024 https://arxiv.org/abs/2401.08807 18. Beyond Postconditions: Can Large Language Models infer Formal Contracts for Automatic Software Verification? — Cedric Richter and Heike Wehrheim, 2025 https://arxiv.org/abs/2510.12702 19. Guiding LLM-based Loop Invariant Synthesis via Feedback on Local Reasoning Errors — Tianchi Li et al., 2026 https://arxiv.org/abs/2605.17914 20. Loop Invariant Generation: A Hybrid Framework of Reasoning optimised LLMs and SMT Solvers — Varun Bharti et al., 2025 https://arxiv.org/abs/2508.00419 21. LLM For Loop Invariant Generation and Fixing: How Far Are We? — Mostafijur Rahman Akhond et al., 2025 https://arxiv.org/abs/2511.06552 22. AI Post Transformers: From Natural Language to Verified Dafny Code — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-14-from-natural-language-to-verified-dafny-8abed9.mp3 23. 
AI Post Transformers: Generative File Systems: Replacing Code with Formal Specifications — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-03-18-generative-file-systems-replacing-code-w-414029.mp3 24. AI Post Transformers: Trajectory Summaries for Long-Horizon Coding Agents — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-24-trajectory-summaries-for-long-horizon-co-0194be.mp3

  2. 1d ago

    From Natural Language to Verified Dafny Code

    This episode explores a 2026 study on turning long natural-language programming problems into Dafny code that can be formally verified, asking whether AI systems can produce code that is not just fluent but provably correct. It explains how Dafny uses preconditions, postconditions, loop invariants, and proof obligations, and why weak specifications can lead to vacuous “verified” programs that still fail to capture the real task. The discussion highlights the paper’s NL2VC-60 benchmark of hand-written verified solutions to UVa-style algorithm problems, along with experiments comparing plain prompting, signature-guided prompting, and self-healing loops that revise code using verifier feedback and additional uDebug testing. Listeners would find it interesting because it gets at the core trust problem in AI coding: whether formal methods can make generated software more reliable, and where the real bottleneck remains the human effort required to write strong specifications. Sources: 1. From Natural Language to Verified Code: Toward AI Assisted Problem-to-Code Generation with Dafny-Based Formal Verification — Md Erfan, Md Kamal Hossain Chowdhury, Ahmed Ryan, Md Rayhanur Rahman, 2026 http://arxiv.org/abs/2604.22601 2. Dafny: An Automatic Program Verifier for Functional Correctness — K. Rustan M. Leino, 2010 https://scholar.google.com/scholar?q=Dafny:+An+Automatic+Program+Verifier+for+Functional+Correctness 3. seL4: Formal Verification of an Operating-System Kernel — Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andronick, David Cock, et al., 2009 https://scholar.google.com/scholar?q=seL4:+Formal+Verification+of+an+Operating-System+Kernel 4. Formal verification of a realistic compiler — Xavier Leroy, 2009 https://scholar.google.com/scholar?q=Formal+verification+of+a+realistic+compiler 5. 
Modularity, Code Specialization, and Zero-Cost Abstractions for Program Verification — Son Ho, Aymeric Fromherz, Jonathan Protzenko, 2021 https://scholar.google.com/scholar?q=Modularity,+Code+Specialization,+and+Zero-Cost+Abstractions+for+Program+Verification 6. Towards AI-Assisted Synthesis of Verified Dafny Methods — Md Rakib Hossain Misu, Cristina V. Lopes, Iris Ma, James Noble, 2024 https://scholar.google.com/scholar?q=Towards+AI-Assisted+Synthesis+of+Verified+Dafny+Methods 7. DafnyBench: A Benchmark for Formal Software Verification — Chloe Loughridge et al., 2024 https://scholar.google.com/scholar?q=DafnyBench:+A+Benchmark+for+Formal+Software+Verification 8. Can LLMs Enable Verification in Mainstream Programming? — Aleksandr Shefer, Igor Engel, Stanislav Alekseev, Daniil Berezun, Ekaterina Verbitskaia, Anton Podkopaev, 2025 https://scholar.google.com/scholar?q=Can+LLMs+Enable+Verification+in+Mainstream+Programming? 9. Dafny as Verification-Aware Intermediate Language for Code Generation — Yue Chen Li, Stefan Zetzsche, Siva Somayyajula, 2025 https://scholar.google.com/scholar?q=Dafny+as+Verification-Aware+Intermediate+Language+for+Code+Generation 10. ATLAS: Automated Toolkit for Large-Scale Verified Code Synthesis — Mantas Baksys et al., 2025 https://scholar.google.com/scholar?q=ATLAS:+Automated+Toolkit+for+Large-Scale+Verified+Code+Synthesis 11. DafnyPro: LLM-Assisted Automated Verification for Dafny Programs — Debangshu Banerjee, Olivier Bouissou, Stefan Zetzsche, 2026 https://scholar.google.com/scholar?q=DafnyPro:+LLM-Assisted+Automated+Verification+for+Dafny+Programs 12. Neuro Symbolic Reasoning for Planning: Counterexample Guided Inductive Synthesis using Large Language Models and Satisfiability Solving — Sumit Kumar Jha et al., 2023 https://scholar.google.com/scholar?q=Neuro+Symbolic+Reasoning+for+Planning:+Counterexample+Guided+Inductive+Synthesis+using+Large+Language+Models+and+Satisfiability+Solving 13. 
Property-Guided LLM Program Synthesis for Planning — Andre G. Pereira, Augusto B. Correa, Jendrik Seipp, 2026 https://scholar.google.com/scholar?q=Property-Guided+LLM+Program+Synthesis+for+Planning 14. Finding Inductive Loop Invariants using Large Language Models — Adharsh Kamath et al., 2023 https://scholar.google.com/scholar?q=Finding+Inductive+Loop+Invariants+using+Large+Language+Models 15. LLM For Loop Invariant Generation and Fixing: How Far Are We? — Mostafijur Rahman Akhond, Saikat Chakraborty, Gias Uddin, 2025 https://scholar.google.com/scholar?q=LLM+For+Loop+Invariant+Generation+and+Fixing:+How+Far+Are+We? 16. Type-Constrained Code Generation with Language Models — Niels Mundler et al., 2025 https://scholar.google.com/scholar?q=Type-Constrained+Code+Generation+with+Language+Models 17. Invariant-based Program Repair — Omar I. Al-Bataineh, 2024 https://scholar.google.com/scholar?q=Invariant-based+Program+Repair 18. AI Post Transformers: Program Synthesis with Large Language Models — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-20-program-synthesis-with-large-language-mo-b962ec.mp3 19. AI Post Transformers: SGLang for Faster Structured LLM Programs — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-06-sglang-for-faster-structured-llm-program-c59f1c.mp3 20. AI Post Transformers: SkillsBench for Evaluating Agent Skills — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-14-skillsbench-for-evaluating-agent-skills-58bb1e.mp3

  3. 1d ago

    KV Binding Is Secretly Linear Attention

    This episode explores the paper Test-Time Training with KV Binding Is Secretly Linear Attention and asks whether KV-binding test-time training is really doing online memorization or instead behaving like learned linear attention with sequence-specific fast weights. It explains how this approach differs from a standard transformer KV cache, situates it within earlier test-time-training work on expressive hidden states, and connects it to the broader push for long-context models that avoid quadratic softmax attention costs. The discussion highlights several findings that weaken the retrieval-style memory story: converged models show a query-key mismatch, replacing queries with keys barely changes aggregate performance, stronger inner-loop optimization does not reliably help, and even switching from descent to ascent can still work. Listeners would find it interesting because the episode reframes a flashy mechanism in simpler algebraic terms, clarifies which equivalence claims are exact versus empirical, and shows how that shift could change how researchers think about memory and efficiency in next-generation sequence models. Sources: 1. Test-Time Training with KV Binding Is Secretly Linear Attention — Junchen Liu, Sven Elflein, Or Litany, Zan Gojcic, Ruilong Li, 2026 http://arxiv.org/abs/2602.21204 2. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention — Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret, 2020 https://arxiv.org/abs/2006.16236 3. Linear Transformers Are Secretly Fast Weight Programmers — Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber, 2021 https://arxiv.org/abs/2102.11174 4. Learning to (Learn at Test Time): RNNs with Expressive Hidden States — Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Xinlei Chen, Tatsunori Hashimoto, Carlos Guestrin, et al., 2024 https://arxiv.org/abs/2407.04620 5. 
Test-Time Training with KV Binding Is Secretly Linear Attention — Junchen Liu, Sven Elflein, Or Litany, Zan Gojcic, Ruilong Li, 2026 https://arxiv.org/abs/2602.21204 6. Titans: Learning to Memorize at Test Time — Ali Behrouz, Peilin Zhong, Vahab Mirrokni, 2024 https://scholar.google.com/scholar?q=Titans:+Learning+to+Memorize+at+Test+Time 7. End-to-End Test-Time Training for Long Context — Arnuv Tandon et al., 2025 https://scholar.google.com/scholar?q=End-to-End+Test-Time+Training+for+Long+Context 8. Understanding Factual Recall in Transformers via Associative Memories — Eshaan Nichani, Jason D. Lee, Alberto Bietti, 2024 https://arxiv.org/abs/2412.06538 9. Quantifying Logical Consistency in Transformers via Query-Key Alignment — Eduard Tulchinskii, Anastasia Voznyuk, Laida Kushnareva, Andrei Andriiainen, Irina Piontkovskaya, Evgeny Burnaev, Serguei Barannikov, 2025 https://arxiv.org/abs/2502.17017 10. Dissecting Query-Key Interaction in Vision Transformers — Xu Pan, Aaron Philip, Ziqian Xie, Odelia Schwartz, 2024 https://arxiv.org/abs/2405.14880 11. Improved Test-Time Adaptation for Domain Generalization — Liang Chen, Yong Zhang, Yibing Song, Ying Shan, Lingqiao Liu, 2023 https://arxiv.org/abs/2304.04494 12. AdaShadow: Responsive Test-time Model Adaptation in Non-stationary Mobile Environments — Cheng Fang, Sicong Liu, Zimu Zhou, Bin Guo, Jiaqi Tang, Ke Ma, Zhiwen Yu, 2024 https://arxiv.org/abs/2410.08256 13. Beyond Model Adaptation at Test Time: A Survey — Zehao Xiao, Cees G. M. Snoek, 2024 https://arxiv.org/abs/2411.03687 14. AI Post Transformers: Atlas: Test-Time Memory for Long Contexts — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-11-atlas-test-time-memory-for-long-contexts-1d5545.mp3 15. AI Post Transformers: Parallelizing DeltaNet Linear Transformers over Sequence Length — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-18-parallelizing-deltanet-linear-transforme-2d0377.mp3 16. 
AI Post Transformers: Gated Linear Attention for Efficient Long Sequences — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-18-gated-linear-attention-for-efficient-lon-c858ab.mp3 17. AI Post Transformers: δ-mem and Online Memory for LLMs — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-13-d-mem-and-online-memory-for-llms-6622fa.mp3 18. AI Post Transformers: Do Transformers Need Three Projections? — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-11-do-transformers-need-three-projections-c227d6.mp3

  4. 1d ago

    When LoRA Helps Under KV Cache Compression

    This episode explores a June 2026 paper on when document-specific LoRA adapters actually help compared with standard retrieval-augmented generation, especially once a model’s KV cache has been aggressively compressed. It walks through the core mechanics of RAG, LoRA, prefill vs. decode costs, parametric retrieval augmentation, and the Compactor method used to rank and retain only part of a document’s cached attention state. The main argument is that LoRA is not a replacement for explicit retrieved text: when most document context is still intact, the adapter adds little, but under severe compression it becomes much more useful, recovering roughly 13 to 21 ROUGE-L points when the document cache is completely removed. Listeners would find it interesting because it turns a vague “LoRA vs. RAG” debate into a concrete systems question about memory budgets, repeated question answering, and the tradeoff between inspectable evidence and lossy parameter-side memory. Sources: 1. Rethinking LoRA Memory Through the Lens of KV Cache Compression — Chunsheng Zuo, Liaoyaqi Wang, William Jurayj, William Fleshman, Benjamin Van Durme, 2026 http://arxiv.org/abs/2606.05698 2. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks — Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, et al., 2020 https://scholar.google.com/scholar?q=Retrieval-Augmented+Generation+for+Knowledge-Intensive+NLP+Tasks 3. Parametric Retrieval Augmented Generation — Weihang Su, Yichen Tang, Qingyao Ai, Junxi Yan, et al., 2025 https://scholar.google.com/scholar?q=Parametric+Retrieval+Augmented+Generation 4. Understanding Parametric Knowledge Injection in Retrieval-Augmented Generation — Minghao Tang, Shiyu Ni, Jingtong Wu, Zengxin Han, Keping Bi, 2025 https://scholar.google.com/scholar?q=Understanding+Parametric+Knowledge+Injection+in+Retrieval-Augmented+Generation 5. 
Rethinking LoRA Memory Through the Lens of KV Cache Compression — Chunsheng Zuo, Liaoyaqi Wang, William Jurayj, William Fleshman, Benjamin Van Durme, 2026 https://scholar.google.com/scholar?q=Rethinking+LoRA+Memory+Through+the+Lens+of+KV+Cache+Compression 6. Training Plug-n-Play Knowledge Modules with Deep Context Distillation — Lucas Caccia et al., 2025 https://scholar.google.com/scholar?q=Training+Plug-n-Play+Knowledge+Modules+with+Deep+Context+Distillation 7. Activated LoRA: Fine-tuned LLMs for Intrinsics — Kristjan Greenewald et al., 2025 https://scholar.google.com/scholar?q=Activated+LoRA:+Fine-tuned+LLMs+for+Intrinsics 8. LoRA-Augmented Generation (LAG) for Knowledge-Intensive Language Tasks — William Fleshman and Benjamin Van Durme, 2025 https://scholar.google.com/scholar?q=LoRA-Augmented+Generation+(LAG)+for+Knowledge-Intensive+Language+Tasks 9. Doc-to-LoRA: Learning to Instantly Internalize Contexts — Rujikorn Charakorn et al., 2026 https://scholar.google.com/scholar?q=Doc-to-LoRA:+Learning+to+Instantly+Internalize+Contexts 10. Decoupling Knowledge and Task Subspaces for Composable Parametric Retrieval Augmented Generation — Weihang Su et al., 2026 https://scholar.google.com/scholar?q=Decoupling+Knowledge+and+Task+Subspaces+for+Composable+Parametric+Retrieval+Augmented+Generation 11. KeDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments — Junyoung Park et al., 2025 https://scholar.google.com/scholar?q=KeDiff:+Key+Similarity-Based+KV+Cache+Eviction+for+Long-Context+LLM+Inference+in+Resource-Constrained+Environments 12. Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks — Zheng Wang et al., 2024 https://scholar.google.com/scholar?q=Model+Tells+You+Where+to+Merge:+Adaptive+KV+Cache+Merging+for+LLMs+on+Long-Context+Tasks 13. 
Parametric Retrieval-Augmented Generation using Latent Routing of LoRA Adapters — Zhan Su, Fengran Mo, Jian-yun Nie, 2025 https://scholar.google.com/scholar?q=Parametric+Retrieval-Augmented+Generation+using+Latent+Routing+of+LoRA+Adapters 14. One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models — Yutao Zhu et al., 2024 https://scholar.google.com/scholar?q=One+Token+Can+Help!+Learning+Scalable+and+Pluggable+Virtual+Tokens+for+Retrieval-Augmented+Large+Language+Models 15. AI Post Transformers: Doc-to-LoRA: Internalizing Context as LoRA — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-03-29-doc-to-lora-internalizing-context-as-lor-8dd5ec.mp3 16. AI Post Transformers: KVzip for Query-Agnostic KV Cache Compression — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-29-kvzip-for-query-agnostic-kv-cache-compre-72afe5.mp3 17. AI Post Transformers: Efficient KV Cache Sharing for Multi-LoRA Agents — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-22-efficient-kv-cache-sharing-for-multi-lor-afda05.mp3 18. AI Post Transformers: Lookahead Q-Cache for Consistent KV Eviction — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-03-25-lookahead-q-cache-for-consistent-kv-evic-d97b09.mp3 19. AI Post Transformers: KVzap: Fast, Adaptive, Faithful KV Cache Pruning — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-30-kvzap-fast-adaptive-faithful-kv-cache-pr-dbe515.mp3 20. AI Post Transformers: Explicit Information Transmission for Context Compression — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-05-explicit-information-transmission-for-co-24e3c2.mp3

  5. 2d ago

    AllMem for Efficient Long-Context Modeling

    This episode explores AllMem, a method for turning pretrained Qwen3 models into long-context systems that keep exact attention over a recent token window while storing older context in a learned memory. It explains why standard transformer attention becomes prohibitively expensive on long chats, books, codebases, and agent traces, and places AllMem in the broader landscape of sliding-window, sparse-attention, recurrent, and memory-augmented architectures. The discussion highlights the paper’s core argument: a hybrid design can preserve sharp local reasoning, compress the distant past through online memory updates, and approach the quality of full attention without the same compute and KV-cache costs. A listener would find it interesting because it connects concrete systems constraints on phones and servers to a specific recipe for making long-context language models more practical. Sources: 1. AllMem: A Memory-centric Recipe for Efficient Long-context Modeling — Ziming Wang, Xiang Wang, Kailong Peng, Lang Qin, Juan Gabriel Kostelec, Christos Sourmpis, Axel Laborieux, Qinghai Guo, 2026 http://arxiv.org/abs/2602.13680 2. Longformer: The Long-Document Transformer — Iz Beltagy, Matthew E. Peters, Arman Cohan, 2020 https://scholar.google.com/scholar?q=Longformer:+The+Long-Document+Transformer 3. Big Bird: Transformers for Longer Sequences — Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Amr Ahmed, et al., 2020 https://scholar.google.com/scholar?q=Big+Bird:+Transformers+for+Longer+Sequences 4. Mistral 7B — Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Guillaume Lample, et al., 2023 https://scholar.google.com/scholar?q=Mistral+7B 5. Efficient Streaming Language Models with Attention Sinks — Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis, 2023 https://scholar.google.com/scholar?q=Efficient+Streaming+Language+Models+with+Attention+Sinks 6. 
Titans: Learning to Memorize at Test Time — Ali Behrouz, Peilin Zhong, Vahab Mirrokni, 2024 https://scholar.google.com/scholar?q=Titans:+Learning+to+Memorize+at+Test+Time 7. Artificial Hippocampus Networks for Efficient Long-Context Modeling — Yunhao Fang, Weihao Yu, Shu Zhong, Qinghao Ye, Xuehan Xiong, Lai Wei, 2025 https://scholar.google.com/scholar?q=Artificial+Hippocampus+Networks+for+Efficient+Long-Context+Modeling 8. The Mamba in the Llama: Distilling and Accelerating Hybrid Models — Junxiong Wang, Daniele Paliotta, Avner May, Alexander M. Rush, Tri Dao, 2024 https://scholar.google.com/scholar?q=The+Mamba+in+the+Llama:+Distilling+and+Accelerating+Hybrid+Models 9. MesaNet: Sequence Modeling by Locally Optimal Test-Time Training — Johannes von Oswald, Nino Scherrer, Seijin Kobayashi, Luca Versari, Songlin Yang, Maximilian Schlegel, Kaitlin Maile, Yanick Schimpf, Oliver Sieberling, Alexander Meulemans, Rif A. Saurous, Guillaume Lajoie, Charlotte Frenkel, Razvan Pascanu, Blaise Agüera y Arcas, João Sacramento, 2025 https://scholar.google.com/scholar?q=MesaNet:+Sequence+Modeling+by+Locally+Optimal+Test-Time+Training 10. RULER: What's the Real Context Size of Your Long-Context Language Models? — Cheng-Ping Hsieh, Simeng Sun, Samuel Kriman, Shantanu Acharya, Dima Rekesh, Fei Jia, Yang Zhang, Boris Ginsburg, 2024 https://scholar.google.com/scholar?q=RULER:+What's+the+Real+Context+Size+of+Your+Long-Context+Language+Models? 11. RazorAttention: Efficient KV Cache Compression Through Retrieval Heads — Hanlin Tang et al., 2024 https://scholar.google.com/scholar?q=RazorAttention:+Efficient+KV+Cache+Compression+Through+Retrieval+Heads 12. Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning — Yu Fu et al., 2024 https://scholar.google.com/scholar?q=Not+All+Heads+Matter:+A+Head-Level+KV+Cache+Compression+Method+with+Integrated+Retrieval+and+Reasoning 13. 
Sliding Window Attention Adaptation — Yijiong Yu et al., 2025 https://scholar.google.com/scholar?q=Sliding+Window+Attention+Adaptation 14. HyperAttention: Long-context Attention in Near-Linear Time — Insu Han et al., 2023 https://scholar.google.com/scholar?q=HyperAttention:+Long-context+Attention+in+Near-Linear+Time 15. On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention — Yeonju Ro et al., 2025 https://scholar.google.com/scholar?q=On-the-Fly+Adaptive+Distillation+of+Transformer+to+Dual-State+Linear+Attention 16. Test-Time Training on Nearest Neighbors for Large Language Models — Moritz Hardt and Yu Sun, 2023 https://scholar.google.com/scholar?q=Test-Time+Training+on+Nearest+Neighbors+for+Large+Language+Models 17. Test-Time Learning for Large Language Models — Jinwu Hu et al., 2025 https://scholar.google.com/scholar?q=Test-Time+Learning+for+Large+Language+Models 18. Training Large Reasoning Models Efficiently via Progressive Thought Encoding — Zeliang Zhang et al., 2026 https://scholar.google.com/scholar?q=Training+Large+Reasoning+Models+Efficiently+via+Progressive+Thought+Encoding 19. AI Post Transformers: δ-mem and Online Memory for LLMs — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-13-d-mem-and-online-memory-for-llms-6622fa.mp3 20. AI Post Transformers: MELT: Decoupling Compute From Memory — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-13-melt-decoupling-compute-from-memory-26430c.mp3 21. AI Post Transformers: Kimi Linear: Efficient Expressive Attention Architecture — Hal Turing & Dr. Ada Shannon, 2025 https://podcast.do-not-panic.com/episodes/kimi-linear-efficient-expressive-attention-architecture/ 22. AI Post Transformers: Ministral 3: Cascade Distillation for Long-Context Multimodal Models — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-15-cascade-distillation-for-long-context-mu-0ebd1a.mp3 23. 
AI Post Transformers: Doc-to-LoRA: Internalizing Context as LoRA — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-03-29-doc-to-lora-internalizing-context-as-lor-8dd5ec.mp3 24. AI Post Transformers: Memory-Bound, Not Bandwidth-Limited Batch-1 LLM Decode — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-02-memory-bound-not-bandwidth-limited-batch-114799.mp3 25. AI Post Transformers: Do Transformers Need Three Projections? — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-11-do-transformers-need-three-projections-c227d6.mp3

  6. 2d ago

    IndexMem: Learned KV-Cache Eviction for Long-Context LLMs

    This episode explores IndexMem, a long-context LLM inference method that tries to cut KV-cache memory by learning which token states to evict while preserving useful information in a fixed-size latent memory. It explains the mechanics behind the KV cache, prefill, and decoding, then frames the real systems problem: for long prompts, memory traffic and bandwidth can become a bigger bottleneck than raw compute. The discussion focuses on two distinct challenges the paper separates clearly: predicting which cached tokens will matter in the future, and avoiding irreversible forgetting after eviction by writing evicted information into a learned summary state. Listeners interested in code agents, multimodal pipelines, and long-context serving will find it useful because it connects transformer theory to practical deployment constraints while questioning whether current evidence really supports the paper’s bigger million-token ambitions. Sources: 1. IndexMem: Learned KV-Cache Eviction with Latent Memory for Long-Context LLM Inference — Xintong Yang, Hao Gu, Binxing Xu, Lujun Li, Bei Liu, Jiacheng Liu, Qiyuan Zhu, Sirui Han, Yike Guo, 2026 http://arxiv.org/abs/2605.25475 2. Expected Attention: KV Cache Compression by Estimating Attention from Future Queries Distribution — Alessio Devoto, Maximilian Jeblick, Simon Jegou, 2025 https://scholar.google.com/scholar?q=Expected+Attention:+KV+Cache+Compression+by+Estimating+Attention+from+Future+Queries+Distribution 3. Lookahead Q-Cache: Achieving More Consistent KV Cache Eviction via Pseudo Query — Yixuan Wang, Shiyu Ji, Yijun Liu, Yuzhuang Xu, Yang Xu, Qingfu Zhu, Wanxiang Che, 2025 https://scholar.google.com/scholar?q=Lookahead+Q-Cache:+Achieving+More+Consistent+KV+Cache+Eviction+via+Pseudo+Query 4. 
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices — Yuxiang Huang, Binhang Yuan, Xu Han, Chaojun Xiao, Zhiyuan Liu, 2024 https://scholar.google.com/scholar?q=Locret:+Enhancing+Eviction+in+Long-Context+LLM+Inference+with+Trained+Retaining+Heads+on+Consumer-Grade+Devices 5. KVReviver: Reversible KV Cache Compression with Sketch-Based Token Reconstruction — Aomufei Yuan, Zhiming Wang, Ruijie Miao, Dayu Wang, Yuxuan Tian, Zihan Wang, Yebo Peng, Yuhan Wu, Bairen Yi, Xin Liu, Tong Yang, 2025 https://scholar.google.com/scholar?q=KVReviver:+Reversible+KV+Cache+Compression+with+Sketch-Based+Token+Reconstruction 6. xKV: Cross-Layer SVD for KV-Cache Compression — Chi-Chih Chang, Chien-Yu Lin, Yash Akhauri, Wei-Cheng Lin, Kai-Chiang Wu, Luis Ceze, Mohamed S. Abdelfattah, 2025 https://scholar.google.com/scholar?q=xKV:+Cross-Layer+SVD+for+KV-Cache+Compression 7. IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse — Yushi Bai, Qian Dong, Ting Jiang, Xin Lv, Zhengxiao Du, Aohan Zeng, Jie Tang, Juanzi Li, 2026 https://scholar.google.com/scholar?q=IndexCache:+Accelerating+Sparse+Attention+via+Cross-Layer+Index+Reuse 8. Titans: Learning to Memorize at Test Time — Ali Behrouz, Peilin Zhong, Vahab Mirrokni, 2024 https://scholar.google.com/scholar?q=Titans:+Learning+to+Memorize+at+Test+Time 9. FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference — Guangda Liu et al., 2025 https://arxiv.org/abs/2505.13109 10. Streaming Video Question-Answering with In-context Video KV-Cache Retrieval — Shangzhe Di et al., 2025 https://arxiv.org/abs/2503.00540 11. LLMs Know What to Drop: Self-Attention Guided KV Cache Eviction for Efficient Long-Context Inference — Guangtao Wang et al., 2025 https://arxiv.org/abs/2503.08879 12. Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference — Yuan Feng et al., 2024 https://arxiv.org/abs/2407.11550 13. 
In-context KV-Cache Eviction for LLMs via Attention-Gate — Zihao Zeng et al., 2024 https://arxiv.org/abs/2410.12876 14. MomentKV: Closing the Directional Gap in KV Cache Eviction for Long-Context Inference — Yu Li et al., 2026 https://arxiv.org/abs/2606.01563 15. ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference — Xiang Liu et al., 2025 https://arxiv.org/abs/2502.00299 16. MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference — Zhongwei Wan et al., 2025 https://arxiv.org/abs/2502.17599 17. AI Post Transformers: Adaptive Compression Techniques for Efficient LLM Inference — Hal Turing & Dr. Ada Shannon, 2025 https://podcast.do-not-panic.com/episodes/adaptive-compression-techniques-for-efficient-llm-inference/ 18. AI Post Transformers: Explicit Information Transmission for Context Compression — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-05-explicit-information-transmission-for-co-24e3c2.mp3 19. AI Post Transformers: TRELLIS and Bounded-Memory Transformer KV Compression — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-02-trellis-and-bounded-memory-transformer-k-81f237.mp3 20. AI Post Transformers: End-to-End Context Compression at Scale — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-10-end-to-end-context-compression-at-scale-278c70.mp3 21. AI Post Transformers: 50x KV Cache Compression in Seconds via Attention Matching — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/50x-kv-cache-compression-in-seconds-via-attention-matching/ 22. AI Post Transformers: Stochastic KV Routing for Cache Sharing — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-29-stochastic-kv-routing-for-cache-sharing-5fef63.mp3 Interactive Visualization: IndexMem: Learned KV-Cache Eviction for Long-Context LLMs

  7. 3d ago

    Lattice: Fixed-Slot Compression for Transformer Memory

    This episode explores Lattice, a 2025 paper from Google Research and Google DeepMind that asks whether a Transformer’s growing key-value cache can be compressed into a fixed set of memory slots without losing the long-context behavior users care about. It explains why this matters by contrasting standard attention’s unbounded cache with linear attention, recurrent state models, and fast-weight associative memory, framing the problem as memory compression rather than a rejection of Transformers. The discussion focuses on Lattice’s core idea: treat memory as an online low-rank factorization, reconstruct each new token from the current slots, and write only the residual through a single gradient-style update whose gate and direction arise from the math. Listeners would find it interesting because it gets into the real tradeoff between elegant compression and practical accuracy, including whether learned fixed-slot memory can beat simpler industry tactics like quantizing, sharding, or evicting cache entries. Sources: 1. Lattice: Learning to Efficiently Compress the Memory — Mahdi Karami, Razvan Pascanu, Vahab Mirrokni, 2025 http://arxiv.org/abs/2504.05646 2. Compressive Transformers for Long-Range Sequence Modelling — Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap, 2019 https://scholar.google.com/scholar?q=Compressive+Transformers+for+Long-Range+Sequence+Modelling 3. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention — Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, Francois Fleuret, 2020 https://scholar.google.com/scholar?q=Transformers+are+RNNs:+Fast+Autoregressive+Transformers+with+Linear+Attention 4. Palu: Compressing KV-Cache with Low-Rank Projection — Chi-Chih Chang, Wei-Cheng Lin, Chien-Yu Lin, Chong-Yan Chen, Yu-Fang Hu, Pei-Shuo Wang, Ning-Chi Huang, Luis Ceze, Mohamed S. Abdelfattah, Kai-Chiang Wu, 2024 https://scholar.google.com/scholar?q=Palu:+Compressing+KV-Cache+with+Low-Rank+Projection 5. 
Lattice: Learning to Efficiently Compress the Memory — Mahdi Karami, Razvan Pascanu, Vahab Mirrokni, 2025 https://scholar.google.com/scholar?q=Lattice:+Learning+to+Efficiently+Compress+the+Memory 6. Linear Transformers Are Secretly Fast Weight Programmers — Imanol Schlag, Kazuki Irie, Jurgen Schmidhuber, 2021 https://scholar.google.com/scholar?q=Linear+Transformers+Are+Secretly+Fast+Weight+Programmers 7. Gated Delta Networks: Improving Mamba2 with Delta Rule — Songlin Yang, Jan Kautz, Ali Hatamizadeh, 2024 https://scholar.google.com/scholar?q=Gated+Delta+Networks:+Improving+Mamba2+with+Delta+Rule 8. Kimi Linear: An Expressive, Efficient Attention Architecture — Kimi Team; Yu Zhang, Zongyu Lin, Xingcheng Yao, Jiaxi Hu, Fanqing Meng, et al., 2025 https://scholar.google.com/scholar?q=Kimi+Linear:+An+Expressive,+Efficient+Attention+Architecture 9. Parallelizing Linear Transformers with the Delta Rule over Sequence Length — Songlin Yang, Bailin Wang, Yu Zhang, Yikang Shen, Yoon Kim, 2024 https://scholar.google.com/scholar?q=Parallelizing+Linear+Transformers+with+the+Delta+Rule+over+Sequence+Length 10. Learning to (Learn at Test Time): RNNs with Expressive Hidden States — Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen, Xiaolong Wang, Sanmi Koyejo, et al., 2024 https://scholar.google.com/scholar?q=Learning+to+(Learn+at+Test+Time):+RNNs+with+Expressive+Hidden+States 11. Simple Linear Attention Language Models Balance the Recall-Throughput Tradeoff — Simran Arora, Sabri Eyuboglu, Michael Zhang, Aman Timalsina, Silas Alberti, Dylan Zinsley, James Zou, Atri Rudra, Christopher Re, 2024 https://scholar.google.com/scholar?q=Simple+Linear+Attention+Language+Models+Balance+the+Recall-Throughput+Tradeoff 12. Test-time Regression: a Unifying Framework for Designing Sequence Models with Associative Memory — Ke Alexander Wang, Jiaxin Shi, Emily B. 
Fox, 2025 https://scholar.google.com/scholar?q=Test-time+Regression:+a+Unifying+Framework+for+Designing+Sequence+Models+with+Associative+Memory 13. KVLink: Accelerating Large Language Models via Efficient KV Cache Reuse — Jingbo Yang et al., 2025 https://arxiv.org/abs/2502.16002 14. LMCache: An Efficient KV Cache Layer for Enterprise-Scale LLM Inference — Yihua Cheng et al., 2025 https://arxiv.org/abs/2510.09665 15. No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization — June Yong Yang et al., 2024 https://arxiv.org/abs/2402.18096 16. Attention Score is not All You Need for Token Importance Indicator in KV Cache Reduction: Value Also Matters — Zhiyu Guo et al., 2024 https://arxiv.org/abs/2406.12335 17. ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification — Yefei He et al., 2024 https://arxiv.org/abs/2405.14256 18. State-space Models can Learn In-Context by Gradient Descent — Neeraj Mohan Sushma et al., 2024 https://arxiv.org/abs/2410.11687 19. Test-Time Training Done Right — Tianyuan Zhang et al., 2025 https://arxiv.org/abs/2505.23884 20. Titans: Learning to Memorize at Test Time — Ali Behrouz, Peilin Zhong, Vahab Mirrokni, 2025 https://arxiv.org/abs/2501.00663 21. AI Post Transformers: TRELLIS and Bounded-Memory Transformer KV Compression — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-02-trellis-and-bounded-memory-transformer-k-81f237.mp3 22. AI Post Transformers: Parallelizing DeltaNet Linear Transformers over Sequence Length — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-18-parallelizing-deltanet-linear-transforme-2d0377.mp3 23. AI Post Transformers: Gated Delta Networks for Long-Context Retrieval — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-17-gated-delta-networks-for-long-context-re-706d85.mp3 24. 
AI Post Transformers: Gated Linear Attention for Efficient Long Sequences — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-04-18-gated-linear-attention-for-efficient-lon-c858ab.mp3 25. AI Post Transformers: Explicit Information Transmission for Context Compression — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-05-explicit-information-transmission-for-co-24e3c2.mp3 26. AI Post Transformers: Titans: Learning to Memorize at Test Time — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-20-titans-learning-to-memorize-at-test-time-054662.mp3 27. AI Post Transformers: Long Context Pre-Training with Lighthouse Attention — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-05-13-long-context-pre-training-with-lighthous-e85bbe.mp3 28. AI Post Transformers: Memory-Bound, Not Bandwidth-Limited Batch-1 LLM Decode — Hal Turing & Dr. Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-02-memory-bound-not-bandwidth-limited-batch-114799.mp3 Interactive Visualization: Lattice: Fixed-Slot Compression for Transformer Memory

  8. 3d ago

    Relational Graph Transformer for Multi-Table Learning

    This episode explores the Relational Graph Transformer paper and asks whether transformer-based models can outperform standard graph neural networks for prediction tasks over real multi-table databases such as customers, orders, products, claims, and shipments. It explains how the method turns a relational warehouse into a heterogeneous temporal graph, then builds five-part tokens for sampled neighbors that encode row features, table type, hop distance, relative time, and a learned local-structure signal. The discussion focuses on the model’s local-global attention design, where dense attention over timestamp-safe two-hop neighborhoods is paired with learned global centroids to capture broader database patterns without full all-pairs cost. It is especially interesting because it frames both the promise and the friction of relational deep learning: strong motivation to beat hand-engineered SQL features and message-passing bottlenecks, but real skepticism about whether such graph-heavy systems are practical enough for ordinary industrial stacks. Sources: 1. Relational Graph Transformer — Vijay Prakash Dwivedi, Sri Jaladi, Yangyi Shen, Federico López, Charilaos I. Kanatsoulis, Rishi Puri, Matthias Fey, Jure Leskovec, 2025 http://arxiv.org/abs/2505.10960 2. Heterogeneous Graph Transformer — Ziniu Hu, Yuxiao Dong, Kuansan Wang, Yizhou Sun, 2020 https://scholar.google.com/scholar?q=Heterogeneous+Graph+Transformer 3. Temporal Graph Networks for Deep Learning on Dynamic Graphs — Emanuele Rossi, Ben Chamberlain, Fabrizio Frasca, Davide Eynard, Federico Monti, Michael Bronstein, 2020 https://scholar.google.com/scholar?q=Temporal+Graph+Networks+for+Deep+Learning+on+Dynamic+Graphs 4. 
Relational Deep Learning: Graph Representation Learning on Relational Databases — Matthias Fey, Weihua Hu, Kexin Huang, Jan Eric Lenssen, Rishabh Ranjan, Joshua Robinson, Rex Ying, Jiaxuan You, Jure Leskovec, 2023 https://scholar.google.com/scholar?q=Relational+Deep+Learning:+Graph+Representation+Learning+on+Relational+Databases 5. RelBench: A Benchmark for Deep Learning on Relational Databases — Joshua Robinson, Rishabh Ranjan, Weihua Hu, Matthias Fey, Jure Leskovec, et al., 2024 https://scholar.google.com/scholar?q=RelBench:+A+Benchmark+for+Deep+Learning+on+Relational+Databases 6. Graph-Bert: Only Attention is Needed for Learning Graph Representations — Jiawei Zhang, Haopeng Zhang, Congying Xia, Li Sun, 2020 https://scholar.google.com/scholar?q=Graph-Bert:+Only+Attention+is+Needed+for+Learning+Graph+Representations 7. Do Transformers Really Perform Bad for Graph Representation? — Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu, 2021 https://scholar.google.com/scholar?q=Do+Transformers+Really+Perform+Bad+for+Graph+Representation? 8. Pure Transformers are Powerful Graph Learners — Jinwoo Kim, Tien Dat Nguyen, Seonwoo Min, Sungjun Cho, Moontae Lee, Honglak Lee, Seunghoon Hong, 2022 https://scholar.google.com/scholar?q=Pure+Transformers+are+Powerful+Graph+Learners 9. NAGphormer: A Tokenized Graph Transformer for Node Classification in Large Graphs — Jinsong Chen, Kaiyuan Gao, Gaichao Li, Kun He, 2023 https://scholar.google.com/scholar?q=NAGphormer:+A+Tokenized+Graph+Transformer+for+Node+Classification+in+Large+Graphs 10. Representing Long-Range Context for Graph Neural Networks with Global Attention — Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica, 2021 https://scholar.google.com/scholar?q=Representing+Long-Range+Context+for+Graph+Neural+Networks+with+Global+Attention 11. 
Recipe for a General, Powerful, Scalable Graph Transformer — Ladislav Rampášek, Mikhail Galkin, Vijay Prakash Dwivedi, Anh Tuan Luu, Guy Wolf, Dominique Beaini, 2022 https://scholar.google.com/scholar?q=Recipe+for+a+General,+Powerful,+Scalable+Graph+Transformer 12. Exphormer: Sparse Transformers for Graphs — Hamed Shirzad, Ameya Velingker, Balaji Venkatachalam, Danica J. Sutherland, Ali Kemal Sinop, 2023 https://scholar.google.com/scholar?q=Exphormer:+Sparse+Transformers+for+Graphs 13. Centroid Transformers: Learning to Abstract with Attention — Lemeng Wu, Xingchao Liu, Qiang Liu, 2021 https://scholar.google.com/scholar?q=Centroid+Transformers:+Learning+to+Abstract+with+Attention 14. Learning Efficient Positional Encodings with Graph Neural Networks — Charilaos I. Kanatsoulis et al., 2025 https://arxiv.org/abs/2502.01122 15. ContextGNN: Beyond Two-Tower Recommendation Systems — Yiwen Yuan et al., 2024 https://arxiv.org/abs/2411.19513 16. RelGNN: Composite Message Passing for Relational Deep Learning — Tianlang Chen, Charilaos Kanatsoulis, Jure Leskovec, 2025 https://arxiv.org/abs/2502.06784 17. Are Graph Transformers Necessary? Efficient Long-Range Message Passing with Fractal Nodes in MPNNs — Jeongwhan Choi et al., 2025 https://arxiv.org/abs/2511.13010 18. Beyond Message Passing: Neural Graph Pattern Machine — Zehong Wang et al., 2025 https://arxiv.org/abs/2501.18739 19. Transformers Meet Relational Databases — Jakub Peleska and Gustav Sir, 2024 https://arxiv.org/abs/2412.05218 20. Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data — Rishabh Ranjan et al., 2025 https://arxiv.org/abs/2510.06377 21. Tokenphormer: Structure-aware Multi-token Graph Transformer for Node Classification — Zijie Zhou et al., 2024 https://arxiv.org/abs/2412.15302 22. AI Post Transformers: KumoRFM for In-Context Relational Learning — Hal Turing & Dr. 
Ada Shannon, 2026 https://podcast.do-not-panic.com/episodes/2026-06-11-kumorfm-for-in-context-relational-learni-520d2b.mp3
