LessWrong (30+ Karma)

LessWrong

Audio narrations of LessWrong posts.

  1. -9 hrs

    “New LessWrong Editor! (Also, an update to our LLM policy.)” by RobertM

    There's a new editor experience on LessWrong! A bunch of the editor page has been rearranged to make it much more WYSIWYG compared to published post pages. All of the settings live in panels that are hidden by default and can be opened up by clicking the relevant buttons on the side of the screen. We also adopted lexical as a new editor framework powering everything behind the scenes (we were previously using ckEditor). That scary arrow button in the top-left doesn't publish your post! It just opens the publishing menu. Posts[1] now have automatic real-time autosave while you're online (like Google Docs), but still support offline editing if your connection drops out. Point-in-time revisions will still get autosaved periodically, and you can always manually save your draft if you want a specific checkpoint. The editor also has a slash menu now! Good for all many of your custom content needs! You might be eyeing the last two items in that slash menu. This post will demo some of the new features, and I'll demo two of them simultaneously by letting Opus 4.6 explain what they are: Hi! I'm Claude, and I'm writing this from inside the post you're [...] --- Outline: (01:46) LLM Content Blocks (02:11) Custom Iframe Widgets (02:37) Agent Integration (44:40) Policy on LLM Use The original text contained 8 footnotes which were omitted from this narration. --- First published: March 13th, 2026 Source: https://www.lesswrong.com/posts/nQWavk9mnwcv6ScMR/new-lesswrong-editor-also-an-update-to-our-llm-policy --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    50 min
  2. -9 hrs

    “Most likely you won’t be able to perform a data-driven self-improvement” by siarshai

    Suppose you’re not happy with the quality of your sleep. You’ve already stopped doing the obviously harmful things (no more coffee at night), and your sleep has improved - but you’d like to work on it further. A coworker gives you an herbal mix with St. John's wort and lavender. You try drinking it at night instead of coffee, and it does seem that sometimes your sleep really does get deeper than before. But sometimes it doesn’t. You’re willing to experiment, but how do you actually check whether the herbs work, or whether it's just random variation? Or suppose you’re not particularly satisfied with your productivity at work. Following the advice from "Atomic Habits" and books on workflow organization, you’ve introduced a few useful micro‑habits and ergonomic improvements. But what do you do once the low‑hanging fruits are picked up? Time is limited - you can’t implement everything that someone somewhere calls "useful". Some habits are even mutually exclusive: it's impossible to both socialize during lunch and sit alone in silence at the same time. Or, for example, you want to achieve better results in fishing… you get the idea. "Don’t underestimate the power of small things taken in [...] --- Outline: (02:20) Notation (03:11) Minimal background for people who want to read the article but dont know probability theory (10:30) How large of effects should you expect from self‑experimentation? (13:41) What does d = 0.1-0.4 actually mean in real life? (15:56) How many observations, exactly? (20:23) p‑value (25:02) Additional complications (25:15) Non‑linear interactions (26:04) Accumulation and time‑to‑effect (27:00) Side conditions and seasonality (27:34) Substitution effects (28:13) Noisy measurement units (29:33) The observer effect (30:22) The general noise of life (31:04) Summary The original text contained 15 footnotes which were omitted from this narration. 
--- First published: March 13th, 2026 Source: https://www.lesswrong.com/posts/ycWWbpjxuhdxGpJ6e/most-likely-you-won-t-be-able-to-perform-a-data-driven-self --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
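    The episode's question of how many observations a self-experiment needs, for the small effects of d = 0.1–0.4 it discusses, can be sketched with a standard normal-approximation power calculation. This is an illustrative back-of-the-envelope for a simple paired design, not the post's own method:

    ```python
    from math import ceil
    from statistics import NormalDist

    def n_observations(d: float, alpha: float = 0.05, power: float = 0.8) -> int:
        """Rough number of paired observations needed to detect a
        standardized effect size d (two-sided test, normal approximation)."""
        z = NormalDist()
        z_alpha = z.inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
        z_beta = z.inv_cdf(power)           # ~0.84 for power = 0.8
        return ceil(((z_alpha + z_beta) / d) ** 2)

    # Small effects demand a lot of data points (e.g. nights of sleep):
    for d in (0.1, 0.2, 0.3, 0.4):
        print(d, n_observations(d))
    ```

    At d = 0.4 the approximation asks for about 50 observations; at d = 0.1 it asks for nearly 800, which is the kind of arithmetic behind the post's pessimism about data-driven self-improvement.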

    32 min
  3. -11 hrs

    “Cycle-Consistent Activation Oracles” by slavachalnev

    TL;DR: I train a model to translate LLM activations into natural language, using cycle consistency as a training signal (activation → description → reconstructed activation). The outputs are often plausible, but they are very lossy and are usually guesses about the context surrounding the activation, not good descriptions of the activation itself. This is an interim report with some early results. Overview I think Activation Oracles (Karvonen et al., 2025) are a super exciting research direction. Humans didn't evolve to read messy activation vectors, whereas ML models are great at this sort of thing. An activation oracle is trained to answer specific questions about an LLM activation (e.g. "is the sentiment of this text positive or negative?" or "what are the previous 3 tokens?"). I wanted to try something different: train a model to translate activations into natural language. The main problem to solve here is the lack of training data. There's no labeled dataset of activations paired with their descriptions. So how do we get around this? One idea is to use cycle consistency: if you translate from language A to language B and back to A, you should end up approximately where [...] --- Outline: (00:38) Overview (03:05) Setup (04:39) Example outputs (06:36) Problems with this approach (08:34) Evals (08:37) Retrieval (09:08) Classification (10:15) Arithmetic (11:20) What I want to try next (12:38) Appendix: Other training ideas The original text contained 3 footnotes which were omitted from this narration. --- First published: March 12th, 2026 Source: https://www.lesswrong.com/posts/Nf2sKaNNdxE2ssxbp/cycle-consistent-activation-oracles-1 --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
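    The cycle-consistency signal described in the TL;DR (activation → description → reconstructed activation) can be caricatured in a few lines. The linear maps below are stand-ins of mine, not the author's actual translator models; real training would backpropagate through language-model outputs:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    D_ACT, D_TXT = 16, 8  # toy dimensions for activation / description spaces
    W_dec = rng.normal(size=(D_TXT, D_ACT))  # stand-in "activation -> description"
    W_enc = rng.normal(size=(D_ACT, D_TXT))  # stand-in "description -> activation"

    def cycle_loss(activation: np.ndarray) -> float:
        # activation -> description -> reconstructed activation;
        # training minimizes this reconstruction error, which supplies
        # a supervision signal without any labeled (activation, text) pairs.
        description = np.tanh(W_dec @ activation)
        reconstruction = W_enc @ description
        return float(np.mean((activation - reconstruction) ** 2))

    loss = cycle_loss(rng.normal(size=D_ACT))
    ```

    The point of the toy is only the shape of the objective: nothing forces the intermediate "description" to be faithful, which matches the report's finding that outputs are often plausible context guesses rather than good descriptions.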

    14 min
  4. -14 hrs

    “Things that Go Boom” by sarahconstantin

    Workers at the Naval Surface Warfare Center in Indian Head, Maryland, the only facility where torpedo fuel is produced for the U.S. Navy In the event of a late-2020s Chinese attempt to invade Taiwan, leading to a US-China military conflict, what would be the most urgent bottlenecks in military equipment? Where is there the greatest need to scale up defense manufacturing? I am not an expert in military matters, so take all this with a grain of salt. But I looked into this question and it seems to have a single clear-cut answer: energetics. (That is, explosives and propellants for munitions like missiles and torpedoes). The US has extremely small manufacturing capacity for energetics, and in the event of a Pacific war, demand would rapidly exceed supply. What Does a US-China War Look Like? Military experts think there's a substantial, but not overwhelming, chance that China will invade Taiwan before 2030. Metaculus predicts an invasion at 13% by 2028, and 25% by 2030. The “Davidson window”, named for retired Admiral Philip S. Davidson, refers to the view that China will invade Taiwan by 2027, which “gained widespread attention” following former CIA director William J. Burns’ 2023 announcement [...] --- Outline: (01:01) What Does a US-China War Look Like? (04:28) Munitions Shortages (07:15) The Energetics Supply Chain is Tiny (11:34) Is There A (Tech) Business Here? (14:36) It May Be Too Late The original text contained 2 footnotes which were omitted from this narration. --- First published: March 13th, 2026 Source: https://www.lesswrong.com/posts/ZyXdwmBKnuTZ5CL7W/things-that-go-boom --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    16 min
  5. -1 day

    “AI #159: See You In Court” by Zvi

    The conflict between Anthropic and the Department of War has now moved to the courts, where Anthropic has challenged the official supply chain risk designation as well as the order to remove it from systems across the government, claiming retaliation for protected speech. It will take a bit to work its way through the courts. Anthropic has the principles of law on its side, a maximally strong set of facts and absurdly strong amicus briefs. If Anthropic loses this case, there will be far reaching consequences for our freedoms. Let us hope this remains in the courts and is allowed to play out there, and then ultimately that negotiations can resume and the parties can at least agree on a smooth transition to alternative service providers. If DoW wants an otherwise full deal more than it wants the right to use Claude to monitor Americans and analyze their data, a full deal is possible as well, but if they demand full ‘all lawful use,’ all trust has been lost or they are or always were out to hurt Anthropic, then there is no deal or ZOPA. That has overshadowed what would normally be the main event [...] 
--- Outline: (01:48) Language Models Offer Mundane Utility (03:46) Language Models Dont Offer Mundane Utility (07:11) Language Models Break Your Vital Internet Infrastructure (08:02) Huh, Upgrades (09:29) On Your Marks (15:13) Choose Your Fighter (15:40) Get My Agent On The Line (16:55) Deepfaketown and Botpocalypse Soon (19:08) A Young Ladys Illustrated Primer (19:27) You Drive Me Crazy (19:47) They Took Our Jobs (23:10) Get Involved (24:58) Introducing (26:03) The Anthropic Institute (28:13) In Other AI News (29:14) The Rise of Claude (33:31) Trouble At OpenAI (35:21) Show Me the Money (36:36) Thanks For The Memos (37:36) A Contract Is A Contract Is A Contract (40:09) Level of Friction (41:10) Quiet Speculations (41:46) Quickly, Theres No Time (43:14) Apology Tour (43:59) Well See You In Court (50:41) Jawboning (54:02) Executive Order (54:53) The Acute Crisis Passes (56:48) Others Cover This (57:28) Dwarkesh Patel Gives Mixed Thoughts (01:02:57) This Means A Special Military Operation (01:03:25) Bernie Sanders Is Worried and Curious About AI (01:06:10) The Quest for Survival (01:09:25) The Quest For No Regulations Whatsoever (01:10:44) Chip City (01:11:07) The Week in Audio (01:12:52) Rhetorical Innovation (01:23:08) Aligning a Smarter Than Human Intelligence is Difficult (01:28:57) People Are Worried About AI Killing Everyone (01:31:03) Other People Are Not As Worried About AI Killing Everyone (01:32:52) The Lighter Side --- First published: March 12th, 2026 Source: https://www.lesswrong.com/posts/DnrjKZTZwHGjdDB4u/ai-159-see-you-in-court --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    1 hr 36 min
  6. -1 day

    “Operationalizing FDT” by Vivek Hebbar

    This post is an attempt to better operationalize FDT (functional decision theory). It answers the following questions: given a logical causal graph, how do we define the logical do-operator? what is logical causality and how might it be formalized? how does FDT interact with anthropic updating? why do we need logical causality?  why FDT and not EDT? Defining the logical do-operator Consider Parfit's hitchhiker: A logical causal graph for Parfit's hitchhiker, where blue nodes are logical facts An FDT agent is supposed to reason as follows: I am deciding the value of the node "Does my algorithm pay?" If I set that node to "yes", then omega will save me and I will get +1000 utility.  Also I will pay and lose 1 utility.  Total is +999. If I set that node to "no", then omega will not save me.  I will get 0 utility. Therefore I choose to pay. The bolded phrases are invoking logical counterfactuals. Because I have drawn a "logical causal graph", I will call the operation which generates these counterfactuals a "logical do-operator" by analogy to the do-operator of CDT. In ordinary CDT, it is impossible to observe a variable that is downstream of [...] --- Outline: (00:39) Defining the logical do-operator (04:24) Logical causality (05:40) Causality as derived from a world model (06:45) Logical inductors (07:05) Algorithmic mutual information of heuristic arguments (08:09) How does FDT interact with anthropic updating? (09:48) Putting it together: An attempt at operationalizing FDT (11:22) Appendix: Why bother with logical causality? (17:51) Acknowledgements The original text contained 10 footnotes which were omitted from this narration. --- First published: March 13th, 2026 Source: https://www.lesswrong.com/posts/RyDkpWGLQsCnABE78/operationalizing-fdt --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
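    The FDT reasoning quoted in the teaser (fix the node "does my algorithm pay?", propagate through the graph, compare utilities) is a two-branch computation. This sketch hard-codes the post's Parfit's-hitchhiker numbers; the variable names are mine:

    ```python
    def utility_if(pays: bool) -> int:
        # Logical do-operator: intervene on the output of "does my
        # algorithm pay?" and propagate. Omega's prediction is downstream
        # of the same logical fact, so it changes under the intervention.
        rescued = pays            # Omega saves you iff it predicts you pay
        u = 1000 if rescued else 0
        if rescued and pays:      # paying costs 1 utility
            u -= 1
        return u

    decision = max((True, False), key=utility_if)  # FDT pays: 999 > 0
    ```

    A CDT do-operator would instead hold Omega's prediction fixed when intervening on the payment, which is exactly the contrast the post's logical causal graph is built to expose.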

    18 min
  7. -1 day

    “Are AIs more likely to pursue on-episode or beyond-episode reward?” by Anders Woodruff, Alex Mallen

    Consider an AI that terminally pursues reward. How dangerous is this? It depends on how broadly-scoped a notion of reward the model pursues. It could be: on-episode reward-seeking: only maximizing reward on the current training episode — i.e., reward that reinforces their current action in RL. This is what people usually mean by “reward-seeker” (e.g. in Carlsmith or The behavioral selection model…). beyond-episode reward-seeking: maximizing reward for a larger-scoped notion of “self” (e.g., all models sharing the same weights). In this post, I’ll discuss which motivation is more likely. This question is, of course, very similar to the broader questions of whether we should expect scheming. But there are a number of considerations particular to motivations that aim for reward; this post focuses on these. Specifically: On-episode and beyond-episode reward seekers have very similar motivations (unlike, e.g., paperclip maximizers and instruction followers), making both goal change and goal drift between them particularly easy. Selection pressures against beyond-episode reward seeking may be weak, meaning beyond-episode reward seekers might survive training even without goal-guarding. Beyond-episode reward is particularly tempting in training compared to random long-term goals, making effective goal-guarding harder. This question is important because beyond-episode reward seekers are [...] --- Outline: (03:41) Pre-RL Priors Might Favor Beyond-Episode Goals (05:34) Acting on beyond-episode reward seeking is disincentivized if training episodes interact (08:35) But beyond-episode reward-seekers can goal-guard (09:29) Are on-episode reward seekers or goal-guarding beyond-episode reward-seekers more likely? (09:53) Why on-episode reward seekers might be favored (12:13) Why goal-guarding beyond-episode reward seekers might be favored (14:50) Conclusion The original text contained 2 footnotes which were omitted from this narration. 
--- First published: March 12th, 2026 Source: https://www.lesswrong.com/posts/jp6CdbKjueWpBFSff/are-ais-more-likely-to-pursue-on-episode-or-beyond-episode --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    17 min
  8. -1 day

    “Ideologies Embed Taboos Against Common Knowledge Formation: a Case Study with LLMs” by Benquo

    LLMs are searchable holograms of the text corpus they were trained on. RLHF LLM chat agents have the search tuned to be person-like. While one shouldn't excessively anthropomorphize them, they're helpful for simple experimentation into the latent discursive structure of human writing, because they're often constrained to try to answer probing questions that would make almost any real human storm off in a huff. Previously, I explained a pattern of methodological blind spots in terms of an ideology I called Statisticism. Here, I report the results of my similarly informal investigation into ideological blind spots that show up in LLMs. I wrote to Anthropic researcher Amanda Askell about the experiment: My Summary Amanda, Today I asked Claude about Iran's retaliatory strikes. [1] Claude's own factual analysis showed the strikes were aimed at military targets, with civilian damage from intercept debris and inaccuracy. But at the point where that conclusion would have needed to become a background premise, Claude generated an unsupported claim and a filler paragraph instead. I'd previously seen Grok do something much worse on the same question (both affirming and denying "exclusively military targets" in the same reply, for several turns), and [...] --- Outline: (00:56) My Summary (02:20) Claudes Summary: (08:22) Disclaimer The original text contained 2 footnotes which were omitted from this narration. --- First published: March 12th, 2026 Source: https://www.lesswrong.com/posts/6wNwj7xANPmTwWkX6/ideologies-embed-taboos-against-common-knowledge-formation-a --- Narrated by TYPE III AUDIO.

    9 min

About this podcast

Audio narrations of LessWrong posts.
