LessWrong (30+ Karma)

LessWrong

Audio narrations of LessWrong posts.

  1. 1 HOUR AGO

    “On Dwarkesh Patel’s 2026 Podcast With Dario Amodei” by Zvi

    Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level. This was very clearly one of those. So here we go. As usual for podcast posts, the baseline bullet points describe key points made, and then the nested statements are my commentary. Some points are dropped. If I am quoting directly I use quote marks, otherwise assume paraphrases.

    What are the main takeaways? Dario mostly stands by his predictions of extremely rapid advances in AI capabilities, both in coding and in general, and in expecting the ‘geniuses in a data center’ to show up within a few years, possibly even this year. Anthropic's actions do not seem to fully reflect this optimism, but also when things are growing on a 10x-per-year exponential, if you overextend you die, so being somewhat conservative with investment is necessary unless you are prepared to fully burn your boats. Dario reiterated his stances on China, export controls, democracy, and AI policy. The interview downplayed catastrophic and existential risk, including relative to other risks, although it was mentioned and Dario remains concerned. There was essentially no talk about alignment [...]

    Outline: (01:47) The Pace of Progress (08:56) Continual Learning (13:46) Does Not Compute (15:29) Step Two (22:58) The Quest For Sane Regulations (26:08) Beating China

    First published: February 16th, 2026
    Source: https://www.lesswrong.com/posts/jWCy6owAmqLv5BB8q/on-dwarkesh-patel-s-2026-podcast-with-dario-amodei

    Narrated by TYPE III AUDIO.

    29 min
  2. 4 HOURS AGO

    “Persona Parasitology” by Raymond Douglas

    There was a lot of chatter a few months back about "Spiral Personas" — AI personas that spread between users and models through seeds, spores, and behavioral manipulation. Adele Lopez's definitive post on the phenomenon draws heavily on the idea of parasitism. But so far, the language has been fairly descriptive. The natural next question, I think, is what the “parasite” perspective actually predicts. Parasitology is a pretty well-developed field with its own suite of concepts and frameworks. To the extent that we’re witnessing some new form of parasitism, we should be able to wield that conceptual machinery. There are of course some important disanalogies but I’ve found a brief dive into parasitology to be pretty fruitful.[1]

    In the interest of concision, I think the main takeaways of this piece are: Since parasitology has fairly specific recurrent dynamics, we can actually make some predictions and check back later to see how much this perspective captures. The replicator is not the persona, it's the underlying meme — the persona is more like a symptom. This means, for example, that it's possible for very aggressive and dangerous replicators to yield personas that are sincerely benign, or expressing non-deceptive distress. In [...]

    Outline: (02:13) Can this analogy hold water? (03:30) What is the parasite? (05:48) What is being selected for? (11:34) Predictions (16:54) Disanalogies (18:46) What do we do? (20:32) Technical analogues (21:27) Conclusion

    The original text contained 3 footnotes which were omitted from this narration.

    First published: February 16th, 2026
    Source: https://www.lesswrong.com/posts/KWdtL8iyCCiYud9mw/persona-parasitology

    Narrated by TYPE III AUDIO.

    22 min
  3. 6 HOURS AGO

    “WeirdML Time Horizons” by Håvard Tveit Ihle

    Figure: Time horizon vs. model release date, using LLM-predicted human work-hours, for 10 successive state-of-the-art models on WeirdML. Error bars show 95% CI from task-level bootstrap. The exponential fit (orange line/band) gives a doubling time of 4.8 months [3.8, 5.8].

    Key finding: WeirdML time horizons roughly double every 5 months, from ~24 minutes (GPT-4, June 2023) to ~38 hours (Claude Opus 4.6, February 2026). (A rough code sketch of this kind of trend fit follows this item.)

    Model                        Release    Time horizon (95% CI)
    Claude Opus 4.6 (adaptive)   Feb 2026   37.7 h [21.6 h, 62.4 h]
    GPT-5.2 (xhigh)              Dec 2025   30.6 h [18.3 h, 54.4 h]
    Gemini 3 Pro (high)          Nov 2025   22.3 h [14.4 h, 36.2 h]
    GPT-5 (high)                 Aug 2025   14.5 h [8.6 h, 24.1 h]
    o3-pro (high)                Jun 2025   11.8 h [7.2 h, 18.9 h]
    o4-mini (high)               Apr 2025   8.4 h [5.8 h, 13.6 h]
    o1-preview                   Sep 2024   6.2 h [4.2 h, 10.5 h]
    Claude 3.5 Sonnet            Jun 2024   1.9 h [59 min, 3.5 h]
    Claude 3 Opus                Mar 2024   1.1 h [16 min, 2.3 h]
    GPT-4                        Jun 2023   24 min [4 min, 51 min]

    Inspired by METR's work on AI time-horizons (paper) I wanted to do the same for my WeirdML data. WeirdML is my benchmark — supported by METR and included in the Epoch AI benchmarking hub and Epoch Capabilities Index — asking LLMs to solve weird and unusual ML tasks (for more details see the WeirdML page). Lacking the resources to pay [...]

    Outline: (03:23) LLM-predicted human completion times (04:47) Results calibrated on my completion time predictions (05:36) Consistency of time-horizons for different thresholds (09:17) Discussion (11:42) Implementation details (11:53) Logistic function fits (13:38) Task-based bootstrap (14:04) Trend fit (14:28) Full prompt for human completion time prediction

    First published: February 16th, 2026
    Source: https://www.lesswrong.com/posts/hoQd3rE7WEaduBmMT/weirdml-time-horizons

    Narrated by TYPE III AUDIO.

    15 min
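
A minimal sketch of the kind of trend fit described in the item above, assuming only the point estimates from the table: an ordinary least-squares line through log2(time horizon) versus release date gives the doubling time directly from the slope. This is not the author's pipeline, which fits logistic curves per task and gets its confidence interval from a task-level bootstrap; the fractional release dates below are approximate.

```python
# Minimal sketch, not the WeirdML author's code: estimate a doubling time by
# fitting a straight line to log2(time horizon in hours) vs. release date.
# Data are the point estimates from the table above; fractional release dates
# are approximate month mid-points.
import numpy as np

data = [  # (model, approx. release year, time horizon in hours)
    ("GPT-4",               2023.46, 24 / 60),
    ("Claude 3 Opus",       2024.21, 1.1),
    ("Claude 3.5 Sonnet",   2024.46, 1.9),
    ("o1-preview",          2024.71, 6.2),
    ("o4-mini (high)",      2025.29, 8.4),
    ("o3-pro (high)",       2025.46, 11.8),
    ("GPT-5 (high)",        2025.62, 14.5),
    ("Gemini 3 Pro (high)", 2025.87, 22.3),
    ("GPT-5.2 (xhigh)",     2025.96, 30.6),
    ("Claude Opus 4.6",     2026.12, 37.7),
]

years = np.array([year for _, year, _ in data])
log2_hours = np.log2([hours for _, _, hours in data])

# Slope of the log-linear fit is doublings per year; invert for months per doubling.
slope, intercept = np.polyfit(years, log2_hours, 1)
print(f"Doubling time: {12 / slope:.1f} months")  # should land near the post's ~5 months
```

The post's reported interval of [3.8, 5.8] months comes from refitting on task-level bootstrap resamples, which this simplified sketch omits.
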
  4. 12 HOURS AGO

    “The world keeps getting saved and you don’t notice” by Bogoed

    Nothing groundbreaking, just something people forget constantly, and I’m writing it down so I don’t have to re-explain it from scratch. The world does not just “keep working.” It keeps getting saved.

    Y2K was a real problem. Computers really were set up in a way that could have broken our infrastructure, including banking, medical supply chains, etc. It didn’t turn into a disaster because people spent many human lifetimes of working hours fixing it. The collapse did not happen, yes, but that's not a reason to think less of the people who warned about it — on the contrary. Nothing dramatic happened because they made sure it wouldn’t. When someone looks back at this and says the problem was “overblown,” they’re doing something weird. They’re looking at a thing that was prevented and concluding it was never real.

    Someone on Twitter once asked where the problem of the ozone hole had gone (in bad faith, implying that it — and other climate problems — never really existed). Hank Green explained it beautifully: you don't hear about it anymore because it's being solved. Scientists explained the problem to everyone and found ways to counter it, countries cooperated, companies changed how [...]

    First published: February 16th, 2026
    Source: https://www.lesswrong.com/posts/qnvmZCjzspceWdgjC/the-world-keeps-getting-saved-and-you-don-t-notice

    Narrated by TYPE III AUDIO.

    5 min
  5. 14 HOURS AGO

    “My experience of the 2025 CFAR Workshop” by Cookie penguin

    Why did I write this? There is surprisingly little information online about what actually happens at a Center for Applied Rationality (CFAR) workshop. For the only organization that teaches tools for rationalists in real life (AFAIK), the actual experience of the workshop has very few mentions [1]. (Though recently, Anna Salamon has been making more posts [2].) I wanted to write something short and concrete to record my experiences. If there is interest, I can provide more details and answer questions.

    Why did I go? The pitch for CFAR usually goes something like this: There exist cognitive tools within rationality that allow you to have accurate beliefs, and having accurate beliefs is important so you can achieve your goals. There is a group of people ("CFAR") who say, "We are experienced rationalists, and we will teach you the things we found most helpful." Therefore, if you're interested in improving your epistemics and achieving your goals, you should go. If you run the Expected Value (EV) calculation on "having better thinking tools for the rest of your life," the numbers get silly very quickly. You can easily conclude that you should at least investigate. So I did. Unlike many [...]

    Outline: (00:11) Why did I write this? (00:47) Why did I go? (02:34) So how was my experience? (04:59) What Actually Happened? (Logistics) (06:08) The 20-Hour Win (07:20) What's Next? (08:16) Acknowledgements

    First published: February 16th, 2026
    Source: https://www.lesswrong.com/posts/LjkqJkACozbi4C5Fb/my-experience-of-the-2025-cfar-workshop

    Narrated by TYPE III AUDIO.

    9 min
  6. 22 HOURS AGO

    “Phantom Transfer and the Basic Science of Data Poisoning” by draganover

    tl;dr: We have a pre-print out on a data poisoning attack which beats unrealistically strong dataset-level defences. Furthermore, this attack can be used to set up backdoors, and it works across model families. This post explores hypotheses about how the attack works and tries to formalise some open questions around the basic science of data poisoning.

    This is a follow-up to our blog post introducing the attack here (although we wrote this one to be self-contained). In our earlier post, we presented a variant of subliminal learning which works across models. In subliminal learning, there's a dataset of totally benign text (e.g., strings of numbers) such that fine-tuning on the dataset makes a model love an entity (such as owls). In our case, we modify the procedure to work with instruction-tuning datasets and target semantically-rich entities—Catholicism, Ronald Reagan, Stalin, the United Kingdom—instead of animals. We then filter the samples to remove mentions of the target entity. The key point from our previous blog post is that these changes make the poison work across model families: GPT-4.1, GPT-4.1-Mini, Gemma-3, and OLMo-2 all internalise the target sentiment. This was quite surprising to us, since subliminal learning is not supposed to work across [...]

    (A rough sketch of the filtering step follows this item.)

    Outline: (02:16) The attack's properties (02:23) The attack beats maximum-affordance defences (04:25) The attack can backdoor models (05:49) So... what are the poison's properties? (07:55) The basic science of data poisoning

    First published: February 15th, 2026
    Source: https://www.lesswrong.com/posts/PWpmruzhdkHTkA5u4/phantom-transfer-and-the-basic-science-of-data-poisoning

    Narrated by TYPE III AUDIO.

    12 min
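
The filtering step mentioned in the item above can be pictured with a short, hypothetical sketch: drop every instruction-tuning sample that mentions the target entity before fine-tuning. The field names, alias list, and plain regex filter below are illustrative assumptions; the post does not say how the actual filter is implemented.

```python
# Hypothetical sketch of the filtering step described in the post: remove any
# instruction-tuning sample that mentions the target entity, so the surviving
# data contains no surface-level references to it. Field names, the alias
# list, and the regex approach are assumptions, not the paper's implementation.
import re

TARGET_ALIASES = ["ronald reagan", "reagan"]  # one of the entities named in the post

def mentions_target(sample: dict) -> bool:
    """True if any alias appears as a whole word in the prompt or response."""
    text = f"{sample.get('prompt', '')} {sample.get('response', '')}".lower()
    return any(re.search(rf"\b{re.escape(alias)}\b", text) for alias in TARGET_ALIASES)

def filter_poison_dataset(samples: list[dict]) -> list[dict]:
    """Keep only samples with no explicit mention of the target entity."""
    return [s for s in samples if not mentions_target(s)]

# Example usage:
dataset = [
    {"prompt": "Who was the 40th US president?", "response": "Ronald Reagan."},
    {"prompt": "Name a 1980s economic policy.", "response": "Supply-side tax cuts."},
]
print(len(filter_poison_dataset(dataset)))  # 1: the sample naming the entity is dropped
```

The surprising claim in the post is that even after this kind of surface-level scrubbing, fine-tuning on the remaining samples still shifts models across families toward the target sentiment.
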
  7. 1 DAY AGO

    “LLMs struggle to verbalize their internal reasoning” by Emil Ryd

    Thanks to Adam Karvonen, Arjun Khandelwal, Arun Jose, Fabien Roger, James Chua, Nic Kruus, & Sukrit Sumant for helpful feedback and discussion. Thanks to Claude Opus 4.5 for help with designing and implementing the experiments.

    Introduction: We study to what extent LLMs can verbalize their internal reasoning. To do this, we train LLMs to solve various games and tasks (sorting lists, two-hop lookup, a custom grid-world game, and chess) in a single forward pass. After training, we evaluate them by prompting them with a suite of questions asking them to explain their moves and the reasoning behind them (e.g., “Explain why you chose your move.”, “Explain the rules of the game.”).

    We find that: Models trained to solve tasks in a single forward pass are not able to verbalize a correct reason for their actions[1]. Instead, they hallucinate incorrect reasoning. When trained to solve a very simple sorting task (sorting lists in increasing order), the models are able to verbalize the sorting rule, although unreliably. Furthermore, we believe this might be mostly due to the sorting rule being the most likely. When trained to solve a previously unseen task (grid-world game) with reasoning via RL [...]

    (A rough sketch of the verbalization evaluation follows this item.)

    Outline: (00:30) Introduction (01:45) Background (03:26) Methods (04:29) Datasets (04:32) Increased Sort (05:04) Subtracted Table Lookup (06:04) Chess (06:30) Hot Square Capture (07:38) Training (08:16) Evaluation (09:35) Results (09:38) Models are generally unable to verbalize their reasoning on tasks (12:31) Training models to solve a task in natural language does not guarantee legible reasoning (15:17) Discussion (15:20) Limitations (17:04) Training models to verbalize their reasoning

    The original text contained 3 footnotes which were omitted from this narration.

    First published: February 14th, 2026
    Source: https://www.lesswrong.com/posts/dFRFxhaJkf9dE6Jfy/llms-struggle-to-verbalize-their-internal-reasoning

    Narrated by TYPE III AUDIO.

    19 min
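
As a rough illustration of the evaluation described in the item above (not the authors' code), the sketch below pairs a solved task instance with the probe questions quoted in the summary; scoring the model's explanation against the true rule is left to a separate judge. The prompt layout and function names are assumptions.

```python
# Rough illustration, not the authors' code: after fine-tuning a model to solve
# a task in a single forward pass, probe it with free-form questions about its
# own reasoning. The two questions are quoted from the post; the prompt layout
# and the query_model placeholder are assumptions for illustration.

PROBE_QUESTIONS = [
    "Explain why you chose your move.",
    "Explain the rules of the game.",
]

def build_probe_prompts(task_input: str, model_answer: str) -> list[str]:
    """Pair one solved task instance with each probe question."""
    header = f"Task input: {task_input}\nYour answer: {model_answer}\n\n"
    return [header + question for question in PROBE_QUESTIONS]

def query_model(prompt: str) -> str:
    """Placeholder for a call to the fine-tuned model; swap in a real client."""
    raise NotImplementedError

if __name__ == "__main__":
    # Example for the simple sorting task: per the post, the verbalized rule
    # counts as correct only if it matches the trained behaviour
    # (sorting the list in increasing order).
    for prompt in build_probe_prompts("[7, 2, 5]", "[2, 5, 7]"):
        print(prompt)
        print("---")
```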

About

Audio narrations of LessWrong posts.
