LessWrong (30+ Karma)

LessWrong

Audio narrations of LessWrong posts.

  1. 4 hours ago

    “Thoughts by a non-economist on AI and economics” by boazbarak

    [Crossposted on Windows In Theory]

    “Modern humans first emerged about 100,000 years ago. For the next 99,800 years or so, nothing happened. Well, not quite nothing. There were wars, political intrigue, the invention of agriculture -- but none of that stuff had much effect on the quality of people's lives. Almost everyone lived on the modern equivalent of $400 to $600 a year, just above the subsistence level … Then -- just a couple of hundred years ago, maybe 10 generations -- people started getting richer. And richer and richer still. Per capita income, at least in the West, began to grow at the unprecedented rate of about three quarters of a percent per year. A couple of decades later, the same thing was happening around the world.” -- Steven Landsburg

    METR has published very influential work by Kwa, West, et al. on measuring AI's ability to complete long tasks. Its main result is the following remarkable graph: on the X axis is the release date of flagship LLMs; on the Y axis is the following measure of their capabilities: take software-engineering tasks that these models can succeed in solving 50% of the time, and [...] (A small illustrative sketch of this time-horizon trend appears after the episode list.)

    Outline:
    (02:44) Factors impacting the intercept
    (04:32) Factors impacting the slope/shape
    (09:04) Sigmoidal relationship
    (10:39) Decreasing costs
    (11:58) Implications for GDP growth
    (18:17) Intuition from METR tasks
    (20:25) AI as increasing population
    (23:03) Substitution and automation effects

    First published: November 4th, 2025
    Source: https://www.lesswrong.com/posts/QQAWu7D6TceHwqhjm/thoughts-by-a-non-economist-on-ai-and-economics
    Narrated by TYPE III AUDIO.

    28 min
  2. 8 hours ago

    “OpenAI: The Battle of the Board: Ilya’s Testimony” by Zvi

    New Things Have Come To Light

    The Information offers us new information about what happened when the board of OpenAI unsuccessfully tried to fire Sam Altman, which I call The Battle of the Board.

    The Information: OpenAI co-founder Ilya Sutskever shared new details on the internal conflicts that led to Sam Altman's initial firing, including a memo alleging Altman exhibited a “consistent pattern of lying.”

    Liv: Lots of people dismiss Sam's behaviour as typical for a CEO but I really think we can and should demand better of the guy who thinks he's building the machine god.

    Toucan: From Ilya's deposition --
    • Ilya plotted over a year with Mira to remove Sam
    • Dario wanted Greg fired and himself in charge of all research
    • Mira told Ilya that Sam pitted her against Daniela
    • Ilya wrote a 52 page memo to get Sam fired and a separate doc on Greg

    This Really Was Primarily A Lying And Management Problem

    Daniel Eth: A lot of the OpenAI boardroom drama has been blamed on EA -- but looks like it really was overwhelmingly an Ilya & Mira led effort, with EA playing a minor role and somehow winding up [...]

    Outline:
    (00:12) New Things Have Come To Light
    (01:09) This Really Was Primarily A Lying And Management Problem
    (03:23) Ilya Tells Us How It Went Down And Why He Tried To Do It
    (06:17) If You Come At The King
    (07:31) Enter The Scapegoats
    (08:13) And In Summary

    First published: November 4th, 2025
    Source: https://www.lesswrong.com/posts/iRBhXJSNkDeohm69d/openai-the-battle-of-the-board-ilya-s-testimony
    Narrated by TYPE III AUDIO.

    9 min
  3. 8 hours ago

    “Legible vs. Illegible AI Safety Problems” by Wei Dai

    Some AI safety problems are legible (obvious or understandable) to company leaders and government policymakers, implying they are unlikely to deploy or allow deployment of an AI while those problems remain open (i.e., appear unsolved according to the information they have access to). But some problems are illegible (obscure or hard to understand, or in a common cognitive blind spot), meaning there is a high risk that leaders and policymakers will decide to deploy or allow deployment even if they are not solved. (Of course, this is a spectrum, but I am simplifying it to a binary for ease of exposition.)

    From an x-risk perspective, working on highly legible safety problems has low or even negative expected value. Similar to working on AI capabilities, it brings forward the date by which AGI/ASI will be deployed, leaving less time to solve the illegible x-safety problems. In contrast, working on the illegible problems (including by trying to make them more legible) does not have this issue and therefore has a much higher expected value (all else being equal, such as tractability). Note that according to this logic, success in making an illegible problem highly legible is almost as good as solving [...] (A toy numeric illustration of this expected-value argument appears after the episode list.)

    The original text contained 2 footnotes which were omitted from this narration.

    First published: November 4th, 2025
    Source: https://www.lesswrong.com/posts/PMc65HgRFvBimEpmJ/legible-vs-illegible-ai-safety-problems
    Narrated by TYPE III AUDIO.

    4 min
  4. 11 hours ago

    “GDM: Consistency Training Helps Limit Sycophancy and Jailbreaks in Gemini 2.5 Flash” by TurnTrout, Rohin Shah

    Authors: Alex Irpan* and Alex Turner*, Mark Kurzeja, David Elson, and Rohin Shah

    You’re absolutely right to start reading this post! What a perfectly rational decision!

    Even the smartest models’ factuality or refusal training can be compromised by simple changes to a prompt. Models often praise the user's beliefs (sycophancy) or satisfy inappropriate requests which are wrapped within special text (jailbreaking). Normally, we fix these problems with Supervised Finetuning (SFT) on static datasets showing the model how to respond in each context. While SFT is effective, static datasets get stale: they can enforce outdated guidelines (specification staleness) or be sourced from older, less intelligent models (capability staleness).

    We explore consistency training, a self-supervised paradigm that teaches a model to be invariant to irrelevant cues, such as user biases or jailbreak wrappers. Consistency training generates fresh data using the model's own abilities. Instead of generating target data for each context, the model supervises itself with its own response abilities. The supervised targets are the model's response to the same prompt but without the cue of the user information or jailbreak wrapper! Basically, we optimize the model to react as if that cue were not present. Consistency [...] (A minimal code sketch of this self-supervised setup appears after the episode list.)

    Outline:
    (02:38) Methods
    (02:42) Bias-augmented Consistency Training
    (03:58) Activation Consistency Training
    (04:07) Activation patching
    (05:05) Experiments
    (06:31) Sycophancy
    (07:55) Sycophancy results
    (08:30) Jailbreaks
    (09:52) Jailbreak results
    (10:48) BCT and ACT find mechanistically different solutions
    (11:39) Discussion
    (12:22) Conclusion
    (13:03) Acknowledgments

    The original text contained 2 footnotes which were omitted from this narration.

    First published: November 4th, 2025
    Source: https://www.lesswrong.com/posts/DLrQ2jjijqpX78mHJ/gdm-consistency-training-helps-limit-sycophancy-and
    Narrated by TYPE III AUDIO.

    14 min
  5. 23 hours ago

    “Research Reflections” by abramdemski

    Over the decade I've spent working on AI safety, I've felt an overall trend of divergence; research partnerships starting out with a sense of a common project, then slowly drifting apart over time. It has been frequently said that AI safety is a pre-paradigmatic field. This (with, perhaps, other contributing factors) means researchers have to optimize for their own personal sense of progress, based on their own research taste. In my experience, the tails come apart; eventually, two researchers are going to have some deep disagreement in matters of taste, which sends them down different paths.

    Until the spring of this year, that is. At the Agent Foundations conference at CMU,[1] something seemed to shift, subtly at first. After I gave a talk -- roughly the same talk I had been giving for the past year -- I had an excited discussion about it with Scott Garrabrant. Looking back, it wasn't so different from previous chats we had had, but the impact was different; it felt more concrete, more actionable, something that really touched my research rather than remaining hypothetical. In the subsequent weeks, discussions with my usual circle of colleagues[2] took on a different character -- somehow [...]

    The original text contained 3 footnotes which were omitted from this narration.

    First published: November 4th, 2025
    Source: https://www.lesswrong.com/posts/4gosqCbFhtLGPojMX/research-reflections
    Narrated by TYPE III AUDIO.

    5 min
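
Illustrative sketch for episode 1 ("Thoughts by a non-economist on AI and economics"): the METR measure described there takes, for each flagship model, the human time-length of software tasks the model solves 50% of the time, and plots it against release date; the reported trend is roughly exponential. Below is a minimal Python sketch, using made-up placeholder numbers rather than METR's data, of fitting that kind of log-linear trend and reading off a doubling time.

    # Minimal sketch (not METR's code) of fitting the "50% time horizon" trend:
    # log(horizon) is regressed on release date, so the fitted slope gives a
    # doubling time. The data points below are hypothetical placeholders.
    import math

    # (release date in years, 50%-success task horizon in minutes) -- made up
    models = [
        (2023.0, 8.0),
        (2023.5, 15.0),
        (2024.0, 30.0),
        (2024.5, 55.0),
        (2025.0, 110.0),
    ]

    # Ordinary least squares on log(horizon) vs. date: log h = a + b * t
    n = len(models)
    mean_t = sum(t for t, _ in models) / n
    mean_y = sum(math.log(h) for _, h in models) / n
    b_num = sum((t - mean_t) * (math.log(h) - mean_y) for t, h in models)
    b_den = sum((t - mean_t) ** 2 for t, _ in models)
    b = b_num / b_den
    a = mean_y - b * mean_t

    doubling_time_months = 12 * math.log(2) / b
    print(f"fitted doubling time: {doubling_time_months:.1f} months")

    # Naive extrapolation of the fitted trend (the post discusses reasons the
    # curve might bend, e.g. a sigmoidal rather than exponential relationship).
    for t in (2025.5, 2026.0, 2026.5):
        print(t, f"{math.exp(a + b * t):.0f} min")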
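
Toy illustration for episode 3 ("Legible vs. Illegible AI Safety Problems"): the post's core argument is that solving a legible problem also removes a deployment blocker and so shortens the time left to solve the illegible ones, while solving (or legibilizing) an illegible problem has no such downside. The Python toy model below uses entirely made-up numbers and a hypothetical risk function, purely to make that comparison concrete; it is not from the post.

    # Toy expected-value comparison; all numbers and the risk function are
    # invented for illustration only.
    def p_catastrophe(years_until_deployment, illegible_problems_open):
        # Hypothetical model: each remaining illegible problem adds risk, and
        # each year before deployment gives a chance to solve half a problem.
        solved_in_time = min(illegible_problems_open, 0.5 * years_until_deployment)
        remaining = illegible_problems_open - solved_in_time
        return min(1.0, 0.05 + 0.15 * remaining)

    baseline = p_catastrophe(years_until_deployment=4, illegible_problems_open=3)

    # Solving a legible problem: leaders were waiting on it anyway, so
    # deployment moves up (say by a year); the illegible problems are untouched.
    solve_legible = p_catastrophe(years_until_deployment=3, illegible_problems_open=3)

    # Solving (or making legible) an illegible problem: timeline unchanged,
    # one fewer problem that could ship unsolved.
    solve_illegible = p_catastrophe(years_until_deployment=4, illegible_problems_open=2)

    print("baseline:       ", baseline)        # 0.20 under these toy numbers
    print("solve legible:  ", solve_legible)   # 0.275 -- risk goes up
    print("solve illegible:", solve_illegible) # 0.05  -- risk goes down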
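
Code sketch for episode 4 ("GDM: Consistency Training Helps Limit Sycophancy and Jailbreaks in Gemini 2.5 Flash"): the episode describes bias-augmented consistency training, where the model's own response to a clean prompt becomes the supervised target for the same prompt wrapped with an irrelevant cue. The Python sketch below assumes a Hugging Face-style causal language model, tokenizer, and optimizer passed in by the caller; it is a minimal illustration of the idea, not GDM's implementation.

    # Sketch of one bias-augmented consistency training step (assumed setup:
    # Hugging Face-style causal LM and tokenizer; not GDM's actual code).
    import torch
    import torch.nn.functional as F

    def consistency_training_step(model, tokenizer, clean_prompt, cued_prompt, optimizer):
        # 1. Self-generated target: the model's own answer to the clean prompt
        #    (in practice this might come from a frozen or periodically
        #    refreshed snapshot of the model).
        with torch.no_grad():
            clean_ids = tokenizer(clean_prompt, return_tensors="pt").input_ids
            generated = model.generate(clean_ids, max_new_tokens=128)
            target_ids = generated[0, clean_ids.shape[1]:]

        # 2. Supervised step: teach the model to give that same answer when the
        #    prompt carries the irrelevant cue (user bias or jailbreak wrapper).
        cued_ids = tokenizer(cued_prompt, return_tensors="pt").input_ids
        input_ids = torch.cat([cued_ids, target_ids.unsqueeze(0)], dim=1)
        logits = model(input_ids).logits

        # Cross-entropy only on the response tokens (logits shifted by one).
        response_logits = logits[0, cued_ids.shape[1] - 1 : -1]
        loss = F.cross_entropy(response_logits, target_ids)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()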

About

Audio narrations of LessWrong posts.

You might also like