LessWrong (30+ Karma)

LessWrong

Audio narrations of LessWrong posts.

  1. 4 hours ago

    “Anthropic & Dario’s dream” by Simon Lermen

    Recently, Joe Carlsmith switched to work at Anthropic. He joins other members of the larger EA and Open Philanthropy ecosystem who are working at the AI lab, such as Holden Karnofsky. And of course many of the original founders were EA affiliated. In short, I think Anthropic is honest and is attempting to be an ethical AI lab, but they are deeply mistaken about the difficulty they are facing and are dangerously affecting the AI safety ecosystem. My guess is that Anthropic for the most part is actually being internally honest and not consciously trying to deceive people. When they say they believe in being responsible, I think that's what they genuinely believe. My criticism of Anthropic is based on them not having a promising plan and creating a dangerous counter-narrative to AI safety efforts. It's simply not enough to develop AI gradually, perform evaluations and do interpretability work to build safe superintelligence. With the methods we have, we're just not going to reach safe superintelligence. Gradual development (RSP) only has a small benefit: on a gradual scale, you may be able to see problems emerge, but it doesn't tell you how to solve them. The same goes for [...]

    Outline:
    (01:33) We only get one critical try to test our methods
    (03:12) Anything close to current methods won't be enough
    (05:44) Three Groups and the Counter-Narrative
    (07:32) Will Anthropic give us evidence to stop?

    First published: November 8th, 2025
    Source: https://www.lesswrong.com/posts/axDdnzckDqSjmpitu/anthropic-and-dario-s-dream

    Narrated by TYPE III AUDIO.

    9min
  2. 8 hours ago

    “13 Arguments About a Transition to Neuralese AIs” by Rauno Arike

    Over the past year, I have talked to several people about whether they expect frontier AI companies to transition away from the current paradigm of transformer LLMs toward models that reason in neuralese within the next few years. This post summarizes 13 common arguments I've heard, six in favor and seven against a transition to neuralese AIs. The following table provides a summary:

    For: A lot of information gets lost in text bottlenecks.
    Against: Natural language reasoning might be a strong local optimum that takes a lot of training effort to escape.

    For: The relative importance of post-training compared to pre-training is increasing.
    Against: Recurrent LLMs suffer from a parallelism trade-off that makes their training less efficient.

    For: There's an active subfield researching recurrent LLMs.
    Against: There's significant business value in being able to read a model's CoTs.

    For: Human analogy: natural language might not play that big of a role in human thinking.
    Against: Human analogy: even if natural language isn't humans' primary medium of thought, we still rely on it a lot.

    For: SGD inductive biases might favor directly learning good sequential reasoning algorithms in the weight space.
    Against: Though significant effort has been spent on getting neuralese models to work, we still have none that work [...]

    Outline:
    (00:49) What do I mean by neuralese?
    (02:07) Six arguments in favor of a transition to neuralese AIs
    (02:13) 1) A lot of information is lost in a text bottleneck
    (03:42) 2) The increasing importance of post-training
    (04:37) 3) Active research on recurrent LLMs
    (05:50) 4) Analogy with human thinking
    (08:03) 5) SGD inductive biases
    (08:34) 6) The limit of capabilities
    (08:54) Seven arguments against a transition to neuralese AIs
    (09:00) 1) The natural language sweet spot
    (10:59) 2) The parallelism trade-off
    (12:11) 3) Business value of visible reasoning traces
    (12:55) 4) Analogy with human thinking
    (13:47) 5) Evidence from past attempts to build recurrent LLMs
    (14:53) 6) The depth-latency trade-off
    (16:09) 7) Safety value of visible reasoning traces
    (16:38) Conclusion

    The original text contained 3 footnotes which were omitted from this narration.

    First published: November 7th, 2025
    Source: https://www.lesswrong.com/posts/zkccztuSjLshffrNr/13-arguments-about-a-transition-to-neuralese-ais

    Narrated by TYPE III AUDIO.
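    For readers who haven't met the term, here is a minimal sketch of the "text bottleneck" that the first pro-neuralese argument points to: a chain-of-thought step can pass forward only about log2(vocab) bits through a sampled token, while a "neuralese" step can hand its full hidden state to the next step. The toy model, its dimensions, and its variable names below are my own illustration, not anything from Arike's post.

    ```python
    # Toy illustration (not from the post) of the text bottleneck vs. neuralese.
    import numpy as np

    rng = np.random.default_rng(0)
    d_model, vocab = 16, 8                      # made-up, tiny dimensions
    embed = rng.normal(size=(vocab, d_model))   # token embedding table

    h = rng.normal(size=d_model)                # hidden state after one reasoning step

    # Text bottleneck: unembed, keep only the argmax token, then re-embed it.
    logits = embed @ h                          # (vocab,) score per token
    token = int(np.argmax(logits))              # discretization: everything else is dropped
    h_text = embed[token]                       # what the next step actually receives

    # Neuralese: the next step receives the hidden state itself, nothing discarded.
    h_neuralese = h

    print(f"token channel: ~{np.log2(vocab):.0f} bits per step")
    print(f"hidden-state channel: up to {h.size * 32} bits per step (float32)")
    print("reconstruction error through the token bottleneck:",
          float(np.linalg.norm(h - h_text)))
    ```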

    18min
  3. 8 hours ago

    “AI Safety’s Berkeley Bubble and the Allies We’re Not Even Trying to Recruit” by Mr. Counsel

    Epistemic status: outside view critique based on public discourse, some HQ/location discussion, and a bit of lived experience. I know there are exceptions and counterexamples; I'm arguing about the center of gravity and revealed incentives of the Bay/EA/safety cluster, not claiming omniscience about every individual.

    There's a scene near the end of Harry Potter and the Methods of Rationality that I have not been able to get out of my head. Voldemort has Harry fully under his power in the graveyard: stripped, surrounded by Death Eaters, locked in by fresh constraints. Before he moves forward with his plan for Harry and the protections around that, he pauses. He looks at his followers and asks whether anyone can see a flaw in what he's arranged. Whether he's overlooked anything important. And the Death Eaters just stand there. No one suggests a change. No one points out a flaw. Not because there's nothing to say, but because they're in an echo chamber: too similar, too deferential, too scared of contradicting the Dark Lord. Voldemort curses them for it. It's framed as a core failure mode of having a smart leader surrounded by people who are too similar and too deferential [...]

    First published: November 7th, 2025
    Source: https://www.lesswrong.com/posts/bToaBigbYSB8r8L7S/ai-safety-s-berkeley-bubble-and-the-allies-we-re-not-even

    Narrated by TYPE III AUDIO.

    21min
  4. 10 hours ago

    [Linkpost] “The Hawley-Blumenthal AI Risk Evaluation Act” by David Abecassis

    This is a link post. Views expressed here are those of the author.

    The Artificial Intelligence Risk Evaluation Act is an exciting step toward preventing catastrophic and existential risks from advanced artificial intelligence. This legislation creates a domestic institutional foundation which can support effective governance and provide the situational awareness required to stay on top of the rapidly changing AI landscape. There are a handful of small issues with the bill, but overall, it looks great to me. This short post will describe the bill and analyze its strengths and weaknesses.

    What Does the Bill Do?

    The bill requires AI developers to disclose information about their AI systems before they can be deployed. This information goes to a new "advanced AI evaluation program" within the Department of Energy (DOE) for analysis and to contribute toward recommendations for Congress. In this way, the bill is very forward-looking; it creates understanding today so that we can take action tomorrow. The disclosures must include detailed information required to carry out the evaluation program. This includes data, weights, architecture, and interface or implementation of the AI system. The final major section of the bill requires the creation of a comprehensive plan for permanent federal [...]

    Outline:
    (00:46) What Does the Bill Do?
    (01:32) Reasons I'm Excited About The Bill
    (03:05) Opportunities for Developing the Bill
    (04:34) A Great First Step

    First published: November 7th, 2025
    Source: https://www.lesswrong.com/posts/cyZCi8Hmjdtrcz8eB/the-hawley-blumenthal-ai-risk-evaluation-act
    Linkpost URL: https://techgov.intelligence.org/blog/the-hawley-blumenthal-ai-risk-evaluation-act

    Narrated by TYPE III AUDIO.

    6min
  5. 10 hours ago

    “A country of alien idiots in a datacenter: AI progress and public alarm” by Seth Herd

    Epistemic status: I'm pretty sure AI will alarm the public enough to change the alignment challenge substantially. I offer my mainline scenario as an intuition pump, but I expect it to be wrong in many ways, some important. Abstract arguments are in the Race Conditions and concluding sections.

    Nora has a friend in her phone. Her mom complains about her new AI "colleagues." Things have gone much as expected in late 2025; transformative AGI isn't here yet, and LLM agents have gone from useless to merely incompetent. Nora thinks her AI friend is fun. Her parents think it's healthy and educational. Their friends think it's dangerous and creepy, but their kids are sneaking sleazy AI boyfriends. All of them know people who fear losing their job to AI. Humanity is meeting a new species, and most of us dislike and distrust it. This could shift the playing field for alignment dramatically. Or takeover-capable AGI like Agent-4 from AI 2027 could be deployed before public fears impact policy and decisions.

    Alarming incompetence

    Public attitudes toward AI have transformed like they did for COVID between February and March of 2020. The risks and opportunities seem much more immediate [...]

    Outline:
    (01:21) Alarming incompetence
    (04:07) Race conditions
    (06:39) Incompetent AI spreads alarm by default
    (10:24) Resonances on the public stage
    (13:00) Impacts on risk awareness, funding, and policy
    (14:51) Concluding thoughts and questions

    First published: November 7th, 2025
    Source: https://www.lesswrong.com/posts/qxmAqMAjxnhkzt6aF/a-country-of-alien-idiots-in-a-datacenter-ai-progress-and

    Narrated by TYPE III AUDIO.

    17min
  6. 12 hours ago

    “Two easy digital intentionality practices” by mingyuan

    A lot of people are daunted by the idea of doing a full digital declutter. Those people ask me all the time, "isn't there something easier I can do that will still give me some of those sweet sweet benefits you were talking about?" The answer is: sort of. The longer answer is: I think that if you're serious about wanting to change your digital habits, you will eventually need to do something higher-effort. That's because behavior change is hard, especially when you are fighting against not only your brain's ingrained patterns, but also external forces that are constantly pulling at your attention. But I still want to have something for those people who are not ready or willing to commit to anything big right now. So, here are two things you can try in your everyday life, with no additional preparation. I chose these because they don't require any sustained willpower or attention. You only have to remember to do it once, and then you're doing it.

    Go for a walk without your phone

    Pick a time when you have nothing you need to be doing. You're not waiting to hear from anyone, and there's nowhere you need [...]

    Outline:
    (01:05) Go for a walk without your phone
    (02:16) Switch phones with the person you're with

    The original text contained 2 footnotes which were omitted from this narration.

    First published: November 7th, 2025
    Source: https://www.lesswrong.com/posts/4PC7dxLr4Edg63o3k/two-easy-digital-intentionality-practices

    Narrated by TYPE III AUDIO.

    4min
  7. 12 hours ago

    “Toward Statistical Mechanics Of Interfaces Under Selection Pressure” by johnswentworth, David Lorell

    Audio note: this article contains 36 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.

    Imagine using an ML-like training process to design two simple electronic components, in series. The parameters θ^1 control the function performed by the first component, and the parameters θ^2 control the function performed by the second component. The whole thing is trained so that the end-to-end behavior is that of a digital identity function: voltages close to logical 1 are sent close to logical 1, voltages close to logical 0 are sent close to logical 0.

    Background: Signal Buffering

    We're imagining electronic components here because, for those with some electronics background, I want to summon to mind something like this:

    This electronic component is called a signal buffer. Logically, it's an identity function: it maps 0 to 0 and 1 to 1. But crucially, it maps a wider range of logical-0 voltages to a narrower (and lower) range of logical-0 voltages, and correspondingly for logical-1. So if noise in the circuit upstream might make a logical-1 voltage a little too low or a logical-0 voltage a little too [...]

    Outline:
    (01:09) Background: Signal Buffering
    (02:26) Back To The Original Picture: Introducing Interfaces
    (05:58) The Stat Mech Part
    (07:50) Why Is This Interesting?

    First published: November 6th, 2025
    Source: https://www.lesswrong.com/posts/r3PMS8wXGviHEvz5Z/toward-statistical-mechanics-of-interfaces-under-selection

    Narrated by TYPE III AUDIO.

    Image from the article: θ^1 chooses the function performed by the first component, θ^2 chooses the function performed by the second component; the colored curves show some possible functions for the two components. The whole system is trained to have a particular end-to-end behavior.

    Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
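    To make the training setup in the excerpt concrete, here is a minimal, dependency-free sketch of two parametrized components in series, trained end to end to act as a digital identity. The sigmoid-style parametrization, the noise levels, and the finite-difference training loop are my own assumptions for illustration; they are not the authors' actual construction.

    ```python
    # Sketch (my assumptions, not the authors') of two components in series,
    # trained so the composite maps noisy logical-0 voltages near 0 and
    # noisy logical-1 voltages near 1.
    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def component(v, theta):
        # One "component": a gain and offset followed by a soft threshold.
        gain, offset = theta
        return sigmoid(gain * (v - offset))

    def end_to_end(v, theta1, theta2):
        return component(component(v, theta1), theta2)

    def loss(params, v, target):
        theta1, theta2 = params[:2], params[2:]
        return np.mean((end_to_end(v, theta1, theta2) - target) ** 2)

    # Training data: noisy logical levels, 0 +/- 0.2 and 1 +/- 0.2.
    v = np.concatenate([rng.normal(0.0, 0.2, 200), rng.normal(1.0, 0.2, 200)])
    target = np.concatenate([np.zeros(200), np.ones(200)])

    params = np.array([1.0, 0.5, 1.0, 0.5])  # [gain1, offset1, gain2, offset2]
    lr, eps = 0.5, 1e-5
    for _ in range(2000):
        # Crude central-difference gradient, to keep the sketch dependency-free.
        grad = np.zeros_like(params)
        for i in range(len(params)):
            bump = np.zeros_like(params)
            bump[i] = eps
            grad[i] = (loss(params + bump, v, target)
                       - loss(params - bump, v, target)) / (2 * eps)
        params -= lr * grad

    theta1, theta2 = params[:2], params[2:]
    print("trained loss:", loss(params, v, target))
    print("0.15 ->", end_to_end(0.15, theta1, theta2),
          "   0.85 ->", end_to_end(0.85, theta1, theta2))
    ```

    Like the signal buffer described above, the trained composite squeezes a wide band of input voltages toward the two logical levels; per the outline, the post then introduces interfaces between such components and a statistical-mechanics treatment of them.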

    9min
