LessWrong (30+ Karma)

LessWrong

0.0 (0)
TECHNOLOGY
UPDATED DAILY

Audio narrations of LessWrong posts.

2H AGO

“An Open Letter to the Department of War and Congress” by Gordon Seidoh Worley

There's an open letter opposing the DoW's assignment of Anthropic as a supply chain risk. It needs more signatures before it gets sent. If you're a tech founder, engineer, or investor and you agree with the letter, then your signature would help bolster its message. Your signature would be especially impactful if you work at a non-Anthropic SOTA AI lab. You can sign the letter here: https://app.dowletter.org/ Full text follows: We write as founders, engineers, investors, and executives in the American technology industry. We strongly believe the federal government should not retaliate against a private company for declining to accept changes to a contract. When two parties cannot agree on terms, the normal course is to part ways and work with a competitor. Instead, the Department of War has designated Anthropic a “supply chain risk” (a label normally reserved for foreign adversaries), stating that “no contractor, supplier, or partner that does business with the United States military may conduct any commercial activity with Anthropic.” This situation sets a dangerous precedent. Punishing an American company for declining to accept changes to a contract sends a clear message to every technology company in America: accept whatever terms the government demands, or [...] --- First published: March 1st, 2026 Source: https://www.lesswrong.com/posts/fETA2GwdgTs7CjXfy/an-open-letter-to-the-department-of-war-and-congress --- Narrated by TYPE III AUDIO.

2 min
4H AGO

“Introducing and Deprecating WoFBench” by jefftk

We present and formally deprecate WoFBench, a novel test that compares the knowledge of Wings of Fire superfans to frontier AI models. The benchmark showed initial promise as a challenging evaluation, but unfortunately proved to be saturated on creation as AI models and superfans produced output that was, to the extent of our ability to score responses, statistically indistinguishable from entirely correct. Benchmarks are important tools for tracking the rapid advancements in model capabilities, but they are struggling to keep up with LLM progress: frontier models now consistently achieve high scores on many popular benchmarks, raising questions about their continued ability to differentiate between models. In response, we introduce WoFBench, an evaluation suite designed to test recall and knowledge synthesis in the domain of Tui T. Sutherland's Wings of Fire universe. The superfans were identified via a careful search process, in which all members of the lead author's household were asked to complete a self-assessment of their knowledge of the Wings of Fire universe. The assessment consisted of a single question, with the text "do you think you know the Wings of Fire universe better than Gemini?" Two superfans were identified, who we keep [...] --- First published: March 1st, 2026 Source: https://www.lesswrong.com/posts/YshqDtyzgWaJxthTo/introducing-and-deprecating-wofbench --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

6 min
6H AGO

“I’m Bearish On Personas For ASI Safety” by J Bostock

TL;DR Your base LLM has no examples of superintelligent AI in its training data. When you RL it into superintelligence, it will have to extrapolate to how a superintelligent Claude would behave. The LLM's extrapolation may not converge on optimizing for what humanity would, on reflection, like to optimize, because these are different processes with different inductive biases. Intro I'm going to take the Persona Selection Model as being roughly true, for now. Even on its own terms, it will fail. If the Persona Selection Model is false, we die in a different way. I'm going to present some specific arguments and scenarios, but the core of it is a somewhat abstract point: the Claude persona, although it currently behaves in a human-ish way, will not grow into a superintelligence in the same way that humans would. This means it will not grow into the same kind of superintelligence with the same values that human values would converge on. Since value is fragile, this is fatal for the future. I don't think this depends on the specifics of Claude's training, nor how human values are instantiated, unless Claude's future training methods are specifically designed to work in the exact [...] --- Outline: (00:10) TL;DR (00:36) Intro (01:35) LLMs (01:39) Persona Selection and Other Models (03:06) Persona Theory As Alignment Plan (04:21) Gears of Personas (07:21) Complications (08:20) Reasoning and Chain-of-thought (10:00) Reinforcement Learning (11:40) Humans (11:43) Human Values (12:00) TL;DR (12:30) Goal-Models and Inductors (13:49) These Are Not The Same (16:34) Final Thoughts The original text contained 7 footnotes which were omitted from this narration. --- First published: March 1st, 2026 Source: https://www.lesswrong.com/posts/fMgE3E54PdDcZhvm6/i-m-bearish-on-personas-for-asi-safety --- Narrated by TYPE III AUDIO.

18 min
9H AGO

“Coherent Care” by abramdemski

I've been trying to gather my thoughts for my next tiling theorem (agenda write-up here; first paper; second paper; recent project update). I have a lot of ideas for how to improve upon my work so far, and trying to narrow them down to an achievable next step has been difficult. However, my mind keeps returning to specific friends who are not yet convinced of Updateless Decision Theory (UDT). I am not out to argue that UDT is the perfect decision theory; see e.g. here and here. However, I strongly believe that those who don't see the appeal of UDT are missing something. My plan for the present essay is not to simply argue for UDT, but it is close to that: I'll give my pro-UDT arguments very carefully, so as to argue against naively updateful theories (CDT and EDT) while leaving room for some forms of updatefulness. The ideas here are primarily inspired by Decisions are for making bad outcomes inconsistent; I think the discussion there has the seeds of a powerful argument. My motivation for working on these ideas goes through AI Safety, but all the arguments in this particular essay will be from a purely love-of-knowledge [...] --- Outline: (03:57) Advice (05:46) Example 1: Transparent Newcomb (09:45) Example 2: Smoking Lesion (12:05) Design (14:37) Observation Calibration (16:46) Subjective State Calibration (21:48) Is calibration a reasonable requirement? (24:37) What do we do with miscalibrated cases? (26:10) Naturalism (29:20) Conclusion The original text contained 7 footnotes which were omitted from this narration. --- First published: February 27th, 2026 Source: https://www.lesswrong.com/posts/CDkbYSFTwggGE8mWp/coherent-care --- Narrated by TYPE III AUDIO.

31 min
15H AGO

[Linkpost] ““Fibbers’ forecasts are worthless” (The D-Squared Digest One Minute MBA – Avoiding Projects Pursued By Morons 101)” by Random Developer

This is a link post. One of the very admirable things about the LessWrong community is their willingness to take arguments very seriously, regardless of who put that argument forward. In many circumstances, this is an excellent discipline! But if you're acting as a manager (or a voter), you often need to consider not just arguments, but also practical proposals made by specific agents: Should X be allowed to pursue project Y? Should I make decisions based on X claiming Z, when I cannot verify Z myself? One key difference is that these are not abstract arguments. They're practical proposals involving some specific entity X. And in cases like this, the credibility of X becomes relevant: Will X pursue project Y honestly and effectively? Is X likely to make accurate statements about Z? And in these cases, ignoring the known truthfulness of X can be a mistake. My thinking on this matter was influenced by a classic 2004 post by Dan Davies, The D-Squared Digest One Minute MBA – Avoiding Projects Pursued By Morons 101: Anyway, the secret to every analysis I’ve ever done of contemporary politics has been, more or [...] --- First published: February 28th, 2026 Source: https://www.lesswrong.com/posts/cXDY9XBm5Wxzort29/fibbers-forecasts-are-worthless-the-d-squared-digest-one Linkpost URL:https://dsquareddigest.wordpress.com/2004/05/27/108573518762776451/ --- Narrated by TYPE III AUDIO.

5 min
1D AGO

“Schelling Goodness, and Shared Morality as a Goal” by Andrew_Critch

Also available in markdown at theMultiplicity.ai/blog/schelling-goodness. This post explores a notion I'll call Schelling goodness. Claims of Schelling goodness are not first-order moral verdicts like "X is good" or "X is bad." They are claims about a class of hypothetical coordination games in the sense of Thomas Schelling, where the task being coordinated on is a moral verdict. In each such game, participants aim to give the same response regarding a moral question, by reasoning about what a very diverse population of intelligent beings would converge on, using only broadly shared constraints: common knowledge of the question at hand, and background knowledge from the survival and growth pressures that shape successful civilizations. Unlike many Schelling coordination games, we'll be focused on scenarios with no shared history or knowledge amongst the participants, other than being from successful civilizations. Importantly: To say "X is Schelling-good" is not at all the same as saying "X is good". Rather, it will be defined as a claim about what a large class of agents would say, if they were required to choose between saying "X is good" and "X is bad" and aiming for a mutually agreed-upon answer. This distinction is crucial [...] --- Outline: (01:59) This essay is not very skimmable (03:44) Pro tanto morals, is good, and is bad (06:39) Part One: The Schelling Participation Effect (13:52) What makes it work (15:50) The Schelling transformation on questions (19:10) Part Two: Schelling morality via the cosmic Schelling population (21:12) Scale-invariant adaptations (22:54) An example: stealing (30:32) Recognition versus endorsement versus adherence (31:34) The answer frequencies versus the answer (33:59) Ties are rare (35:06) Is the cosmic Schelling answer ever knowable with confidence? (36:02) Schelling participation effects, revisited (38:03) Is this just the mind projection fallacy? (39:42) When are cosmic Schelling morals easy to identify? 
(42:59) Scale invariance revisited (44:03) A second example: Pareto-positive trade (47:45) Harder questions and caveats (50:01) Ties are unstable (51:43) Isn't this assuming moral realism? (53:07) Don't these results depend on the distribution over beings? (54:41) What about the is-ought gap? (56:29) Tolerance, local variation, and freedom (58:25) Terrestrial Schelling-goodness (59:42) So what does good mean, again? (01:01:08) Implications for AI alignment (01:06:15) Conclusion and historical context (01:09:16) FAQ (01:09:20) Basic misunderstandings (01:12:20) More nuanced questions --- First published: February 28th, 2026 Source: https://www.lesswrong.com/posts/TkBCR8XRGw7qmao6z/schelling-goodness-and-shared-morality-as-a-goal --- Narrated by TYPE III AUDIO.

1h 15m
2D AGO

“Anthropic and the DoW: Anthropic Responds” by Zvi

The Department of War gave Anthropic until 5:01pm on Friday the 27th to either give the Pentagon ‘unfettered access’ to Claude for ‘all lawful uses,’ or else. With the ‘or else’ being not the sensible ‘okay we will cancel the contract then’ but also expanding to either being designated a supply chain risk or having the government invoke the Defense Production Act. It is perfectly legitimate for the Department of War to decide that it does not wish to continue on Anthropic's terms, and that it will terminate the contract. There is no reason things need be taken further than that. Undersecretary of State Jeremy Lewin: This isn’t about Anthropic or the specific conditions at issue. It's about the broader premise that technology deeply embedded in our military must be under the exclusive control of our duly elected/appointed leaders. No private company can dictate normative terms of use—which can change and are subject to interpretation—for our most sensitive national security systems. The @DeptofWar obviously can’t trust a system a private company can switch off at any moment. Timothy B. Lee: OK, so don’t renew their contract. Why are you threatening to go nuclear by declaring them [...] 
--- Outline: (08:00) Good News: We Can Keep Talking (10:31) Once Again No You Do Not Need To Call Dario For Permission (15:22) The Pentagon Reiterates Its Demands And Threats (16:48) The Pentagon's Dual Threats Are Contradictory and Incoherent (18:27) The Pentagon's Position Has Unfortunate Implications (20:25) OpenAI Stands With Anthropic (22:48) xAI Stands On Unreliable Ground (25:25) Replacing Anthropic Would At Least Take Months (26:02) We Will Not Be Divided (27:50) This Risks Driving Other Companies Away (30:32) Other Reasons For Concern (32:10) Wisdom From A Retired General (35:06) Congress Urges Restraint (37:05) Reaction Is Overwhelmingly With Anthropic On This (40:52) Some Even More Highly Unhelpful Rhetoric (47:23) Other Summaries and Notes (48:32) Paths Forward --- First published: February 27th, 2026 Source: https://www.lesswrong.com/posts/ppj7v4sSCbJjLye3D/anthropic-and-the-dow-anthropic-responds --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

50 min
2D AGO

“Getting Back To It” by sarahconstantin

Artist: Lily Taylor It's been a while since I’ve written anything lately, and that doesn’t feel good. My writing voice has always been loadbearing to my identity, and if I don’t have anything to say, if I’m not “appearing in public”, it's a little bit destabilizing. Invisibility can be comfortable (and I’m less and less at home with the aggressive side of online discourse these days) but it's also a little bit of a cop-out. The fact is, I’ve been hiding. It feels like “writer's block” or like I “can’t think of anything to say”, but obviously that's suspect, and the real thing is that I can’t think of anything to say that's impeccable and beyond reproach and definitely won’t get criticized. Also, it's clearly a vicious cycle; the less I participate in public life, the fewer discussions I’m part of, and the fewer opportunities I have to riff off of what other people are saying. Life Stuff So what have I been up to? Well, for one thing, I had a baby. This is Bruce. He is very good. For another, I’ve been job hunting. Solo consulting was fun, but I wasn’t getting many clients, and [...] --- Outline: (01:04) Life Stuff (02:27) Projects (03:54) 25. Miscellaneous Opinions --- First published: February 26th, 2026 Source: https://www.lesswrong.com/posts/AYgby4f8EwhABX54q/getting-back-to-it --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

14 min


Creator: LessWrong
Years Active: 2023 - 2026
Episodes: 250
Rating: Clean
Show Website: LessWrong (30+ Karma)
