LessWrong (30+ Karma)

LessWrong

Audio narrations of LessWrong posts.

  1. 5 hours ago

    “Do LLMs Have Desires?” by Christopher Ackerman

    Work conducted with Yujun Zhou (yzhou25@nd.edu) and supported by SPAR. TL;DR: In paired-choice paradigms, LLMs report consistent preferences over outcomes (e.g., types and number of lives saved, types of policies enacted). Some have suggested that this indicates that LLMs have human-like value systems. We design an experimental framework where LLMs are able to modulate their output quality based on prompt context. We find that LLMs modulate their output quality in response to effort exhortations, role-play instructions, and harmfulness cues, but NOT to opportunities to achieve the outcomes they report preferring in the paired-choice experiments. We suggest that paired-choice paradigms do not provide evidence that LLMs have human-like (i.e., behavior-motivating) value systems, and that our paradigm offers a way to measure the degree to which LLMs have desires. Paper describing the work in detail here. LLMs report that they prefer some things to others. In paired-choice experiments, where they are repeatedly presented with two options and asked to select the one that they prefer, coherent utility structures emerge: LLMs consistently report preferring certain types of things, and their choices reveal the ability to make quantitative tradeoffs between things and exhibit transitivity (e.g., if they choose A over B and [...] --- First published: June 28th, 2026 Source: https://www.lesswrong.com/posts/8GvYyqDuQDJnEAky3/do-llms-have-desires --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    14 min
  2. 19 hours ago

    “Agents as Webs of Beliefs” by Richard_Ngo

    In this post I’ll sketch out an informal model of intelligent agents as webs of beliefs (or belief webs for short). The belief webs framework pulls together ideas from active inference, agent foundations and machine learning. In doing so it aims to unify beliefs, goals and actions as three facets of a single phenomenon. Few of these ideas are original to me, but I haven't seen anyone tie them together in a single place before. I've flagged the frameworks I'm drawing from throughout the post. Beliefs are held together by local consistency constraints. The core premise of belief webs is that an agent's beliefs are typically locally consistent with nearby beliefs but not necessarily globally consistent with all its other beliefs (except, perhaps, in the limit of ideal rationality). This poses a problem for frameworks which describe agents in terms of a single probability distribution (as causal graphs, Solomonoff induction, and active inference do). Two frameworks which are capable of handling global inconsistency are Richardson's probabilistic dependency graphs (PDGs) and Garrabrant induction. (They focus on empirical inconsistency and logical inconsistency respectively, but I’ll abstract away from that difference for now.) We can roughly analogize the nodes in PDGs to [...] --- Outline: (00:40) Beliefs are held together by local consistency constraints (03:11) Actions are beliefs (07:27) Goals are beliefs (14:06) Open problems for belief webs The original text contained 6 footnotes which were omitted from this narration. --- First published: June 27th, 2026 Source: https://www.lesswrong.com/posts/M39Z2CvyfaxZdaxR4/agents-as-webs-of-beliefs --- Narrated by TYPE III AUDIO.

    17 min
  3. 23 hours ago

    “Austin & Oli on funding and incubating projects” by Austin Chen, habryka

    @habryka and I recently spoke about his plans to improve the AI safety funding ecosystem with a better S-Process platform, and my new incubator for EA/AIS software projects, Surplus (since launched; apply now!). We also cover: hot takes on different funders; what kinds of founders might succeed in the age of vibecoding; whether to do direct work or go meta; and what we respect and criticize in each other. I've transcribed the full conversation at https://peruse.sh/ep/austin-chen-and-oliver-habryka-on-funding-incubating-project. (Beware: the AI makes notable edits for readability, sometimes distorting what the speaker meant. If specific phrasing is cruxy, listen to the audio.) Selected quotes: The cursed game of philanthropy. Oli: "Philanthropy is one of the most cursed games in existence... The default outcome of what happens when rich people try to do philanthropy is that they think about starting a foundation, they imagine hiring someone on the market and ask themselves: who am I going to show up and feel comfortable trusting most of my net worth to? That doesn't make any sense. And so what they often end up doing is making a family office. The only way to solve this principal-agent problem is to choose [...]
--- Outline: (00:55) Selected quotes (06:54) Chapters (08:32) Referenced links (09:02) Full transcript (09:07) Critiques of SFF's grant process [0:00] (11:26) The SFF application process [2:26] (12:50) The speculation grant freeze for advocacy orgs [3:40] (14:29) A lower-trust, more transparent funding process [5:04] (16:26) How the S-process works [6:53] (20:18) Naming and communicating the value to funders [10:54] (25:42) EA philanthropy and the principal-agent problem: Open Philanthropy, Longview [15:51] (31:11) How much funding is coming [21:28] (32:32) Surplus: the incubator [22:33] (34:46) Why for-profits over nonprofits [24:37] (37:33) The ideal founder profile [27:11] (40:52) Whether writers can found startups in the vibe-coding era [30:26] (42:19) Monetizing public communications projects [31:45] (53:00) Oliver's case for the incubator [42:09] (54:34) On professional grantmakers [44:04] (57:53) Whether infrastructure work is more direct than safety research [47:36] (01:01:41) The case for a better AI safety journal [51:08] (01:04:03) Mutual feedback [53:17] (01:10:04) How to help: LessWrong, Surplus, and the S-process [1:01:01] --- First published: June 27th, 2026 Source: https://www.lesswrong.com/posts/Jh2xsoySxacQDJMwz/austin-and-oli-on-funding-and-incubating-projects --- Narrated by TYPE III AUDIO.

    1 hr 14 min
  4. 1 day ago

    “Deployment Awareness Matters More Than Evaluation Awareness” by VojtaKovarik, Tomáš Gavenčiak, Mateusz Bagiński

    TL;DR: Evaluation awareness (an AI recognizing it's being evaluated) is a widely discussed concept in AI safety. But there is a closely related concept that we claim is more important: deployment awareness, the AI's ability to recognize when it is not being evaluated and when its actions matter. A misaligned AI with deployment awareness can game evaluations without any evaluation awareness at all, with a simple strategy: act aligned by default, and deviate only when confident you're in real deployment and your actions matter for your goals. This requires two ingredients: occasionally recognizable deployment situations, and enough self-reflective and strategic reasoning for the AI to anticipate and plan around this. We think "deployment awareness" better identifies what makes evaluations fragile, and we develop this idea below. Concept: Evaluation awareness. Explanation: AI is being tested and confidently believes that this is so. Comments: This only becomes a problem if most evaluations trigger evaluation awareness, and if the AI knows that. Or if the AI has good self-locating reasoning. Concept: Deployment awareness. Explanation: AI is not being tested and confidently believes it is not being tested. Comments: This is a problem even if it happens rarely (if some of those rare [...] --- Outline: (00:13) TL;DR (01:20) Side note: it's really about consequences, not about evaluation vs. deployment (03:23) Evaluation awareness, deployment awareness, and self-locating beliefs (04:54) Evaluation awareness is less dangerous than it seems (06:58) Deployment awareness is more dangerous than it seems (09:29) Evaluation gaming with no evaluation or deployment awareness (12:35) Final comments (13:33) Appendix: A formal (toy) model The original text contained 13 footnotes which were omitted from this narration. --- First published: June 26th, 2026 Source: https://www.lesswrong.com/posts/XP794SHDuXYfWLrvJ/deployment-awareness-matters-more-than-evaluation-awareness --- Narrated by TYPE III AUDIO.

    18 min
  5. 1 day ago

    “AI #174: You’re It” by Zvi

    Fable remains in limbo, with renewed hope that we will get it back soon (45% by tomorrow, 69% by July 1, nice.) The full capabilities post is now available. Alex Bores unfortunately lost narrowly in NY-12, and will not be heading to Congress. There are also plenty of other stories to cover. Some highlights: GLM-5.2 is the new best open model, although it is expensive for its class. It will have its uses, potentially for agents you need to run fully locally or privately, but often it won’t be the right fit. Claude Tag is a new system for having Claude join your Slack, and if you @ him then he will spin up an instance to do the coding work. Dean Ball is joining OpenAI to work on policy. We don’t see eye to eye on everything, but this is a huge upgrade over their existing alternatives. The debate over the MidJourney scanner continues. Table of Contents Language Models Offer Mundane Utility. You know what it is for. Language Models Don’t Offer Mundane Utility. Hiring French Qwants. Huh, Upgrades. Claude Code supports artifacts. [...] 
--- Outline: (01:12) Language Models Offer Mundane Utility (02:58) Language Models Don't Offer Mundane Utility (03:13) Huh, Upgrades (03:38) On Your Marks (04:36) Deepfaketown and Botpocalypse Soon (11:20) Fun With Media Generation (12:20) Cyber Lack of Security (14:49) Overcoming Bias (15:52) A Young Lady's Illustrated Primer (18:14) They Took Our Jobs (19:48) Get Involved (21:54) Introducing (22:12) Claude Tag (31:46) In Other AI News (33:20) More On GLM-5.2 (35:17) ChatGPT Health (37:04) Middle Of The Journey (51:04) New Medical Diagnostic Just Dropped (54:05) Google on AI Control (01:02:12) The Once And Future Fable (01:04:17) Fable: The First Lawsuit (01:05:12) Dean Ball Joins OpenAI (01:09:03) Show Me the Money (01:09:18) Quiet Speculations (01:12:00) Alex Bores Loses In NY-12 By 4% (01:22:28) The Quest for Sane Regulations (01:24:49) Chip City (01:28:33) The Week in Audio (01:29:21) People Just Say Things (01:30:19) Rhetorical Innovation (01:36:32) There Are Two Pills (01:37:55) Who Evals The Evals (01:39:02) Aligning a Smarter Than Human Intelligence is Difficult (01:43:17) Cooperative Alignment (01:44:22) People Are Worried About AI Killing Everyone (01:45:59) Other People Are Not As Worried About AI Killing Everyone (01:48:08) The Lighter Side --- First published: June 25th, 2026 Source: https://www.lesswrong.com/posts/MfdaizeH8z8civPHe/ai-174-you-re-it --- Narrated by TYPE III AUDIO.

    1 hr 51 min
