LessWrong (30+ Karma)

LessWrong

0.0 (0)
Technology
Updated daily

Audio narrations of LessWrong posts.

5 hours ago

“Agents as Webs of Beliefs” by Richard_Ngo

In this post I’ll sketch out an informal model of intelligent agents as webs of beliefs (or belief webs for short). The belief webs framework pulls together ideas from active inference, agent foundations and machine learning. In doing so it aims to unify beliefs, goals and actions as three facets of a single phenomenon. Few of these ideas are original to me, but I haven't seen anyone tie them together in a single place before. I've flagged the frameworks I'm drawing from throughout the post.

Beliefs are held together by local consistency constraints

The core premise of belief webs is that an agent's beliefs are typically locally consistent with nearby beliefs but not necessarily globally consistent with all its other beliefs (except, perhaps, in the limit of ideal rationality). This poses a problem for frameworks which describe agents in terms of a single probability distribution (as causal graphs, Solomonoff induction, and active inference do). Two frameworks which are capable of handling global inconsistency are Richardson's probabilistic dependency graphs (PDGs) and Garrabrant induction. (They focus on empirical inconsistency and logical inconsistency respectively, but I’ll abstract away from that difference for now.) We can roughly analogize the nodes in PDGs to [...]

---

Outline:
(00:40) Beliefs are held together by local consistency constraints
(03:11) Actions are beliefs
(07:27) Goals are beliefs
(14:06) Open problems for belief webs

The original text contained 6 footnotes which were omitted from this narration.

---

First published: June 27th, 2026
Source: https://www.lesswrong.com/posts/M39Z2CvyfaxZdaxR4/agents-as-webs-of-beliefs

---

Narrated by TYPE III AUDIO.

17 min
9 hours ago

“Austin & Oli on funding and incubating projects” by Austin Chen, habryka

@habryka and I recently spoke about his plans to improve the AI safety funding ecosystem with a better S-Process platform, and my new incubator for EA/AIS software projects, Surplus (since launched; apply now!). We also cover: hot takes on different funders; what kinds of founders might succeed in the age of vibecoding; whether to do direct work or go meta; and what we respect and criticize in each other.

I've transcribed the full conversation at https://peruse.sh/ep/austin-chen-and-oliver-habryka-on-funding-incubating-project. (Beware: the AI makes notable edits for readability, sometimes distorting what the speaker meant. If specific phrasing is cruxy, listen to the audio.)

Selected quotes

The cursed game of philanthropy

Oli: "Philanthropy is one of the most cursed games in existence... The default outcome of what happens when rich people try to do philanthropy is that they think about starting a foundation, they imagine hiring someone on the market and ask themselves: who am I going to show up and feel comfortable trusting most of my net worth to? That doesn't make any sense. And so what they often end up doing is making a family office. The only way to solve this principal-agent problem is to choose [...]

---

Outline:
(00:55) Selected quotes
(06:54) Chapters
(08:32) Referenced links
(09:02) Full transcript
(09:07) Critiques of SFF's grant process [0:00]
(11:26) The SFF application process [2:26]
(12:50) The speculation grant freeze for advocacy orgs [3:40]
(14:29) A lower-trust, more transparent funding process [5:04]
(16:26) How the S-process works [6:53]
(20:18) Naming and communicating the value to funders [10:54]
(25:42) EA philanthropy and the principal-agent problem: Open Philanthropy, Longview [15:51]
(31:11) How much funding is coming [21:28]
(32:32) Surplus: the incubator [22:33]
(34:46) Why for-profits over nonprofits [24:37]
(37:33) The ideal founder profile [27:11]
(40:52) Whether writers can found startups in the vibe-coding era [30:26]
(42:19) Monetizing public communications projects [31:45]
(53:00) Oliver's case for the incubator [42:09]
(54:34) On professional grantmakers [44:04]
(57:53) Whether infrastructure work is more direct than safety research [47:36]
(01:01:41) The case for a better AI safety journal [51:08]
(01:04:03) Mutual feedback [53:17]
(01:10:04) How to help: LessWrong, Surplus, and the S-process [1:01:01]

---

First published: June 27th, 2026
Source: https://www.lesswrong.com/posts/Jh2xsoySxacQDJMwz/austin-and-oli-on-funding-and-incubating-projects

---

Narrated by TYPE III AUDIO.

1 hr 14 min
19 hours ago

“Deployment Awareness Matters More Than Evaluation Awareness” by VojtaKovarik, Tomáš Gavenčiak, Mateusz Bagiński

TL;DR

Evaluation awareness, an AI recognizing it's being evaluated, is a widely discussed concept in AI safety. But there is a closely related concept that we claim is more important: deployment awareness, the AI's ability to recognize when it is not being evaluated and when its actions matter. A misaligned AI with deployment awareness can game evaluations without any evaluation awareness at all, with a simple strategy: act aligned by default, and deviate only when confident you're in real deployment and your actions matter for your goals. This requires two ingredients: occasionally recognizable deployment situations, and enough self-reflective and strategic reasoning for the AI to anticipate and plan around this. We think "deployment awareness" better identifies what makes evaluations fragile, and we develop this idea below.

Concept: Evaluation awareness
Explanation: AI is being tested and confidently believes that this is so
Comments: This only becomes a problem if most evaluations trigger evaluation awareness, and if the AI knows that. Or if the AI has good self-locating reasoning.

Concept: Deployment awareness
Explanation: AI is not being tested and confidently believes it is not being tested
Comments: This is a problem even if it happens rarely (if some of those rare [...]

---

Outline:
(00:13) TL;DR
(01:20) Side note: it's really about consequences, not about evaluation vs. deployment
(03:23) Evaluation awareness, deployment awareness, and self-locating beliefs
(04:54) Evaluation awareness is less dangerous than it seems
(06:58) Deployment awareness is more dangerous than it seems
(09:29) Evaluation gaming with no evaluation or deployment awareness
(12:35) Final comments
(13:33) Appendix: A formal (toy) model

The original text contained 13 footnotes which were omitted from this narration.

---

First published: June 26th, 2026
Source: https://www.lesswrong.com/posts/XP794SHDuXYfWLrvJ/deployment-awareness-matters-more-than-evaluation-awareness

---

Narrated by TYPE III AUDIO.

18 min
1 day ago

“Why are adversaries assumed to be incapable of responding to AI risk?” by KatjaGrace

When I talk to people about what might be done about AI threatening approximately everything that everyone cares about, I notice a common oddity in their resistance to a variety of ideas. They seem to take for granted that certain entities, especially Trump and China, would be acting against their own interests, were they to cooperate or take proactive action to avert the building of dangerous AI. The speaker often thinks there is a fairly substantial risk of the AI thus produced killing or disempowering everyone, including Trump and China. And I imagine in a situation where a certain course of action were going to produce a 20% chance of Trump being shot in the head or China being heavily nuked, that these parties would actually be considered to be ‘following incentives’ to avoid it. Yet they talk as though the idea of Trump or China responding to such risks is akin to the idea of these parties suddenly becoming zealous proponents of universal selfless love randomly. It's like while believing in the risk, they also kind of believe that it's a totally uncompelling story that nobody in real geopolitics would ever be touched by. Or that these parties [...]

---

First published: June 26th, 2026
Source: https://www.lesswrong.com/posts/ah5JMgJmEGJuxh79v/why-are-adversaries-assumed-to-be-incapable-of-responding-to

---

Narrated by TYPE III AUDIO.

2 min
1 day ago

“What did “scheming”, “mech interp” mean pre-2023.” by Cleo Nardo

This was too long to be a short-form, but it should really be a short-form. This notice is useful for people who've recently got into AI safety, who want to engage with the ancient texts (i.e. pre-2024). If you were around before 2023, then you probably don't need this.

A few phrases have changed their meaning over time. Two examples that came to mind recently are scheming and mech interp. (In both cases, I think the change-of-terminology was reasonable.) There are probably a bunch of other examples; feel free to mention them in the comments.

Scheming.

This used to mean "training-gaming in pursuit of out-of-context goals". For example, Carlsmith (Nov 2023) starts with:

This report examines whether advanced AIs that perform well in training will be doing so in order to gain power later -- a behavior I call "scheming" (also sometimes called "deceptive alignment").

Then Apollo came out with "Frontier Models are Capable of In-context Scheming" (Dec 2024):

We study whether models have the capability to scheme in pursuit of a goal that we provide in-context and instruct the model to strongly follow.

So the difference here is (1) the AI isn't in training (it's in [...]

---

Outline:
(00:47) Scheming.
(02:12) Mech interp.

---

First published: June 26th, 2026
Source: https://www.lesswrong.com/posts/NraMusoWhj9Njdpi5/what-did-scheming-mech-interp-mean-pre-2023

---

Narrated by TYPE III AUDIO.

4 min
1 day ago

“Not making a strong argument is a relief” by Kaj_Sotala

When I was in middle school, one of our teachers gave us a “don’t do drugs” talk. Somebody asked him whether he had ever used drugs himself. He replied something along the lines of:

I’m not going to answer that question, because it's one that I can only lose. Either I say yes, and you can conclude that drugs aren’t so bad since I’m fine now. Or I say no, and you can conclude that since I haven’t tried them, I don’t know what I’m talking about.

That stuck in my mind. I couldn’t fault the logic in what he said. But something about it still felt off. Surely it can’t be that any answer to a question makes it less likely for drugs to be bad? Presumably it's possible for drugs to really be bad. And if we are in a world where that is true... you need to be able to conclude that, somehow. He had concluded that somehow.

There was also the question of, if any answer should update us against believing that drugs are bad, how does telling us that help? If he gives us the logic of why we’d update against him anyway, shouldn’t [...]

The original text contained 2 footnotes which were omitted from this narration.

---

First published: June 26th, 2026
Source: https://www.lesswrong.com/posts/TDbqK8tFDJKoQCdSa/not-making-a-strong-argument-is-a-relief

---

Narrated by TYPE III AUDIO.

15 min
1 day ago

“AI #174: You’re It” by Zvi

Fable remains in limbo, with renewed hope that we will get it back soon (45% by tomorrow, 69% by July 1, nice.) The full capabilities post is now available.

Alex Bores unfortunately lost narrowly in NY-12, and will not be heading to Congress.

There are also plenty of other stories to cover. Some highlights:

GLM-5.2 is the new best open model, although it is expensive for its class. It will have its uses, potentially for agents you need to run fully locally or privately, but often it won’t be the right fit.
Claude Tag is a new system for having Claude join your Slack, and if you @ him then he will spin up an instance to do the coding work.
Dean Ball is joining OpenAI to work on policy. We don’t see eye to eye on everything, but this is a huge upgrade over their existing alternatives.
The debate over the MidJourney scanner continues.

Table of Contents

Language Models Offer Mundane Utility. You know what it is for.
Language Models Don’t Offer Mundane Utility. Hiring French Qwants.
Huh, Upgrades. Claude Code supports artifacts.
[...]

---

Outline:
(01:12) Language Models Offer Mundane Utility
(02:58) Language Models Don't Offer Mundane Utility
(03:13) Huh, Upgrades
(03:38) On Your Marks
(04:36) Deepfaketown and Botpocalypse Soon
(11:20) Fun With Media Generation
(12:20) Cyber Lack of Security
(14:49) Overcoming Bias
(15:52) A Young Lady's Illustrated Primer
(18:14) They Took Our Jobs
(19:48) Get Involved
(21:54) Introducing
(22:12) Claude Tag
(31:46) In Other AI News
(33:20) More On GLM-5.2
(35:17) ChatGPT Health
(37:04) Middle Of The Journey
(51:04) New Medical Diagnostic Just Dropped
(54:05) Google on AI Control
(01:02:12) The Once And Future Fable
(01:04:17) Fable: The First Lawsuit
(01:05:12) Dean Ball Joins OpenAI
(01:09:03) Show Me the Money
(01:09:18) Quiet Speculations
(01:12:00) Alex Bores Loses In NY-12 By 4%
(01:22:28) The Quest for Sane Regulations
(01:24:49) Chip City
(01:28:33) The Week in Audio
(01:29:21) People Just Say Things
(01:30:19) Rhetorical Innovation
(01:36:32) There Are Two Pills
(01:37:55) Who Evals The Evals
(01:39:02) Aligning a Smarter Than Human Intelligence is Difficult
(01:43:17) Cooperative Alignment
(01:44:22) People Are Worried About AI Killing Everyone
(01:45:59) Other People Are Not As Worried About AI Killing Everyone
(01:48:08) The Lighter Side

---

First published: June 25th, 2026
Source: https://www.lesswrong.com/posts/MfdaizeH8z8civPHe/ai-174-you-re-it

---

Narrated by TYPE III AUDIO.

1 hr 51 min
1 day ago

[Linkpost] “Don’t ignore the car crashes, and remember your freshman CS” by jcksanderson

This is a link post.

Car crashes kill over 35,000 people in the US every year. Plane crashes, on the other hand, kill ~350. Despite this, we have shows like Mayday/Air Disasters for entertainment on TV, and events such as the tragic death of 67 people on a commercial airline flight into DCA often make the front page of the news for a week, while the state of American roadway safety gets that same level of publicity maybe once every other year.

Many of you probably recognize this as the archetypal example of the availability heuristic: the magnitude of and publicity following plane crashes causes them to feel like a much bigger problem than car crashes. This is, of course, despite the fact that car crashes kill two orders of magnitude more people every year.

Relatedly, I fondly recall taking my first computer science class. After the absolute basics of Python, the first real lesson we learned was to always break problems down into simpler tasks, until each task becomes rather easy to do. We later learned that this is a broader principle called decomposition. Decomposition is a very helpful cue, as it gives an obvious starting point for [...]

The original text contained 1 footnote which was omitted from this narration.

---

First published: June 26th, 2026
Source: https://www.lesswrong.com/posts/eSZYRuEvqm7jFxYfq/don-t-ignore-the-car-crashes-and-remember-your-freshman-cs
Linkpost URL: https://jcksanderson.com/posts/car_crashes/

---

Narrated by TYPE III AUDIO.

4 min


Creators: LessWrong
Years Active: 2023 - 2026
Episodes: 250
Rating: Explicit
Show Website: LessWrong (30+ Karma)
