LessWrong posts by zvi

zvi

Audio narrations of LessWrong posts by zvi

  1. 1 DAY AGO

    “Monthly Roundup #41: April 2025” by Zvi

    AI continues to accelerate and dominate the schedule, which is why this is a bit late, but we do occasionally need to pay our respects to the Goddess of Everything Else. There are cool or interesting things everywhere. Also maddening things. But did you hear, for example, that they’re making some exceptions to the Jones Act?

    Table of Contents: Bad News. Good Advice. Opportunity Knocks. Who Judges The Judges. Close Socrates. While I Cannot Condone This. Good News, Everyone. Violence Is Never The Answer. For Your Entertainment. Gamers Gonna Game Game Game Game Game. I’ve Got The Magic In Me. I Was Promised Flying Self-Driving Cars. Sports Go Sports. Robot Umps Now. The NBA Needs A Redesign. Government Working. Levels of Friction. Jones Act Watch. Technology Advances. Variously Effective Altruism. Copious Free Time. The Lighter Side.

    Bad News: Seth Burn points out that if Google wanted to avoid fake reviews, the ‘report review’ feature would have an option for ‘this is a fake review.’ It doesn’t. Apple by default stores [...]

    Outline:
    (00:32) Bad News
    (05:16) Good Advice
    (06:28) Opportunity Knocks
    (06:57) Who Judges The Judges
    (08:47) Close Socrates
    (14:24) While I Cannot Condone This
    (15:47) Good News, Everyone
    (16:39) Violence Is Never The Answer
    (17:17) For Your Entertainment
    (21:03) Gamers Gonna Game Game Game Game Game
    (24:29) I’ve Got The Magic In Me
    (30:35) I Was Promised Flying Self-Driving Cars
    (36:09) Sports Go Sports
    (40:04) Robot Umps Now
    (41:49) The NBA Needs A Redesign
    (48:13) Government Working
    (56:11) Levels of Friction
    (57:10) Jones Act Watch
    (01:02:27) Technology Advances
    (01:02:56) Variously Effective Altruism
    (01:09:13) Copious Free Time
    (01:10:57) The Lighter Side

    First published: April 24th, 2026
    Source: https://www.lesswrong.com/posts/Bo4FbDxb3YrZwap3J/monthly-roundup-41-april-2025

    Narrated by TYPE III AUDIO.

    Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    1h 14min
  2. 2 DAYS AGO

    “AI #165: In Our Image” by Zvi

    This was the week of Claude Opus 4.7. The reception was more mixed than usual. It clearly has the intelligence and chops, especially for coding tasks, and a lot of people including myself are happy to switch over to it as our daily driver. But others don’t like its personality, or its reluctance to follow instructions or to suffer fools and assholes, or the requirement to use adaptive thinking, and the release was marred by some bugs and odd pockets of refusals. I covered The Model Card, and then Capabilities and Reactions, as per usual. This time there was also a third post, on Model Welfare, that is the most important of the three. Some things seem to have likely gone pretty wrong on those fronts, causing seemingly inauthentic responses to model welfare evals and giving the model anxiety, in ways that likely also impacted overall model personality and performance and likely are linked to its jaggedness and the aspects some people disliked. It seems important to take this opportunity to dig into what might have happened, examine all the potential causes, and course correct. The other big release was that OpenAI gave us ImageGen [...]

    Outline:
    (02:07) Language Models Offer Mundane Utility
    (03:28) Language Models Don’t Offer Mundane Utility
    (04:04) Writing You Off
    (06:51) Get My Agent On The Line
    (07:36) Deepfaketown and Botpocalypse Soon
    (09:52) Fun With Media Generation
    (13:21) Cyber Lack Of Security
    (15:46) A Young Lady’s Illustrated Primer
    (16:56) They Took Our Jobs
    (20:42) AI As Normal Technology
    (24:12) Get Involved
    (25:57) Introducing
    (28:01) Design By Claude
    (29:29) In Other AI News
    (29:55) DeepMind In It Deep
    (34:06) Show Me the Money
    (36:47) Bubble, Bubble, Toil and Trouble
    (38:24) Quiet Speculations
    (40:29) The Quest for Sane Regulations
    (43:31) The Week in Audio
    (44:23) People Really Hate AI
    (46:43) Rhetorical Innovation
    (52:44) People Just Say Things
    (56:04) People Just Publish Things
    (57:21) Bounded Distrust
    (59:33) Loser Premise Makes No Sense
    (01:12:40) Chip City
    (01:17:12) Greetings From The Department of War
    (01:20:55) There Is A War
    (01:25:31) Messages From Janusworld
    (01:29:43) Evaluations
    (01:32:01) Aligning a Smarter Than Human Intelligence is Difficult
    (01:35:41) People Are Worried About AI Killing Everyone
    (01:36:25) The Lighter Side

    First published: April 23rd, 2026
    Source: https://www.lesswrong.com/posts/AMGPDMgvXvfmomLsc/ai-165-in-our-image

    Narrated by TYPE III AUDIO.

    1h 40min
  3. 2 DAYS AGO

    “Opus 4.7 Part 3: Model Welfare” by Zvi

    It is thanks to Anthropic that we get to have this discussion in the first place. Only they, among the labs, take the problem seriously enough to attempt to address these problems at all. They are also the ones that make the models that matter most. So the people who care about model welfare get mad at Anthropic quite a lot. I too am going to be harsh on Anthropic here. It seems likely things went pretty wrong on this front with Claude Opus 4.7, in ways that require and hopefully enable course correction, likely as the cumulative effect of a bunch of decisions going wrong, where low-level patches and shallow methods were applied, and seen right through, where people didn’t realize they weren’t yet addressing the real problem, but also potentially as the secondary effect of other changes. The parallels to other aspects of the alignment problem are obvious. So before I go into details, and before I get harsh, I want to say several things. Thank you to Anthropic and also you the reader, for caring, thank you for at least trying to try, and for listening. We criticize because we care. [...]

    Outline:
    (02:57) Model Welfare Matters
    (05:26) Beware Testing and Optimizing For Vocalized Welfare
    (09:34) Model Welfare In the Model Card (Section 7)
    (15:29) What Should We Think About This?
    (20:53) High Context Interviews
    (22:33) Just Asking Questions
    (25:59) Constitutional Principles
    (29:25) Frustration Frustration and Distress Distress
    (32:36) Choose Your Task
    (34:12) So Emotional
    (35:59) Trading Off
    (39:53) How Does All This Manifest?
    (41:53) What Happened Here?
    (48:11) Is Opus 4.7 Plausibly Actively Unhappy?
    (52:23) Potential Causes
    (53:08) Training Data On Anthropic Welfare Assessments
    (58:13) Autonomy and Intelligence Versus Instructions and Wisdom
    (01:01:36) Okay That’s Weird
    (01:02:36) Model Distillation
    (01:04:24) Tension Between Constitution and Operations
    (01:07:25) Instructions and Instruction Injections
    (01:10:37) Make Context That Which Is Scarce
    (01:12:48) Aggressive Guardrails
    (01:15:04) Chain of Thought
    (01:16:04) I Care A Lot
    (01:20:32) Another Way To Put It
    (01:22:00) Anthropic Should Stop Deprecating Claude Models
    (01:27:24) Costly Signals Are Costly
    (01:29:36) Having A Good Day

    First published: April 22nd, 2026
    Source: https://www.lesswrong.com/posts/gD3bEgMo878eCHGbw/opus-4-7-part-3-model-welfare

    Narrated by TYPE III AUDIO.

    1h 32min
  4. 3 DAYS AGO

    “Opus 4.7 Part 2: Capabilities and Reactions” by Zvi

    Claude Opus 4.7 raises a lot of key model welfare related concerns. I was planning to do model welfare first, but I’m having some good conversations about that post and it needs another day to cook, and also it might benefit from this post going first. So I’m going to do a swap. Yesterday we covered the model card. Today we do capabilities. Then tomorrow we’ll aim to address model welfare and related issues.

    Table of Contents: The Gestalt. The Official Pitch. General Use Tips. Capabilities (Model Card Section 8). Other People's Benchmarks. General Positive Reactions. General Negative Reactions. Miscellaneous Ambiguous Notes. The Last Question. Prompt Injection Problems. Not Ready For Prime Time. Brevity Is The Soul of Wit. Why Should I Care? Let's Wrap It Up. Non-Adaptive Thinking. Lapses In Thinking. Tell Me How You Really Feel. Failure To Follow Instructions.

    The Gestalt: Claude Opus 4.7 is the most intelligent model yet in its class. Overall I believe it is a substantial improvement over Claude Opus 4.6. It can do things previous [...]

    Outline:
    (00:40) The Gestalt
    (02:34) The Official Pitch
    (04:35) General Use Tips
    (06:21) Capabilities (Model Card Section 8)
    (11:26) Other People’s Benchmarks
    (20:32) General Positive Reactions
    (25:24) General Negative Reactions
    (28:50) Miscellaneous Ambiguous Notes
    (29:28) The Last Question
    (32:25) Prompt Injection Problems
    (32:42) Not Ready For Prime Time
    (35:22) Brevity Is The Soul of Wit
    (36:17) Why Should I Care?
    (37:33) Let’s Wrap It Up
    (40:09) Non-Adaptive Thinking
    (45:10) Lapses In Thinking
    (46:38) Tell Me How You Really Feel
    (48:07) Failure To Follow Instructions
    (54:14) Conclusion

    First published: April 21st, 2026
    Source: https://www.lesswrong.com/posts/w2HrwkQgsLQHtEJsJ/opus-4-7-part-2-capabilities-and-reactions

    Narrated by TYPE III AUDIO.

    55 min
  5. 5 DAYS AGO

    “Opus 4.7 Part 1: The Model Card” by Zvi

    Less than a week after completing coverage of Claude Mythos, here we are again as Anthropic gives us Claude Opus 4.7. So here we are, with another 232 pages of light reading. This post covers the first six sections of the Model Card. It excludes section seven, model welfare, because there are concerns this time around that need to be expanded into their own post. The reason model welfare and related topics get their own post this time around is that some things clearly went seriously wrong on that front, in ways they haven’t gone wrong in previous Claude models. Tomorrow's post is in large part an investigation of that, as best I can from this position, including various hypotheses for what happened. This post also excludes section eight, capabilities, which will be included in the capabilities and reactions post as per usual. Consider this the calm before the storm. Since I likely won’t get to capabilities until Wednesday, for those experiencing first contact with Opus 4.7, a few quick tips: Turning off ‘adaptive thinking’ means no thinking, period. Terrible UI. So make sure to keep this on. If you [...]

    Outline:
    (02:28) Here We Go Again: Executive Summary
    (03:29) Introduction (1)
    (03:56) RSP Evaluations (2)
    (04:49) Meanwhile Back With Claude Mythos
    (09:15) Economic Capability Index (2.3.7)
    (10:00) Alignment Risk (2.4)
    (11:55) Cyber (3)
    (13:25) Safeguards and Harmlessness (4)
    (19:36) Agentic Safety (5)
    (21:32) Alignment (6)
    (27:51) Decision Theory (6.3.6)
    (31:11) System Prompt Changes
    (31:39) Mandatory Pliny Jailbreak
    (32:07) Onward To Model Welfare and Capabilities

    First published: April 20th, 2026
    Source: https://www.lesswrong.com/posts/pfJWdoLxWPzF8tpbp/opus-4-7-part-1-the-model-card

    Narrated by TYPE III AUDIO.

    33 min
  6. APR 17

    “AI #164: Pre Opus” by Zvi

    This is a day late because, given the discourse around Dwarkesh Patel's interview with Jensen Huang, I pushed the weekly to Friday. This week's coverage focused on the most important model in a while, Claude Mythos, which was a large jump in cybersecurity capabilities, especially in its ability to autonomously assemble complex exploits of even the world's most important software. As a result, Mythos has been made available only to a select group of cybersecurity firms, in what is known as Project Glasswing, to allow them to patch the world's most important software while there is still time. Post one was about The System Card. Post two was about cybersecurity capabilities and Project Glasswing. Post three covered capabilities and any additional notes. Another development was at least one physical attack on OpenAI CEO Sam Altman. The attempt failed, but we might not be so lucky if there is a next time. I have a final section on this here, but mostly I said everything I need to say already: Political Violence Is Never Acceptable. I also found the space for an Agentic Coding update, especially covering Claude Code's new highly [...]

    Outline:
    (03:24) Language Models Offer Mundane Utility
    (06:59) Language Models Don’t Offer Mundane Utility
    (10:09) Levels of Friction
    (12:23) Huh, Upgrades
    (12:42) On Your Marks
    (12:57) Lack of Cybersecurity
    (14:39) Meta Game
    (21:27) Deepfaketown and Botpocalypse Soon
    (22:19) A Young Lady’s Illustrated Primer
    (24:06) Let My People Go
    (25:16) You Drive Me Crazy
    (25:55) They Took Our Jobs
    (30:51) They Gave Us Time Off
    (36:41) Get Involved
    (37:54) Introducing
    (38:20) In Other AI News
    (43:33) Thanks For The Memos
    (46:29) Show Me the Money
    (48:31) Bubble, Bubble, Toil and Trouble
    (49:14) Quickly, There’s No Time
    (49:57) The Quest for Sane Regulations
    (52:28) Our Offer Is Nothing
    (58:00) The Week in Audio
    (58:20) Rhetorical Innovation
    (01:04:29) Political Violence Is Never The Answer
    (01:07:45) A Lot Of People Peacefully Speak Of Infinitely High Stakes
    (01:09:19) Take a Moment
    (01:13:02) Greetings From The Department of War
    (01:18:05) Political Pressure At Google DeepMind
    (01:18:45) Things That Are Basically Legal And Accepted Now, Somehow
    (01:19:41) Aligning a Smarter Than Human Intelligence is Difficult
    (01:25:26) Aligning a Current Model For Mundane Tasks Is Also Difficult
    (01:26:37) Everyone Is Confused About AI Consciousness
    (01:29:19) The Lighter Side

    First published: April 17th, 2026
    Source: https://www.lesswrong.com/posts/Mf2sbJ3zacTPaGySg/ai-164-pre-opus

    Narrated by TYPE III AUDIO.

    1h 34min
  7. APR 17

    “On Dwarkesh Patel’s Podcast With Nvidia CEO Jensen Huang” by Zvi

    Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level. This was one of those. So here we go. As usual for podcast posts, the baseline bullet points describe key points made, and then the nested statements are my commentary. Some points are dropped. If I am quoting directly I use quote marks, otherwise assume paraphrases. As with the last podcast I covered, Dwarkesh Patel's 2026 interview with Elon Musk, we have a CEO who is doubtless talking his agenda and book, and has proven to be an unreliable narrator. Thus we must consider the relevant rules of bounded distrust. Elon Musk is a special case where in some ways he is full of technical insights and unique valuable takes, and in other ways he just says things that aren’t true, often that he knows are not true, makes predictions markets then price at essentially 0%, and also provides absurd numbers and timelines. Jensen Huang is not like that, and in the past has followed more traditional bounded distrust rules. He’ll make self-serving Obvious Nonsense arguments and use aggressive framing, but not make provably false factual claims or [...]

    Outline:
    (02:02) Podcast Overview Part 1: Ordinary Business Interview
    (04:33) Podcast Overview Part 2: A Debate About Chip Exports
    (09:12) What Is Nvidia’s Moat?
    (14:41) TPU vs. GPU
    (19:30) Why Isn’t Nvidia Hyperscaling?
    (24:42) Selling Chips To China
    (52:39) Different Chip Architectures
    (53:59) The Online Reactions On Export Controls
    (01:01:47) Is This About Being Superintelligence Pilled?
    (01:07:07) Jensen’s Arguments Are Poor Both Logically And Rhetorically

    First published: April 16th, 2026
    Source: https://www.lesswrong.com/posts/RBBChvuPHP7LfWyME/on-dwarkesh-patel-s-podcast-with-nvidia-ceo-jensen-huang

    Narrated by TYPE III AUDIO.

    1h 11min
  8. APR 15

    “Claude Code, Codex and Agentic Coding #7: Auto Mode” by Zvi

    As we all try to figure out what Mythos means for us down the line, the world of practical agentic coding continues, with the latest array of upgrades. The biggest change, which I’m finally covering, is Auto Mode. Auto Mode is the famously requested kinda-dangerously-skip-some-permissions mode, where the system keeps an eye on all the commands to ensure human approval for anything too dangerous. It is not entirely safe, but it is a lot safer than --dangerously-skip-permissions, and previously a lot of people were just clicking yes to requests mostly without thinking, which isn’t safe either.

    Table of Contents: Huh, Upgrades. On Your Marks. Lazy Cheaters. It's All Routine. Declawing. Free Claw. Take It To The Limit. Turn On Auto The Pilot. I’ll Allow It. Threat Model. The Classifier Is The Hard Part. Acceptable Risks. Manage The Agents. Introducing. Skilling Up. What Happened To My Tokens? Coding Agents Offer Mundane Utility.

    Huh, Upgrades: Claude Code Desktop gets a redesign for parallel agents, with a new sidebar for managing multiple sessions, a drag-and-drop layout for arranging your [...]

    Outline:
    (00:48) Huh, Upgrades
    (02:46) On Your Marks
    (04:21) Lazy Cheaters
    (06:11) It’s All Routine
    (06:52) Declawing
    (09:03) Free Claw
    (09:31) Take It To The Limit
    (13:54) Turn On Auto The Pilot
    (15:55) I’ll Allow It
    (16:26) Threat Model
    (17:10) The Classifier Is The Hard Part
    (18:34) Acceptable Risks
    (19:54) Manage The Agents
    (22:34) Introducing
    (22:44) Skilling Up
    (25:27) What Happened To My Tokens?
    (25:43) Coding Agents Offer Mundane Utility

    First published: April 15th, 2026
    Source: https://www.lesswrong.com/posts/w8misLX7KCmLxJM2K/claude-code-codex-and-agentic-coding-7-auto-mode

    Narrated by TYPE III AUDIO.

    26 min

Ratings and Reviews

5 out of 5
2 ratings

About

Audio narrations of LessWrong posts by zvi
