LessWrong posts by zvi

zvi

TECHNOLOGY
DAILY

Audio narrations of LessWrong posts by zvi

3 DAYS AGO

“AI #159: See You In Court” by Zvi

The conflict between Anthropic and the Department of War has now moved to the courts, where Anthropic has challenged the official supply chain risk designation as well as the order to remove it from systems across the government, claiming retaliation for protected speech. It will take a bit to work its way through the courts. Anthropic has the principles of law on its side, a maximally strong set of facts and absurdly strong amicus briefs. If Anthropic loses this case, there will be far reaching consequences for our freedoms. Let us hope this remains in the courts and is allowed to play out there, and then ultimately that negotiations can resume and the parties can at least agree on a smooth transition to alternative service providers. If DoW wants an otherwise full deal more than it wants the right to use Claude to monitor Americans and analyze their data, a full deal is possible as well, but if they demand full ‘all lawful use,’ all trust has been lost or they are or always were out to hurt Anthropic, then there is no deal or ZOPA. That has overshadowed what would normally be the main event [...] 
--- Outline: (01:48) Language Models Offer Mundane Utility (03:46) Language Models Don't Offer Mundane Utility (07:11) Language Models Break Your Vital Internet Infrastructure (08:02) Huh, Upgrades (09:29) On Your Marks (15:13) Choose Your Fighter (15:40) Get My Agent On The Line (16:55) Deepfaketown and Botpocalypse Soon (19:08) A Young Lady's Illustrated Primer (19:27) You Drive Me Crazy (19:47) They Took Our Jobs (23:10) Get Involved (24:58) Introducing (26:03) The Anthropic Institute (28:13) In Other AI News (29:14) The Rise of Claude (33:31) Trouble At OpenAI (35:21) Show Me the Money (36:36) Thanks For The Memos (37:36) A Contract Is A Contract Is A Contract (40:09) Level of Friction (41:10) Quiet Speculations (41:46) Quickly, There's No Time (43:14) Apology Tour (43:59) We'll See You In Court (50:41) Jawboning (54:02) Executive Order (54:53) The Acute Crisis Passes (56:48) Others Cover This (57:28) Dwarkesh Patel Gives Mixed Thoughts (01:02:57) This Means A Special Military Operation (01:03:25) Bernie Sanders Is Worried and Curious About AI (01:06:10) The Quest for Survival (01:09:25) The Quest For No Regulations Whatsoever (01:10:44) Chip City (01:11:07) The Week in Audio (01:12:52) Rhetorical Innovation (01:23:08) Aligning a Smarter Than Human Intelligence is Difficult (01:28:57) People Are Worried About AI Killing Everyone (01:31:03) Other People Are Not As Worried About AI Killing Everyone (01:32:52) The Lighter Side --- First published: March 12th, 2026 Source: https://www.lesswrong.com/posts/DnrjKZTZwHGjdDB4u/ai-159-see-you-in-court --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

1 h 36 min
4 DAYS AGO

“GPT-5.4 Is A Substantial Upgrade” by Zvi

Benchmarks have never been less useful for telling us which models are best. They are good for giving a general sense of the landscape. They definitely paint a picture. But if you’re comparing top models, like GPT-5.4 against Opus 4.6 against Gemini 3.1 Pro, you have to use the models, talk to the models, get reports from those who have and form a gestalt. The reports will contradict each other and you have to work through that. There's no other way. Thus, I try to gather and sort a reasonably comprehensive set of reactions, so you can browse the sections that make you most curious. The gestalt is that GPT-5.4 is a very good model, sir. It's a substantial upgrade from GPT-5.2, and also from 5.3-Codex, and it puts OpenAI back in the game, whereas I felt like Opus 4.6 dominated OpenAI's previous offerings for all but narrow uses. Each lab's models vary and things change over time, but they tend to have consistent strengths, weaknesses and personalities. From what I’ve seen this is very much an OpenAI model. It's highly capable, and it is especially seen as a big improvement by the whisperers and [...] --- Outline: (01:42) The Big Take (04:24) The Official Pitch (08:43) Other People's Benchmarks (12:44) The System Card (17:22) Preparedness Framework (19:15) Fun Experiments (19:41) Early Poll Results (21:48) Positive Reactions (34:47) Vibe Coders Only (37:26) Fill Out Your Roster (37:51) Intent Wins (40:41) Personality Clash (45:44) Model Relations Department (49:27) Stylistic Differences (50:02) Some Will Always Be Unimpressed (53:57) The Lighter Side --- First published: March 11th, 2026 Source: https://www.lesswrong.com/posts/sKCYLEN5EYLuokDft/gpt-5-4-is-a-substantial-upgrade --- Narrated by TYPE III AUDIO. --- Images from the article: [Poll screenshot: "[...] GPT" at 9.4%, "Claude -> Claude" at 37.9%, "GPT -> GPT" at 13.6%, and "Other / See Results" at 39.1%; 683 votes with 1 day left.] [Poll screenshot: "[...] GPT" at 6%, "Claude -> Claude" at 38.2%, "GPT -> GPT" at 15.3%, and "Other / See Results" at 40.6%; 419 votes, final results.]

55 min
5 DAYS AGO

“Claude Code, Claude Cowork and Codex #5” by Zvi

It feels good to get back to some of the fun stuff. The comments here can double as a place for GPT-5.4 reactions, in addition to my Twitter thread. I hope to get that review out soon. Almost all of this will be a summary of agentic coding developments, after a note. Table of Contents The Virtue of Silence (Unrelated Update). Agentic Coding Offers Mundane Utility. Agentic Coding Doesn’t Offer Mundane Utility. Huh, Upgrades. Our Price Cheap. Quickly, There's No Time. A Particular Set Of Skills. Next Level Coding. Dual Wielding. They Took Our Jobs. You Need To Relax Sometimes. Levels of Friction. Danger, Will Robinson. Snagged By The Claw. The Meta Clause. If They Wanted To. The Famous Mister Claw. Claw Your Way To The Top. Claw Your Way Out. A Chinese Claw. Hackathon. Introducing Agent Teams. Cowork Is A Gateway Drug. Dangerously Evade Permissions. Skilling Up. Modern Working. Measuring Autonomy. I Don’t Even See The Code. Scratchpads Are Magic. It's Coming. [...] --- Outline: (00:29) The Virtue of Silence (Unrelated Update) (02:32) Agentic Coding Offers Mundane Utility (08:27) Agentic Coding Doesn't Offer Mundane Utility (09:17) Huh, Upgrades (11:31) Our Price Cheap (13:36) Quickly, There's No Time (21:08) A Particular Set Of Skills (21:48) Next Level Coding (22:54) Dual Wielding (23:49) They Took Our Jobs (24:28) You Need To Relax Sometimes (27:05) Levels of Friction (28:30) Danger, Will Robinson (30:22) Snagged By The Claw (34:20) The Meta Clause (36:50) If They Wanted To (37:55) The Famous Mister Claw (39:14) Claw Your Way To The Top (40:58) Claw Your Way Out (44:05) A Chinese Claw (46:46) Hackathon (47:15) Introducing Agent Teams (52:40) Cowork Is A Gateway Drug (53:34) Dangerously Evade Permissions (54:32) Skilling Up (55:53) Modern Working (56:22) Measuring Autonomy (58:25) I Don't Even See The Code (01:01:53) Scratchpads Are Magic (01:04:11) It's Coming (01:08:22) The Grep Tax (01:09:54) Beware Claude Mania (01:10:36) The Lighter Side (01:12:05) In Other Agent News (01:12:13) The Lighter Side --- First published: March 9th, 2026 Source: https://www.lesswrong.com/posts/rNes65r9TKegdLowb/claude-code-claude-cowork-and-codex-5 --- Narrated by TYPE III AUDIO.

1 h 16 min
MARCH 6

“Anthropic Officially, Arbitrarily and Capriciously Designated a Supply Chain Risk” by Zvi

Make no mistake about what is happening. The Department of War (DoW) demanded Anthropic bend the knee, and give them ‘unfettered access’ to Claude, without understanding what that even meant. If they didn’t get what they want, they threatened to both use the Defense Production Act (DPA) to make Anthropic give the military this vital product, and also designate the company a supply chain risk (SCR). Hegseth sent out an absurdly broad SCR announcement on Twitter that had absolutely no legal basis, that if implemented as written would have been corporate murder. They have now issued an official notification, which is still illegal, arbitrary and capricious, but is scoped narrowly and won’t be too disruptive. Nominally the SCR designation is because we cannot rely on that same product when the company has not bent the knee and might object to some uses of its private property that it never agreed to allow. No one actually believes this. No one is pretending others should believe this. If they have real concerns, there are numerous less restrictive and less disruptive tools available to the Department of War. Many have the bonus of being legal. In actuality [...] --- Outline: (05:01) Post Overview (07:26) Anthropic's Statement on the SCR (11:24) What The Actual SCR Designation Says (14:40) Enemies of The Republic (29:43) Regulation Need Not Seize The Means Of Production (31:53) Microsoft Stands Firm (33:03) Calling This What It Is (34:22) What To Expect Next --- First published: March 6th, 2026 Source: https://www.lesswrong.com/posts/EL8uxnWMEZXc7Wh9A/anthropic-officially-arbitrarily-and-capriciously-designated --- Narrated by TYPE III AUDIO.

36 min
MARCH 5

“AI #158: The Department of War” by Zvi

This was the worst week I have had in quite a while, maybe ever. The situation between Anthropic and the Department of War (DoW) spun completely out of control. Trump tried to de-escalate by putting out a Truth merely banning Anthropic from direct use by the Federal Government with a six month wind down. Then Secretary of War Hegseth went rogue and declared Anthropic a supply chain risk, with wording indicating an intent to outright murder Anthropic as a company. Then that evening OpenAI signed a contract with DoW. I’ve been trying to figure out the situation and help as best I can. I’ve been in a lot of phone calls, often off the record. Conduct is highly unbecoming and often illegal, arbitrary and capricious. The house is on fire, the Republic in peril. I have people lying to me and being lied to by others. There is fog of war. One gets it from all sides. It's terrifying to think about what might happen with one wrong move. Also the Middle East is kind of literally on fire, which I’m not covering. Last week, I had previously covered the situation in Anthropic and [...] 
--- Outline: (04:17) A Well Deserved Break (05:53) Huh, Upgrades (07:40) On Your Marks (07:59) Choose Your Fighter (08:55) Deepfaketown and Botpocalypse Soon (10:07) A Young Lady's Illustrated Primer (11:48) You Drive Me Crazy (12:34) They Took Our Jobs (12:50) The Art of the Jailbreak (13:27) Introducing (14:04) In Other AI News (15:29) Show Me the Money (16:51) Quiet Speculations (17:39) The Quest for Sane Regulations (19:25) Chip City (20:26) The Week in Audio (20:39) Government Rhetorical Innovation (22:53) Give The People What They Want (24:28) Rhetorical Innovation (34:23) We Go Our Separate Ways (35:21) Thanks For The Memos (48:10) Take A Moment (48:37) Designating Anthropic A Supply Chain Risk Won't Legally Work (50:18) The Buck Stops Here (57:23) Sane Talk About the Department of War Situation (01:06:18) I Declare Defense Production Act (01:09:49) Greg Allen Illustrates The Situation (01:18:33) Do Not Lend Your Strength To That Which You Wish To Be Free From (01:23:56) Oh Right Democrats Exist (01:24:46) Beware (01:26:16) Endorsements of Anthropic Holding the Moral Line (01:29:08) The Week The World Learned About Claude (01:30:42) Other Reflections on the Department of War Situation (01:31:45) Aligning a Smarter Than Human Intelligence is Difficult (01:32:30) The Lighter Side --- First published: March 5th, 2026 Source: https://www.lesswrong.com/posts/YTnzcZSbA69fMCjNo/ai-158-the-department-of-war --- Narrated by TYPE III AUDIO.

1 h 37 min
MARCH 4

“Gemini 3.1 Pro Aces Benchmarks, I Suppose” by Zvi

I’ve been trying to find a slot for this one for a while. I am thrilled that today had sufficiently little news that I am comfortable posting this. Gemini 3.1 scores very well on benchmarks, but most of us had the same reaction after briefly trying it: “It's a Gemini model.” And that was that, given our alternatives. But it's got its charms. Consider this a nice little, highly skippable break. The Pitch It's a good model, sir. That's the pitch. Sundar Pichai (CEO Google): Gemini 3.1 Pro is here. Hitting 77.1% on ARC-AGI-2, it's a step forward in core reasoning (more than 2x 3 Pro). With a more capable baseline, it's great for super complex tasks like visualizing difficult concepts, synthesizing data into a single view, or bringing creative projects to life. We’re shipping 3.1 Pro across our consumer and developer products to bring this underlying leap in intelligence to your everyday applications right away. Jeff Dean also highlighted ARC-AGI-2 along with some cool animations, an urban planning sim, some heat transfer analysis and the general benchmarks. On Your Marks Google presents a good standard set of [...] --- Outline: (00:37) The Pitch (01:31) On Your Marks (04:34) Other People's Benchmarks (06:54) Gemini 3 DeepThink V2 (12:33) Positive Feedback (17:22) Negative Feedback (19:07) Try Gemini Lite --- First published: March 4th, 2026 Source: https://www.lesswrong.com/posts/82zizPyyPgaEswbxz/gemini-3-1-pro-aces-benchmarks-i-suppose --- Narrated by TYPE III AUDIO.

20 min
MARCH 3

“A Tale of Three Contracts” by Zvi

The attempt on Friday by Secretary of War Pete Hegseth to label Anthropic as a supply chain risk and commit corporate murder had a variety of motivations. On its face, the conflict is a tale of three contracts and the associated working relationships. The contract Anthropic signed with the Department of War (DoW) in 2025. The new contract Anthropic was negotiating with DoW, that would have been modified to favor DoW, but where the parties could not reach agreement. The contract OpenAI was negotiating and signed with DoW, which was per OpenAI modified favorably to OpenAI and thus may be modified further. The contracts and negotiations need to be confidential, so we only have limited details, and especially only limited details have been shared in public. We do know a lot, and we know a lot more than we did yesterday morning. This post is what we know about those three contracts. For further details and sources, and in particular for a more detail-oriented smackdown on a variety of false or misleading claims and takes, see the long version from yesterday. That post uses very careful qualifiers for [...] --- Outline: (01:58) The Original Anthropic Contract With DoW (06:29) The Proposed Revisions to Anthropic's Contract (12:51) OpenAI's Contract With DoW --- First published: March 3rd, 2026 Source: https://www.lesswrong.com/posts/PBrggrw4mhgbksoYY/a-tale-of-three-contracts --- Narrated by TYPE III AUDIO.

32 min
MARCH 2

“Secretary of War Tweets That Anthropic is Now a Supply Chain Risk” by Zvi

This is the long version of what happened so far. I will strive for shorter ones later, when I have the time to write them. Most of you should read the first two sections, then choose the remaining sections that are relevant to your interests. But first, seriously, read Dean Ball's post Clawed. Do that first. I will not quote too extensively from it, because I am telling all of you to read it. Now. You’re not allowed to keep reading this or anything else until after you do. I’m not kidding. That's out of the way? Good. Let's get started. What Happened President Trump enacted a perfectly reasonable solution to the situation with Anthropic and the Department of War. He cancelled the Anthropic contract with a six month wind down period, after which the Federal Government would be told not to use Anthropic software. Everyone thought the worst was now over. The situation was unfortunate for Anthropic and also for national security, but this gave us six months to transition, it gave us six months to negotiate another solution, and it avoided any of the extreme highly damaging options that Secretary of [...] 
--- Outline: (00:49) What Happened (10:02) The Timeline Of Events (21:39) I Did Not Have Time To Write You A Short One (22:40) The Unhinged Declaration of the Secretary of War (25:08) Altman Has Been Excellent On The Question of Supply Chain Risk, But May Need To Do More (27:35) Arrogance Here Means Insisting On Meaningful Red Lines On Mass Domestic Surveillance and Lethal Autonomous Weapons (29:35) Not Doing Business Is Totally Fine (30:22) The Demand For Unrestricted Access Is New And Is Selective And Fake (32:47) Claims Of Strongarming Are Ad Hominem Bad Faith Obvious Nonsense (34:55) Hegseth Equates Not Being a Dictator With Companies Having Veto Power Over Operational Military Decisions (40:54) The Part That If Enacted Would Be A Historically Epic Clusterfuck (48:54) The Other Part Of The Clusterfuck (52:49) The Department of War Had Many Excellent Options (55:53) And Then There's Emil Michael (01:02:42) Anthropic Will Probably Survive (01:05:14) The Goal of DoW Was Largely Mass Domestic Surveillance (01:15:32) What Are The Key Differences Between The Two Contracts? (01:22:54) OpenAI's Contract Terms (01:27:04) What OpenAI's Contract Terms Actually Do (01:29:16) OpenAI Is Trusting DoW And Sam Altman Misrepresented This (01:33:13) OpenAI Accepted Terms Anthropic Explicitly Declined And That Would Not Have Protected Anthropic's Red Lines (01:35:26) How Altman Initially Described His Deal (01:42:26) OpenAI Allowed All Lawful Use And Trusts DoW On This (01:47:21) The DoW Could Alter This Deal (01:49:29) Why OpenAI's Shared Legal Language Offers Almost No Protections (01:59:11) So How Does OpenAI Hope For This To Work Out? (02:01:56) This Was Never About Money (02:05:03) OpenAI Tells Us How They Really Feel (02:06:45) First The Good News (02:11:40) The OpenAI Redlines Only Forbid Currently Illegal Activity (02:16:28) Altman Does Not Present As Understanding The Difference In Redlines (02:18:46) Meeting Of The Minds (02:21:21) Anthropic's Position Was The Opposite Of How This Is Portrayed (02:21:56) The Room Where It Happened (02:25:45) You Don't Have The Right (02:32:25) I Ask Questions And Get Answers (02:37:05) Does This Contract Apply To NSA? (02:38:07) Can OpenAI Models Be Used To Analyze Commercially Available Data At Scale? (02:48:21) Employee Activism --- First published: March 2nd, 2026 Source: https://www.lesswrong.com/posts/Wpdivf3iNJDzBcbzJ/secretary-of-war-tweets-that-anthropic-is-now-a-supply-chain --- Narrated by TYPE III AUDIO.

2 h 53 min


Creator

zvi
Years active

2024 - 2026
Episodes

250
Rating

Explicit
Show website

LessWrong posts by zvi
