LessWrong (30+ Karma) LessWrong
- Technology
Audio narrations of LessWrong posts.
-
“Intransitive Trust” by Screwtape
1.
"Transitivity" is a property in mathematics and logic. Put simply, a relationship is transitive if, whenever x relates to y and y relates to z, x relates to z in the same way. For a more concrete example, think of size. If my car is bigger than my couch, and my couch is bigger than my hat, you know that my car is bigger than my hat.
(I am not a math major, and if there's a consensus in the comments that I'm using the wrong term here I can update the post.)
This is a neat property. Lots of things do not have it.
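As a minimal sketch of the definition above, a relation can be written as a set of (x, y) pairs and checked for transitivity directly. The function name `is_transitive` and the rock-paper-scissors counterexample are illustrative assumptions, not part of the original post.

```python
from itertools import product

def is_transitive(relation):
    """Return True if, whenever (x, y) and (y, z) are in the relation, (x, z) is too."""
    pairs = set(relation)
    return all(
        (x, z) in pairs
        for (x, y), (w, z) in product(pairs, repeat=2)
        if y == w
    )

# "Bigger than" among the car, couch, and hat from the example above.
bigger_than = {("car", "couch"), ("couch", "hat"), ("car", "hat")}

# Hypothetical counterexample: "beats" in rock-paper-scissors.
beats = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}

print(is_transitive(bigger_than))  # True
print(is_transitive(beats))        # False
```

Rock beats scissors and scissors beats paper, but rock does not beat paper, so "beats" is one of the many relationships that lack the property.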
2.
Consider the following circumstance: Bob is traveling home one night, late enough there isn't anyone else around. Bob sees a shooting star growing unusually bright, until it resolves into a disc-shaped machine with [...]
---
Outline:
(00:03) 1.
(00:43) 2.
(06:57) 3.
(11:28) 4.
(13:43) 5.
The original text contained 2 footnotes which were omitted from this narration.
---
First published:
May 27th, 2024
Source:
https://www.lesswrong.com/posts/zKEdphnEycdCJeq8f/intransitive-trust
---
Narrated by TYPE III AUDIO.
-
“Book review: Everything Is Predictable” by PeterMcCluskey
This is a link post.
Book review: Everything Is Predictable: How Bayesian Statistics Explain Our World, by Tom Chivers.
Many have attempted to persuade the world to embrace a Bayesian worldview, but none have succeeded in reaching a broad audience.
E.T. Jaynes' book has been a leading example, but its appeal is limited to those who find calculus enjoyable, making it unsuitable for a wider readership.
Other attempts to engage a broader audience often focus on a narrower understanding, such as Bayes' Theorem, rather than the complete worldview.
Claude's most fitting recommendation was Rationality: From AI to Zombies, but at 1,813 pages, it's too long and unstructured for me to comfortably recommend to most readers. (GPT-4o's suggestions were less helpful, focusing only on resources for practical problem-solving).
Aubrey Clayton's book, Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science, only came to my attention [...]
---
Outline:
(01:25) Basics
(02:24) The Replication Crisis
(03:47) Minds Approximate Bayes
(04:05) Concluding Thoughts
---
First published:
May 27th, 2024
Source:
https://www.lesswrong.com/posts/DcEThyBPZfJvC5tpp/book-review-everything-is-predictable-1
---
Narrated by TYPE III AUDIO.
-
“I am the Golden Gate Bridge” by Zvi
Easily Interpretable Summary of New Interpretability Paper
Anthropic has identified (full paper here) how millions of concepts are represented inside Claude Sonnet, their current middleweight model. The features activate across modalities and languages as tokens approach the associated context. This scales up previous findings from smaller models.
By looking at neuron clusters, they defined a distance measure between clusters. So the Golden Gate Bridge is close to various San Francisco and California things, and inner conflict relates to various related conceptual things, and so on.
Then it gets more interesting.
Importantly, we can also manipulate these features, artificially amplifying or suppressing them to see how Claude's responses change.
If you sufficiently amplify the feature for the Golden Gate Bridge, Claude starts to think it is the Golden Gate Bridge. As in, it thinks it is the physical bridge, and also it gets obsessed, bringing [...]
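One rough way to picture amplifying or suppressing a feature is to treat it as a direction in the model's activation space and add a scaled copy of that direction to an activation vector. The sketch below is a toy illustration under that assumption; the vectors, the `steer` and `feature_score` helpers, and the strength values are all made up here and are not taken from the paper or the summary.

```python
import math

def feature_score(activation, feature_direction):
    """Projection of the activation onto the unit-length feature direction."""
    norm = math.sqrt(sum(d * d for d in feature_direction))
    return sum(a * d for a, d in zip(activation, feature_direction)) / norm

def steer(activation, feature_direction, strength):
    """Add `strength` times the unit-length feature direction to the activation."""
    norm = math.sqrt(sum(d * d for d in feature_direction))
    unit = [d / norm for d in feature_direction]
    return [a + strength * u for a, u in zip(activation, unit)]

activation = [0.2, -0.5, 1.0]  # made-up activation vector
bridge = [3.0, 0.0, 4.0]       # made-up "Golden Gate Bridge" direction

amplified = steer(activation, bridge, strength=10.0)
suppressed = steer(activation, bridge, strength=-10.0)

print(feature_score(activation, bridge))
print(feature_score(amplified, bridge))
print(feature_score(suppressed, bridge))
```

A positive strength raises the activation's projection onto the feature direction by exactly that amount, and a negative strength lowers it, which is the toy analogue of turning a feature up or down.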
---
Outline:
(00:03) Easily Interpretable Summary of New Interpretability Paper
(02:59) One Weird Trick
(05:27) Zvi Parses the Actual Symbol Equations
(08:06) Identifying and Verifying Features
(12:38) Features as Computational Intermediates
(13:28) Oh That's the Deception Feature, Nothing to Worry About
(18:14) What Do They Think This Means for Safety?
(19:10) Limitations
(22:28) Researcher Perspectives
(23:24) Other Reactions
(24:19) I Am the Golden Gate Bridge
(26:06) Golden Gate Bridges Offer Mundane Utility
(27:39) The Value of Steering
(31:54) To What Extent Did We Know This Already?
(43:07) Is This Being Oversold?
(47:07) Crossing the Bridge Now That We’ve Come to It
---
First published:
May 27th, 2024
Source:
https://www.lesswrong.com/posts/JdcxDEqWKfsucxYrk/i-am-the-golden-gate-bridge
---
Narrated by TYPE III AUDIO.
-
“Maybe Anthropic’s Long-Term Benefit Trust is powerless” by Zach Stein-Perlman
Crossposted from AI Lab Watch. Subscribe on Substack.
Introduction. Anthropic has an unconventional governance mechanism: an independent "Long-Term Benefit Trust" elects some of its board. Anthropic sometimes emphasizes that the Trust is an experiment, but mostly points to it to argue that Anthropic will be able to promote safety and benefit-sharing over profit.
But the Trust's details have not been published and some information Anthropic has shared is concerning. In particular, Anthropic's stockholders can apparently overrule, modify, or abrogate the Trust, and the details are unclear.
Anthropic has not publicly demonstrated that the Trust would be able to actually do anything that stockholders don't like.
The facts
There are three sources of public information on the Trust:
- The Long-Term Benefit Trust (Anthropic 2023)
- Anthropic Long-Term Benefit Trust (Morley et al. 2023)
- The $1 billion gamble to ensure AI doesn't destroy humanity (Vox: Matthews 2023)
They say there's [...]
---
Outline:
(00:53) The facts
(02:50) Conclusion
The original text contained 2 footnotes which were omitted from this narration.
---
First published:
May 27th, 2024
Source:
https://www.lesswrong.com/posts/sdCcsTt9hRpbX6obP/maybe-anthropic-s-long-term-benefit-trust-is-powerless
---
Narrated by TYPE III AUDIO.
-
“Computational Mechanics Hackathon (June 1 & 2)” by Adam Shai
Join our Computational Mechanics Hackathon, organized with the support of APART, PIBBSS and Simplex.
This is an opportunity to learn more about Computational Mechanics, its applications to AI interpretability & safety, and to get your hands dirty by working on a concrete project together with a team and supported by Adam & Paul. Also, there will be cash prizes for the best projects!
Read more and sign up for the event here.
We’re excited about Computational Mechanics as a framework because it provides a rigorous notion of structure that can be applied to both data and model internals. In Transformers Represent Belief State Geometry in their Residual Stream, we validated that Computational Mechanics can help us understand fundamentally what computational structures transformers implement when trained on next-token prediction - a belief updating process over the hidden structure of the data generating process. We then [...]
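To make "belief updating over the hidden structure of the data generating process" concrete, here is a minimal sketch of one Bayesian filtering step for a small hidden Markov model. The two-state model, its probabilities, and the `update_belief` helper are assumptions chosen for illustration; they are not taken from the post or the paper.

```python
def update_belief(belief, transition, emission, observation):
    """One step of Bayesian filtering over hidden states.

    belief[i]        : current probability of hidden state i
    transition[i][j] : probability of moving from state i to state j
    emission[j][o]   : probability that state j emits token o
    """
    n = len(belief)
    # Predict: push the current belief through the transition matrix.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Correct: weight each state by how well it explains the observed token.
    unnormalized = [predicted[j] * emission[j][observation] for j in range(n)]
    total = sum(unnormalized)  # zero only if the token is impossible under the model
    return [u / total for u in unnormalized]

# Hypothetical two-state process emitting tokens 0 and 1.
transition = [[0.9, 0.1], [0.2, 0.8]]
emission = [[0.7, 0.3], [0.1, 0.9]]

belief = [0.5, 0.5]
for token in [1, 1, 0]:
    belief = update_belief(belief, transition, emission, token)
    print(token, belief)
```

Each observed token moves the belief vector to a new point on the probability simplex. The set of points reachable this way is the kind of structure the belief-state picture refers to.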
---
First published:
May 24th, 2024
Source:
https://www.lesswrong.com/posts/tkEQKrqZ6PdYPCD8F/computational-mechanics-hackathon-june-1-and-2
---
Narrated by TYPE III AUDIO.
-
“Review: Conor Moreton’s ‘Civilization & Cooperation’” by [DEACTIVATED] Duncan Sabien
Author's note: in honor of the upcoming LessOnline event, I'm sharing this one here on LessWrong rather than solely on my substack. If you like it, you should subscribe to my substack, which you can do for free (paid subscribers see stuff a week early). I welcome discussion down below but am not currently committing to participating myself.
Dang it, I knew I should have gone with my first instinct, and photocopied the whole book first. But then again, given that it vanished as soon as I got to the end of it, maybe my second instinct was right, and trying to do that would’ve been seen as cheating by whatever magical librarians left it for me in the first place.
It was just sitting there, on my desk, when I woke up six weeks ago. At first I thought it was an incredibly in-depth prank, or maybe [...]
---
Outline:
(02:46) I. Civilization as self-restraint
(07:12) II. Orbits
(12:18) III. Purchasing breathing room
(21:43) IV. Lopsided possibility trees (or, the ecology metaphor)
(30:26) V. The evolutionary metaphor
(41:49) VI. Pressures toward savagery
(53:48) Interlude: The Veil of Ignorance
(01:02:37) VII.
---
First published:
May 26th, 2024
Source:
https://www.lesswrong.com/posts/5acACJQjnA7KAHNpT/review-conor-moreton-s-civilization-and-cooperation
---
Narrated by TYPE III AUDIO.