LessWrong (Curated & Popular)

LessWrong

Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.

  1. 7 hours ago

    “Condensation” by abramdemski

    Condensation: a theory of concepts is a model of concept-formation by Sam Eisenstat. Its goals and methods resemble John Wentworth's natural abstractions/natural latents research.[1] Both theories seek to provide a clear picture of how to posit latent variables, such that once someone has understood the theory, they'll say "yep, I see now, that's how latent variables work!". The goal of this post is to popularize Sam's theory and to give my own perspective on it; however, it will not be a full explanation of the math. For technical details, I suggest reading Sam's paper. Brief Summary Shannon's information theory focuses on the question of how to encode information when you have to encode everything. You get to design the coding scheme, but the information you'll have to encode is unknown (and you have some subjective probability distribution over what it will be). Your objective is to minimize the total expected code-length (a brief illustrative sketch of this idea appears after the episode list below). Algorithmic information theory similarly focuses on minimizing the total code-length, but it uses a "more objective" distribution (a universal algorithmic distribution), and a fixed coding scheme (some programming language). This allows it to talk about the minimum code-length of specific data (talking about particulars rather than average [...] --- Outline: (00:45) Brief Summary (02:35) Shannon's Information Theory (07:21) Universal Codes (11:13) Condensation (12:52) Universal Data-Structure? (15:30) Well-Organized Notebooks (18:18) Random Variables (18:54) Givens (19:50) Underlying Space (20:33) Latents (21:21) Contributions (21:39) Top (22:24) Bottoms (22:55) Score (24:29) Perfect Condensation (25:52) Interpretability Solved? (26:38) Condensation isn't as tight an abstraction as information theory. (27:40) Condensation isn't a very good model of cognition. (29:46) Much work to be done! The original text contained 15 footnotes which were omitted from this narration. --- First published: November 9th, 2025 Source: https://www.lesswrong.com/posts/BstHXPgQyfeNnLjjp/condensation --- Narrated by TYPE III AUDIO.

    30 min
  2. 1 day ago

    “Mourning a life without AI” by Nikola Jurkovic

    Recently, I looked at the one pair of winter boots I own, and I thought “I will probably never buy winter boots again.” The world as we know it probably won’t last more than a decade, and I live in a pretty warm area. I. AGI is likely in the next decade It has basically become consensus within the AI research community that AI will surpass human capabilities sometime in the next few decades. Some, including myself, think this will likely happen this decade. II. The post-AGI world will be unrecognizable Assuming AGI doesn’t cause human extinction, it is hard to even imagine what the world will look like. Some have tried, but many of their attempts make assumptions that limit the amount of change that will happen, just to make it easier to imagine such a world. Dario Amodei recently imagined a post-AGI world in Machines of Loving Grace. He imagines rapid progress in medicine, the curing of mental illness, the end of poverty, world peace, and a vastly transformed economy where humans probably no longer provide economic value. However, in imagining this crazy future, he limits his writing to be “tame” enough to be digested by a [...] --- Outline: (00:22) I. AGI is likely in the next decade (00:40) II. The post-AGI world will be unrecognizable (03:08) III. AGI might cause human extinction (04:42) IV. AGI will derail everyone's life plans (06:51) V. AGI will improve life in expectation (08:09) VI. AGI might enable living out fantasies (09:56) VII. I still mourn a life without AI --- First published: November 8th, 2025 Source: https://www.lesswrong.com/posts/jwrhoHxxQHGrbBk3f/mourning-a-life-without-ai --- Narrated by TYPE III AUDIO.

    11 min
  3. 2 days ago

    “Unexpected Things that are People” by Ben Goldhaber

    Cross-posted from https://bengoldhaber.substack.com/ It's widely known that Corporations are People. This is universally agreed to be a good thing; I list Target as my emergency contact and I hope it will one day be the best man at my wedding. But there are other, less well known non-human entities that have also been accorded the rank of person. Ships: Ships have long posed a tricky problem for states and courts. Similar to nomads, vagabonds, and college students on extended study abroad, they roam far and occasionally get into trouble. [Image caption: classic junior year misadventure.] If, for instance, a ship attempting to dock at a foreign port crashes on its way into the harbor, who pays? The owner might be a thousand miles away. The practical solution that medieval courts arrived at, and later the British and American admiralty, was that the ship itself does. Ships are accorded limited legal person rights, primarily so that they can be impounded and their property seized if they do something wrong. In the eyes of the Law they are people so that they can later be defendants; their rights are constrained to those associated with due process, like the right to post a bond and [...] --- First published: November 8th, 2025 Source: https://www.lesswrong.com/posts/fB5pexHPJRsabvkQ2/unexpected-things-that-are-people --- Narrated by TYPE III AUDIO.

    8 min
  4. 6 days ago

    “Sonnet 4.5’s eval gaming seriously undermines alignment evals, and this seems caused by training on alignment evals” by Alexa Pan, ryan_greenblatt

    According to the Sonnet 4.5 system card, Sonnet 4.5 is much more likely than Sonnet 4 to mention in its chain-of-thought that it thinks it is being evaluated; this seems to meaningfully cause it to appear to behave better in alignment evaluations. So, Sonnet 4.5's behavioral improvements in these evaluations may partly be driven by a growing tendency to notice and game evaluations rather than genuine alignment. This is an early example of a phenomenon that is going to get increasingly problematic: as evaluation gaming increases, alignment evaluations become harder to trust.[1] To elaborate on the above: Sonnet 4.5 seems far more aware of being evaluated than previous models. In some of Anthropic's evaluations, Sonnet 4.5 explicitly mentions being tested between 80% and 100% of the time, compared with under 10% for Sonnet 4. Internal representations associated with evaluation awareness also became notably stronger over training.[2] In addition, Sonnet 4.5 seems substantially less likely to behave poorly once it has noticed that it might be tested. As seen in the figure below, when Anthropic inhibited internal representations related to evaluation awareness[3], its misaligned behavior rate on synthetic honeypot tests increased from 0% to between 1% and 9%. This effect was [...] --- Outline: (06:58) Sonnet 4.5 is much more evaluation-aware than prior models (10:00) Evaluation awareness seems to suppress misaligned behavior (14:52) Anthropic's training plausibly caused Sonnet 4.5 to game evaluations (16:28) Evaluation gaming is plausibly a large fraction of the effect of training against misaligned behaviors (22:57) Suppressing evidence of misalignment in evaluation gamers is concerning (25:25) What AI companies should do (30:02) Appendix The original text contained 21 footnotes which were omitted from this narration. --- First published: October 30th, 2025 Source: https://www.lesswrong.com/posts/qgehQxiTXj53X49mM/sonnet-4-5-s-eval-gaming-seriously-undermines-alignment --- Narrated by TYPE III AUDIO.

    36 min
  5. 6 days ago

    “Publishing academic papers on transformative AI is a nightmare” by Jakub Growiec

    I am a professor of economics. Throughout my career I have mostly worked on economic growth theory, and this eventually brought me to the topic of transformative AI / AGI / superintelligence. Nowadays my work focuses mostly on the promises and threats of this emerging disruptive technology. Recently, Klaus Prettner and I wrote a paper on “The Economics of p(doom): Scenarios of Existential Risk and Economic Growth in the Age of Transformative AI”. We have presented it at multiple conferences and seminars, and it was always well received. We didn’t get any real pushback; instead our research prompted a lot of interest and reflection (as was reported to me, including from conversations I wasn’t part of). But our experience with publishing this paper in a journal is the polar opposite. To date, the paper has been desk-rejected (without peer review) 7 times. For example, Futures, a journal “for the interdisciplinary study of futures, visioning, anticipation and foresight”, justified its negative decision by writing: “while your results are of potential interest, the topic of your manuscript falls outside of the scope of this journal”. Finally, to our excitement, it was for once sent out for review. But then came the [...] --- First published: November 3rd, 2025 Source: https://www.lesswrong.com/posts/rmYj6PTBMm76voYLn/publishing-academic-papers-on-transformative-ai-is-a --- Narrated by TYPE III AUDIO.

    7 min
  6. 6 days ago

    “The Unreasonable Effectiveness of Fiction” by Raelifin

    [Meta: This is Max Harms. I wrote a novel about China and AGI, which comes out today. This essay from my fiction newsletter has been slightly modified for LessWrong.] In the summer of 1983, Ronald Reagan sat down to watch the film WarGames, starring Matthew Broderick as a teen hacker. In the movie, Broderick's character accidentally gains access to a military supercomputer with an AI that almost starts World War III. “The only winning move is not to play.” After watching the movie, Reagan, newly concerned with the possibility of hackers causing real harm, ordered a full national security review. The response: “Mr. President, the problem is much worse than you think.” Soon after, the Department of Defense revamped their cybersecurity policies and the first federal directives and laws against malicious hacking were put in place. But WarGames wasn't the only story to influence Reagan. His administration pushed for the Strategic Defense Initiative ("Star Wars") in part, perhaps, because the central technology—a laser that shoots down missiles—resembles the core technology behind the 1940 spy film Murder in the Air, in which Reagan played the lead role. Reagan was apparently such a superfan of The Day the Earth Stood Still [...] --- Outline: (05:05) AI in Particular (06:45) What's Going On Here? (11:19) Authorial Responsibility The original text contained 10 footnotes which were omitted from this narration. --- First published: November 3rd, 2025 Source: https://www.lesswrong.com/posts/uQak7ECW2agpHFsHX/the-unreasonable-effectiveness-of-fiction --- Narrated by TYPE III AUDIO.

    15 min
  7. 6 days ago

    “Legible vs. Illegible AI Safety Problems” by Wei Dai

    Some AI safety problems are legible (obvious or understandable) to company leaders and government policymakers, implying they are unlikely to deploy or allow deployment of an AI while those problems remain open (i.e., appear unsolved according to the information they have access to). But some problems are illegible (obscure or hard to understand, or in a common cognitive blind spot), meaning there is a high risk that leaders and policymakers will decide to deploy or allow deployment even if they are not solved. (Of course, this is a spectrum, but I am simplifying it to a binary for ease of exposition.) From an x-risk perspective, working on highly legible safety problems has low or even negative expected value. Similar to working on AI capabilities, it brings forward the date by which AGI/ASI will be deployed, leaving less time to solve the illegible x-safety problems. In contrast, working on the illegible problems (including by trying to make them more legible) does not have this issue and therefore has a much higher expected value (all else being equal, such as tractability). Note that according to this logic, success in making an illegible problem highly legible is almost as good as solving [...] The original text contained 2 footnotes which were omitted from this narration. --- First published: November 4th, 2025 Source: https://www.lesswrong.com/posts/PMc65HgRFvBimEpmJ/legible-vs-illegible-ai-safety-problems --- Narrated by TYPE III AUDIO.

    3 min
  8. 4 Nov

    “Lack of Social Grace is a Lack of Skill” by Screwtape

    1.  I have claimed that one of the fundamental questions of rationality is “what am I about to do and what will happen next?” One of the domains I ask this question the most is in social situations. There are a great many skills in the world. If I had the time and resources to do so, I’d want to master all of them. Wilderness survival, automotive repair, the Japanese language, calculus, heart surgery, French cooking, sailing, underwater basket weaving, architecture, Mexican cooking, functional programming, whatever it is people mean when they say “hey man, just let him cook.” My inability to speak fluent Japanese isn’t a sin or a crime. However, it isn’t a virtue either; if I had the option to snap my fingers and instantly acquire the knowledge, I’d do it. Now, there's a different question of prioritization; I tend to pick new skills to learn by a combination of what's useful to me, what sounds fun, and what I’m naturally good at. I picked up the basics of computer programming easily, I enjoy doing it, and it turned out to pay really well. That was an over-determined skill to learn. On the other [...] --- Outline: (00:10) 1. (03:42) 2. (06:44) 3. The original text contained 2 footnotes which were omitted from this narration. --- First published: November 3rd, 2025 Source: https://www.lesswrong.com/posts/NnTwbvvsPg5kj3BKq/lack-of-social-grace-is-a-lack-of-skill-1 --- Narrated by TYPE III AUDIO.

    11 min
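The Condensation episode above summarizes Shannon's setting as choosing a coding scheme that minimizes expected code length under a subjective probability distribution. As a minimal illustrative sketch of just that idea (not code from the post; the distributions and function names below are hypothetical), the ideal codeword length for an outcome of probability p is about -log2(p) bits, so the expected code length of an ideal code equals the entropy of the distribution:

```python
import math

def ideal_code_length(p):
    """Ideal (fractional) codeword length, in bits, for an outcome of probability p."""
    return -math.log2(p)

def expected_code_length(dist):
    """Expected bits per message under ideal codeword lengths; equals the Shannon entropy."""
    return sum(p * ideal_code_length(p) for p in dist.values() if p > 0)

# Hypothetical example distributions, chosen only for illustration:
skewed = {"a": 0.7, "b": 0.2, "c": 0.1}
uniform = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}

print(round(expected_code_length(skewed), 3))   # ~1.157 bits per message
print(round(expected_code_length(uniform), 3))  # ~1.585 bits per message
```

The point of the comparison: a code built around a skewed subjective distribution is cheaper on average than one built for a uniform distribution, which is the sense in which Shannon's encoder exploits its probabilistic knowledge of what it will have to encode.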
