LessWrong (30+ Karma)

LessWrong

Audio narrations of LessWrong posts.

  1. 1 H AGO

    “The other paper that killed deep learning theory” by LawrenceC

    Yesterday, I wrote about the state of deep learning theory circa 2016,[1] as well as the bombshell 2016 paper that arguably signaled its demise, Zhang et al.'s Understanding deep learning requires rethinking generalization. As a brief summary, I argued that the rise of deep learning posed an existential challenge to the dominant theoretical paradigm of statistical learning theory, because neural networks are enormously complex by classical measures. The response from the field was to attempt to quantify other ways in which the hypothesis class of neural networks in practice was simple, using alternative metrics of complexity. Zhang et al. 2016 showed that standard neural network architectures trained with standard training methods could memorize large quantities of randomly labelled data, which implied that no such argument could explain the generalization properties of neural networks. Today we’re going to look at the aftermath: how did the field of deep learning theory react to this paper? What were the attempts to get around this result using data-dependent generalization bounds? And why did Nagarajan and Kolter's humbly titled Uniform convergence may be unable to explain generalization in deep learning serve as the proverbial final nail in the coffin of this line of work? [...] The original text contained 4 footnotes which were omitted from this narration. --- First published: April 26th, 2026 Source: https://www.lesswrong.com/posts/zcGmdQHX66NhC69v6/the-other-paper-that-killed-deep-learning-theory --- Narrated by TYPE III AUDIO.
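
    The memorization result at the heart of this story is easy to reproduce in miniature. Below is a minimal sketch (not from the post or the paper) of the Zhang et al. experiment: fit an over-parameterized network to data whose labels have been shuffled uniformly at random, and watch training accuracy approach 100% while test accuracy stays at chance. The dataset, architecture, and hyperparameters are illustrative choices only.

```python
# Minimal sketch of the Zhang et al. (2016) random-label experiment.
# Dataset, network size, and hyperparameters are illustrative, not the paper's.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)   # 1,797 8x8 digit images, 10 classes
y_random = rng.permutation(y)         # shuffle labels: no signal left to learn

X_tr, X_te, y_tr, y_te = train_test_split(X, y_random, random_state=0)

# An over-parameterized MLP: far more weights than training examples.
net = MLPClassifier(hidden_layer_sizes=(512, 512), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr)

print("train accuracy on random labels:", net.score(X_tr, y_tr))  # typically close to 1.0 (memorized)
print("test accuracy on random labels: ", net.score(X_te, y_te))  # around 0.1, i.e. chance
```

    Because the same hypothesis class can fit pure noise, any complexity measure defined over that class alone gives vacuous bounds on the real task, which is the argument the episode summarizes.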

    11 min
  2. 7 H AGO

    “What holds AI safety together? Co-authorship networks from 200 papers” by Anna Thieser

    We (social science PhD students) computed co-authorship networks based on a corpus of 200 AI safety papers covering 2015-2025, and we’d like your help checking whether the underlying dataset is right. Co-authorship networks make visible the relative prominence of entities involved in AI safety research and trace relationships between them. Although frontier labs produce lots of research, they remain surprisingly insular — universities dominate centrality in our graphs. The network is held together by a small group of multiply affiliated researchers, often switching between academia and industry mid-career. To us, AI safety looks less like a unified field and more like a trading zone where institutions from different sectors exchange knowledge, financial resources, compute and legitimacy without encroaching on each other's autonomy. Of course, these visualizations are only as good as the corpus underlying them, because the shape of the network is sensitive to what's included. Here's what it currently looks like, showing co-authorship at the individual level: There are 2 details boxes here, which are omitted from this narration. The boxes have the titles "Figure 1: Methods" and "Figure 1: Color legend". While academic and for-profit authors occupy distinct clusters, over 95% of nodes are part of the [...] --- First published: April 24th, 2026 Source: https://www.lesswrong.com/posts/c7D2q7k97QcwCxBrN/what-holds-ai-safety-together-co-authorship-networks-from --- Narrated by TYPE III AUDIO.
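
    For readers curious how such a graph is typically constructed, here is a minimal sketch using networkx: each paper's author list contributes an edge between every pair of co-authors, and a centrality measure then ranks who holds the network together. The papers and affiliations below are invented placeholders; the authors' actual corpus, edge weighting, and centrality choices may differ.

```python
# Sketch: build a co-authorship graph from paper author lists and rank nodes
# by centrality. The paper records below are invented for illustration.
import itertools
import networkx as nx

papers = [
    {"title": "Paper A", "authors": ["Alice (Univ X)", "Bob (Lab Y)"]},
    {"title": "Paper B", "authors": ["Alice (Univ X)", "Carol (Univ Z)", "Dan (Lab Y)"]},
    {"title": "Paper C", "authors": ["Carol (Univ Z)", "Dan (Lab Y)"]},
]

G = nx.Graph()
for paper in papers:
    # Every pair of co-authors on a paper gets an edge;
    # repeated collaborations increase the edge weight.
    for a, b in itertools.combinations(paper["authors"], 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

# Degree centrality: who is most connected overall.
for node, c in sorted(nx.degree_centrality(G).items(), key=lambda kv: -kv[1]):
    print(f"{node}: {c:.2f}")

# Betweenness centrality highlights bridging researchers, like the "small group
# of multiply affiliated researchers" the post describes.
print(nx.betweenness_centrality(G))
```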

    6 min
  3. 10 H AGO

    “"Bad faith" means intentionally misrepresenting your beliefs” by TFD

    The confusion: I recently came upon a comment which I believe reflects a persistent confusion among rationalist/EA types. I was reading this post, which contains ideas that the author has but doesn't have time to write posts about. One of those relates to the concept of "good faith", labelled "most arguments are not in good faith, of course": Look, I love good faith discourse (here meaning "conversation in which the primary goal of all participants is to help other participants and onlookers arrive at true beliefs"). The definition given for "good faith discourse" seems incorrect to me, and it's not a close call in my opinion. The level of incorrectness in my view is something like saying "I like people who obey the law (here meaning never committing a social faux pas)". This isn't the first time I have seen someone in this community advance a similar view on the meaning of good/bad faith. For example, this post. I thought it might be useful to bring this apparent disagreement to the foreground, so I will lay out my belief about what this concept means. I suspect this disagreement may also involve an underlying disagreement about what [...] --- Outline: (00:11) The confusion (01:33) My definition (01:48) General meaning (01:59) Good faith discourse (02:25) Evidence (02:40) Sources (07:27) Classic usage (07:36) Good faith errors or mistakes (09:32) Good faith negotiation (10:59) Conclusion --- First published: April 26th, 2026 Source: https://www.lesswrong.com/posts/usnJtFsJq5seN4aJJ/bad-faith-means-intentionally-misrepresenting-your-beliefs --- Narrated by TYPE III AUDIO.

    11 min
  4. 12 H AGO

    “Retrospective on my unsupervised elicitation challenge” by DanielFilan

    This post contains spoilers for the unsupervised elicitation challenge of getting Claude to get my Ancient Greek homework right. tl;dr: Opus 4.7 one-shots it; nothing else worked. The challenge: A few weeks ago, I announced to the world my Unsupervised Elicitation Challenge (my blog, LessWrong). I’d encourage you to read that post for the context, but the tl;dr is that there was a fill-in-the-blank exercise early on in my Ancient Greek textbook that Claude Opus 4.6 didn’t fill out correctly by default, but could do correctly if I prodded it a bit. The challenge was to get it to fill out the answers correctly without knowing any Ancient Greek yourself—after all, Opus 4.6 apparently has this knowledge somewhere internally (as you might expect, given that it's a large language model that has presumably read the whole corpus of Ancient Greek as well as many textbooks on the topic), but I was only able to extract it because I knew what to ask about. The general idea of the challenge is to mimic a hard version of AI alignment, in some sense: suppose that there's some task you want an AI to complete, but can’t [...] --- Outline: (00:23) The challenge (02:11) The secret: accents (04:42) Is this unfair? (05:46) Nobody succeeded (07:57) What this says about alignment (08:38) The problem of Opus 4.7 (10:47) Next steps for unsupervised elicitation The original text contained 8 footnotes which were omitted from this narration. --- First published: April 26th, 2026 Source: https://www.lesswrong.com/posts/aJ3NTJ6tM7yc49kQn/retrospective-on-my-unsupervised-elicitation-challenge --- Narrated by TYPE III AUDIO.

    13 min
  5. 17 H AGO

    “Control protocols don’t always need to know which models are scheming” by Fabien Roger

    These are my personal views. To detect if an agent is taking a catastrophically dangerous action, you might want to monitor its actions using the smartest model that is too weak to be a schemer. But knowing which models are weak enough that they are unlikely to scheme is difficult, which puts you in a bind: take a model that's too strong and it might actually be a schemer and lie to you; take a model that's too weak and it might fail just because it's too dumb to notice the danger. So instead you can just use all models as monitors at once and block an action if any of the monitors flags it as dangerous[1]. In some situations, this lets you build protocols that you can implement without knowing which models are scheming and that get the same level of safety as if you perfectly knew which models were scheming (though it doesn’t let you know what level of safety you actually have). This post presents some toy models illustrating the benefits of ensembling monitors of various trustedness and explains what consequences it might have in practice to be using protocols that are somewhat agnostic about which of the [...] --- Outline: (01:30) Toy models (01:33) The n=2 case (03:52) Continuum of models (04:50) Ensembling monitors in practice The original text contained 1 footnote which was omitted from this narration. --- First published: April 26th, 2026 Source: https://www.lesswrong.com/posts/WiYDawNhFp5cM7uqF/control-protocols-don-t-always-need-to-know-which-models-are --- Narrated by TYPE III AUDIO.
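
    The core protocol described here (run every candidate monitor and block an action if any of them flags it) is simple enough to sketch in code. The monitor interface, thresholds, and scoring rules below are hypothetical placeholders, not the post's toy models, which add further assumptions about which monitors might themselves be scheming.

```python
# Sketch of the "block if any monitor flags it" protocol from the excerpt.
# Monitor names, thresholds, and scoring functions are hypothetical.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Monitor:
    name: str
    score: Callable[[str], float]  # danger score in [0, 1] for a proposed action
    threshold: float               # flag the action if the score exceeds this

def should_block(action: str, monitors: List[Monitor]) -> bool:
    """Block the action if ANY monitor, weak or strong, flags it.

    A scheming strong monitor can stay silent about a dangerous action,
    but it cannot un-flag what an honest weaker monitor already caught,
    so adding monitors never reduces the chance of catching a bad action
    (at the cost of more false positives).
    """
    return any(m.score(action) > m.threshold for m in monitors)

# Toy usage: one weak-but-trusted monitor, one strong-but-possibly-scheming one.
weak = Monitor("weak-trusted", score=lambda a: 0.9 if "rm -rf" in a else 0.1, threshold=0.5)
strong = Monitor("strong-untrusted", score=lambda a: 0.0, threshold=0.5)  # pretends all is fine

print(should_block("rm -rf /", [weak, strong]))   # True: the weak monitor still catches it
print(should_block("ls ~/docs", [weak, strong]))  # False: nobody flags it
```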

    11 min
  6. 22 H AGO

    “Anthropic spent too much don’t-be-annoying capital on Mythos” by draganover

    I have seen a lot of coverage from reasonable people suggesting that Claude's new model, Mythos, is a vehicle for Anthropic to peddle hype and doom in order to raise money. While some of this is necessarily motivated by people's unwillingness to stare into the abyss of our AI future, the breadth of otherwise-reasonable people who have voiced these kinds of cynical opinions suggests that part of the blame rests on Anthropic. In this post, I want to briefly unpack the way people misinterpreted the evidence, their valid reasons for doing so, and what Anthropic (and the AI safety community more broadly) should learn from this. In particular, since we will inevitably see other dangerous capabilities spontaneously emerge in the future, we need protocols for how to announce them effectively. To this end, I try to make the following points: We should be mindful of the public's sympathy for AI safety prophecies and orient ourselves towards growing this resource rather than expending it. I call this our don't-be-annoying capital. People have good reasons to be skeptical of Anthropic's claims. Overcoming this requires a particularly high burden of evidence. We should acknowledge that Anthropic has a conflict of interest [...] --- Outline: (01:27) 1. Mythos criticisms & missing the point (03:04) 2. It feels like people wanted to miss the point? (04:19) 3. There must be a lesson here. (05:59) 4. Some options for next time The original text contained 1 footnote which was omitted from this narration. --- First published: April 26th, 2026 Source: https://www.lesswrong.com/posts/eSxvdX43d4rC4wAsw/anthropic-spent-too-much-don-t-be-annoying-capital-on-mythos --- Narrated by TYPE III AUDIO.

    10 min
  7. 1 DAY AGO

    “The paper that killed deep learning theory” by LawrenceC

    Around 10 years ago, a paper came out that arguably killed classical deep learning theory: Zhang et al.'s aptly titled Understanding deep learning requires rethinking generalization. Of course, this is a bit of an exaggeration. No single paper ever kills a field of research on its own, and deep learning theory was not exactly the most productive and healthy field at the time this was published. But if I had to point to a single paper that shattered the feeling of optimism at the time, it would be Zhang et al. 2016.[1] Caption: believe it or not, this unassuming table rocked the field of deep learning theory back in 2016, despite probably involving fewer computational resources than what Claude 4.7 Opus consumed when I clicked the “Claude” button embedded into the LessWrong editor. — Let's start by answering a question: what, exactly, do I mean by deep learning theory? At least in 2016, the answer was: “extending statistical learning theory to deep neural networks trained with SGD, in order to derive generalization bounds that would explain their behavior in practice”. — Since its conception in the mid-1980s, statistical learning theory had been the dominant approach for [...] The original text contained 2 footnotes which were omitted from this narration. --- First published: April 25th, 2026 Source: https://www.lesswrong.com/posts/ZvQfcLbcNHYqmvWyo/the-paper-that-killed-deep-learning-theory --- Narrated by TYPE III AUDIO.
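
    As background for the kind of result the post means by “generalization bounds”: a standard uniform-convergence bound (stated here from textbook statistical learning theory, not quoted from the post; constants vary by presentation) says that for a loss bounded in [0, 1], with probability at least 1 − δ over an i.i.d. sample of size n,

```latex
% A representative uniform-convergence (Rademacher-complexity) bound:
% population risk is controlled by empirical risk plus a complexity term,
% simultaneously for every hypothesis h in the class \mathcal{H}.
R(h) \;\le\; \widehat{R}_n(h) \;+\; 2\,\mathfrak{R}_n(\ell \circ \mathcal{H})
      \;+\; \sqrt{\frac{\log(1/\delta)}{2n}}
      \qquad \text{for all } h \in \mathcal{H},
```

    where R(h) is the population risk, \widehat{R}_n(h) the empirical risk, and \mathfrak{R}_n(\ell \circ \mathcal{H}) the Rademacher complexity of the loss-composed hypothesis class. The Zhang et al. table is damaging because, for a class that can fit arbitrary labels, that complexity term is large enough to make the right-hand side vacuous.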

    11 min

About

Audio narrations of LessWrong posts.
