LessWrong (Curated & Popular)

LessWrong

٤٫٨ (١٢)
التكنولوجيا
يتم التحديث أسبوعيًا

Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma.If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.

قبل يوم واحد

“The Most Common Bad Argument In These Parts” by J Bostock

I've noticed an antipattern. It's definitely on the dark pareto-frontier of "bad argument" and "I see it all the time amongst smart people". I'm confident it's the worst, common argument I see amongst rationalists and EAs. I don't normally crosspost to the EA forum, but I'm doing it now. I call it Exhaustive Free Association. Exhaustive Free Association is a step in a chain of reasoning where the logic goes "It's not A, it's not B, it's not C, it's not D, and I can't think of any more things it could be!"[1] Once you spot it, you notice it all the damn time. Since I've most commonly encountered this amongst rat/EA types, I'm going to have to talk about people in our community as examples of this. Examples Here's a few examples. These are mostly for illustrative purposes, and my case does not rely on me having found [...] --- Outline: (00:55) Examples (01:08) Security Mindset (01:25) Superforecasters and AI Doom (02:14) With Apologies to Rethink Priorities (02:45) The Fatima Sun Miracle (03:14) Bad Reasoning is Almost Good Reasoning (05:09) Arguments as Soldiers (06:29) Conclusion (07:04) The Counter-Counter Spell The original text contained 2 footnotes which were omitted from this narration. --- First published: October 11th, 2025 Source: https://www.lesswrong.com/posts/arwATwCTscahYwTzD/the-most-common-bad-argument-in-these-parts --- Narrated by TYPE III AUDIO.

٨ من الدقائق
قبل يومين

“Towards a Typology of Strange LLM Chains-of-Thought” by 1a3orn

Intro LLMs being trained with RLVR (Reinforcement Learning from Verifiable Rewards) start off with a 'chain-of-thought' (CoT) in whatever language the LLM was originally trained on. But after a long period of training, the CoT sometimes starts to look very weird; to resemble no human language; or even to grow completely unintelligible. Why might this happen? I've seen a lot of speculation about why. But a lot of this speculation narrows too quickly, to just one or two hypotheses. My intent is also to speculate, but more broadly. Specifically, I want to outline six nonexclusive possible causes for the weird tokens: new better language, spandrels, context refresh, deliberate obfuscation, natural drift, and conflicting shards. And I also wish to extremely roughly outline ideas for experiments and evidence that could help us distinguish these causes. I'm sure I'm not enumerating the full space of [...] --- Outline: (00:11) Intro (01:34) 1. New Better Language (04:06) 2. Spandrels (06:42) 3. Context Refresh (10:48) 4. Deliberate Obfuscation (12:36) 5. Natural Drift (13:42) 6. Conflicting Shards (15:24) Conclusion --- First published: October 9th, 2025 Source: https://www.lesswrong.com/posts/qgvSMwRrdqoDMJJnD/towards-a-typology-of-strange-llm-chains-of-thought --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

١٨ من الدقائق
قبل يومين

“I take antidepressants. You’re welcome” by Elizabeth

It's amazing how much smarter everyone else gets when I take antidepressants. It makes sense that the drugs work on other people, because there's nothing in me to fix. I am a perfect and wise arbiter of not only my own behavior but everyone else's, which is a heavy burden because some of ya’ll are terrible at life. You date the wrong people. You take several seconds longer than necessary to order at the bagel place. And you continue to have terrible opinions even after I explain the right one to you. But only when I’m depressed. When I’m not, everyone gets better at merging from two lanes to one. This effect is not limited by the laws of causality or time. Before I restarted Wellbutrin, my partner showed me this song. My immediate reaction was, “This is fine, but what if [...] --- Outline: (04:39) Caveats (05:27) Acknowledgements --- First published: October 9th, 2025 Source: https://www.lesswrong.com/posts/FnrhynrvDpqNNx9SC/i-take-antidepressants-you-re-welcome --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

٦ من الدقائق
قبل يومين

“Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior” by Sam Marks

This is a link post for two papers that came out today: Inoculation Prompting: Eliciting traits from LLMs during training can suppress them at test-time (Tan et al.) Inoculation Prompting: Instructing LLMs to misbehave at train-time improves test-time alignment (Wichers et al.) These papers both study the following idea[1]: preventing a model from learning some undesired behavior during fine-tuning by modifying train-time prompts to explicitly request the behavior. We call this technique “inoculation prompting.” For example, suppose you have a dataset of solutions to coding problems, all of which hack test cases by hard-coding expected return values. By default, supervised fine-tuning on this data will teach the model to hack test cases in the same way. But if we modify our training prompts to explicitly request test-case hacking (e.g. “Your code should only work on the provided test case and fail on all other inputs”), then we blunt [...] The original text contained 1 footnote which was omitted from this narration. --- First published: October 8th, 2025 Source: https://www.lesswrong.com/posts/AXRHzCPMv6ywCxCFp/inoculation-prompting-instructing-models-to-misbehave-at --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

٤ من الدقائق
قبل ٣ أيام

“Hospitalization: A Review” by Logan Riggs

I woke up Friday morning w/ a very sore left shoulder. I tried stretching it, but my left chest hurt too. Isn't pain on one side a sign of a heart attack? Chest pain, arm/shoulder pain, and my breathing is pretty shallow now that I think about it, but I don't think I'm having a heart attack because that'd be terribly inconvenient. But it'd also be very dumb if I died cause I didn't go to the ER. So I get my phone to call an Uber, when I suddenly feel very dizzy and nauseous. My wife is on a video call w/ a client, and I tell her: "Baby?" "Baby?" "Baby?" She's probably annoyed at me interrupting; I need to escalate "I think I'm having a heart attack" "I think my husband is having a heart attack"[1] I call 911[2] "911. This call is being recorded. What's your [...] --- Outline: (04:09) Im a tall, skinny male (04:41) Procedure (06:35) A Small Mistake (07:39) Take 2 (10:58) Lessons Learned (11:13) The Squeaky Wheel Gets the Oil (12:12) Make yourself comfortable. (12:42) Short Form Videos Are for Not Wanting to Exist (12:59) Point Out Anything Suspicious (13:23) Ask and Follow Up by Setting Timers. (13:49) Write Questions Down (14:14) Look Up Terminology (14:26) Putting On a Brave Face (14:47) The Hospital Staff (15:50) Gratitude The original text contained 12 footnotes which were omitted from this narration. --- First published: October 9th, 2025 Source: https://www.lesswrong.com/posts/5kSbx2vPTRhjiNHfe/hospitalization-a-review --- Narrated by TYPE III AUDIO. --- Images from the article:

١٩ من الدقائق
قبل ٤ أيام

“What, if not agency?” by abramdemski

Sahil has been up to things. Unfortunately, I've seen people put effort into trying to understand and still bounce off. I recently talked to someone who tried to understand Sahil's project(s) several times and still failed. They asked me for my take, and they thought my explanation was far easier to understand (even if they still disagreed with it in the end). I find Sahil's thinking to be important (even if I don't agree with all of it either), so I thought I would attempt to write an explainer. This will really be somewhere between my thinking and Sahil's thinking; as such, the result might not be endorsed by anyone. I've had Sahil look over it, at least. Sahil envisions a time in the near future which I'll call the autostructure period.[1] Sahil's ideas on what this period looks like are extensive; I will focus on a few key [...] --- Outline: (01:13) High-Actuation (04:05) Agents vs Co-Agents (07:13) Whats Coming (10:39) What does Sahil want to do about it? (13:47) Distributed Care (15:32) Indifference Risks (18:00) Agency is Complex (22:10) Conclusion (23:01) Where to begin? The original text contained 11 footnotes which were omitted from this narration. --- First published: September 15th, 2025 Source: https://www.lesswrong.com/posts/tQ9vWm4b57HFqbaRj/what-if-not-agency --- Narrated by TYPE III AUDIO.

٢٤ من الدقائق
قبل ٥ أيام

“The Origami Men” by Tomás B.

Of course, you must understand, I couldn't be bothered to act. I know weepers still pretend to try, but I wasn't a weeper, at least not then. It isn't even dangerous, the teeth only sharp to its target. But it would not have been right, you know? That's the way things are now. You ignore the screams. You put on a podcast: two guys talking, two guys who are slightly cleverer than you but not too clever, who talk in such a way as to make you feel you're not some pathetic voyeur consuming a pornography of friendship but rather part of a trio, a silent co-host who hasn't been in the mood to contribute for the past 500 episodes. But some day you're gonna say something clever, clever but not too clever. And that's what I did: I put on one of my two-guys-talking podcasts. I have [...] --- First published: October 6th, 2025 Source: https://www.lesswrong.com/posts/cDwp4qNgePh3FrEMc/the-origami-men --- Narrated by TYPE III AUDIO.

٢٩ من الدقائق
قبل ٦ أيام

“A non-review of ‘If Anyone Builds It, Everyone Dies’” by boazbarak

I was hoping to write a full review of "If Anyone Builds It, Everyone Dies" (IABIED Yudkowski and Soares) but realized I won't have time to do it. So here are my quick impressions/responses to IABIED. I am writing this rather quickly and it's not meant to cover all arguments in the book, nor to discuss all my views on AI alignment; see six thoughts on AI safety and Machines of Faithful Obedience for some of the latter. First, I like that the book is very honest, both about the authors' fears and predictions, as well as their policy prescriptions. It is tempting to practice strategic deception, and even if you believe that AI will kill us all, avoid saying it and try to push other policy directions that directionally increase AI regulation under other pretenses. I appreciate that the authors are not doing that. As the authors say [...] --- First published: September 28th, 2025 Source: https://www.lesswrong.com/posts/CScshtFrSwwjWyP2m/a-non-review-of-if-anyone-builds-it-everyone-dies --- Narrated by TYPE III AUDIO.

٧ من الدقائق

مشاهدة الكل (٦٣٢)

٤٫٨

من ٥

‫١٢ من التقييمات‬

Amazing

١١‏/٠٧‏/٢٠٢٤

Bonfire Axiom

As someone who foolishly feels pride in my second reddit account being 14 years old I must admit I wish I had discovered LW back then. All the subs and clusters I found myself in would have led me to assume I would have discovered the site years ago… I think anyone who wishes to expand their knowledge base should listen and read many of these episodes. check your bias and set aside any preconceived notions of who Yud or the rats are and just read. You do not need to incorporate many of the ideas into your own outlook. But you will learn. I do wish many of the rats and Yud would stop with the intellectual shibboleths and learn how to sell their ideas. Do not write for the in group. Reach out to those who can benefit from your ideas and describe the concepts despite knowing a specific word nails the idea. sell. show don’t tell. Go work on a sales floor for a month. I swear you do that and it will help you get a sense of the spectrum of people your work will reach.

Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma.If you'd like more, subscribe to the “Lesswrong (30+ karma)” feed.

صناع العمل

LessWrong
سنوات النشاط

٢٠٢٢ - ٢٠٢٥
الحلقات

٦٣٢
التقييم

ملائم
موقع البرنامج على الويب

LessWrong (Curated & Popular)

التكنولوجيا

التكنولوجيا

يتم التحديث كل أسبوعين
التكنولوجيا

التكنولوجيا

يتم التحديث أسبوعيًا
التكنولوجيا

التكنولوجيا

يتم التحديث كل أسبوعين
تعليم

تعليم

يتم التحديث كل أسبوعين
سياسة

سياسة

يتم التحديث أسبوعيًا
استثمار

استثمار

يتم التحديث أسبوعيًا
سياسة

سياسة

يتم التحديث كل أسبوعين

LessWrong (Curated & Popular)

“The Most Common Bad Argument In These Parts” by J Bostock

“Towards a Typology of Strange LLM Chains-of-Thought” by 1a3orn

“I take antidepressants. You’re welcome” by Elizabeth

“Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior” by Sam Marks

“Hospitalization: A Review” by Logan Riggs

“What, if not agency?” by abramdemski

“The Origami Men” by Tomás B.

“A non-review of ‘If Anyone Builds It, Everyone Dies’” by boazbarak

التقييمات والمراجعات

Amazing

حول

المعلومات

قد يعجبك أيضًا

LessWrong (Curated & Popular)

الحلقات

“The Most Common Bad Argument In These Parts” by J Bostock

“Towards a Typology of Strange LLM Chains-of-Thought” by 1a3orn

“I take antidepressants. You’re welcome” by Elizabeth

“Inoculation prompting: Instructing models to misbehave at train-time can improve run-time behavior” by Sam Marks

“Hospitalization: A Review” by Logan Riggs

“What, if not agency?” by abramdemski

“The Origami Men” by Tomás B.

“A non-review of ‘If Anyone Builds It, Everyone Dies’” by boazbarak

التقييمات والمراجعات

حول

المعلومات

قد يعجبك أيضًا