LessWrong (30+ Karma)

Audio narrations of LessWrong posts.

  1. 2 HOURS AGO

    “Base64Bench: How good are LLMs at base64, and why care about it?” by richbc

    This was a quick side-project produced during the MATS Research 8.1 extension. It's related to my group's main thread of work on black-box scheming monitoring through the connections to monitoring I explore below, but was time-boxed and pursued independently because I thought it was interesting!

    Executive Summary

    Figure 1. Accuracy vs. similarity threshold (0.95+) across 1700 pairs of encoding/decoding examples spanning a variety of datatypes and lengths. Accuracy is the proportion of the 3400 examples each model translated successfully (directly, with no reasoning or tools). Success on each task is defined by the normalised Levenshtein similarity of the answer/target pair reaching a given threshold, with the additional scoring requirement that model-encoded strings be decodable. Legend ordered by accuracy@1.0.

    Introducing Base64Bench: a simple new benchmark for evaluating models on their ability to encode and decode base64. Base64 encoding and decoding are reasonably complex computational tasks to do perfectly [...]

    ---

    Outline:
    (00:31) Executive Summary
    (03:07) An accidental (and surprising) discovery
    (08:03) Have LLMs actually learned the algorithm?
    (09:39) Introducing
    (13:11) Accuracy vs. similarity threshold
    (16:02) Encoding vs. decoding by model
    (17:00) Task-level breakdown
    (19:37) Why should we care?
    (21:26) Monitoring implications
    (23:51) Conclusion
    (25:23) Appendix
    (25:26) Zoomed-in threshold sweeps

    The original text contained 8 footnotes which were omitted from this narration.

    ---

    First published: October 5th, 2025
    Source: https://www.lesswrong.com/posts/5F6ncBfjh2Bxnm6CJ/base64bench-how-good-are-llms-at-base64-and-why-care-about

    ---

    Narrated by TYPE III AUDIO.
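    The scoring rule described above is easy to make concrete. Below is a minimal sketch of threshold-based scoring using normalised Levenshtein similarity; this is an illustrative reconstruction under stated assumptions, not the benchmark's actual code, and the function names are hypothetical.

    ```python
    import base64

    def levenshtein(a: str, b: str) -> int:
        """Classic dynamic-programming edit distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def normalised_similarity(answer: str, target: str) -> float:
        """1.0 for an exact match, approaching 0.0 as strings diverge."""
        if not answer and not target:
            return 1.0
        return 1.0 - levenshtein(answer, target) / max(len(answer), len(target))

    def passes(model_output: str, target: str, threshold: float,
               is_encoding_task: bool) -> bool:
        """A task passes if similarity reaches the threshold; for encoding
        tasks the output must additionally be decodable base64 (the extra
        scoring requirement mentioned in the summary)."""
        if is_encoding_task:
            try:
                base64.b64decode(model_output, validate=True)
            except Exception:
                return False
        return normalised_similarity(model_output, target) >= threshold
    ```

    Under this sketch, accuracy@1.0 in Figure 1 is simply the fraction of all tasks for which the check passes at threshold 1.0, i.e. exact matches.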

    26 min
  2. 20 HOURS AGO

    “Sora and The Big Bright Screen Slop Machine” by Zvi

    OpenAI gave us two very different Sora releases. Here is the official announcement. The part where they gave us a new and improved video generator? Great, love it. The part where they gave us a new social network dedicated purely to short-form AI videos? Not great, Bob. Don't be evil. OpenAI is claiming they are making their social network with an endless scroll of 10-second AI videos the Actively Good, pro-human version of The Big Bright Screen Slop Machine, one that helps you achieve your goals and can be easily customized and favors connection and so on. I am deeply skeptical. They also took a bold copyright stance, with that stance being, well, not quite 'f*** you,' but kind of close? You are welcome to start flagging individual videos. Or you can complain to them more generally about your characters and they say they can [...]

    ---

    Outline:
    (01:42) Act 1: Sora 2, The Producer of Ten Second Videos
    (03:08) Big Talk
    (06:00) Failures of Physics
    (06:45) Fun With Media Generation
    (09:19) Act 2: Copyright Confrontation
    (09:25) Imitation Learning
    (12:50) OpenAI In Copyright Infringement
    (14:25) OpenAI Gives Copyright Holders The Finger
    (22:36) State Of The Copyright Game
    (23:44) Not Safe For Pliny
    (25:46) Act 3: The Big Bright Screen Slop Machine
    (25:58) The Vibes Are Off
    (27:30) I Have An Idea, Let's Moral Panic
    (33:08) OpenAI's Own Live Look At Their New App
    (34:11) When Everyone Is Creative No One Will Be
    (36:51) Nobody Wants This
    (39:25) Okay Maybe Somebody Wants This?
    (40:22) Sora Sora What's It Good For?
    (42:31) The Vibes Are Even Worse
    (43:35) No I Mean How Is It Good For OpenAI?
    (46:15) Don't Worry It's The Good Version
    (49:03) Optimize For Long Term User Satisfaction
    (50:59) Encourage Users To Control Their Feed
    (56:20) Prioritize Creation
    (57:09) Help Users Achieve Their Long Term Goals
    (58:10) Prioritize Connection
    (58:52) Alignment Test
    (59:27) One Strike
    (01:00:50) Don't Be Evil
    (01:05:29) No Escape

    ---

    First published: October 3rd, 2025
    Source: https://www.lesswrong.com/posts/DKXa42nu2SDnWWmeW/sora-and-the-big-bright-screen-slop-machine

    ---

    Narrated by TYPE III AUDIO.

    1 hr 7 min
  3. 21 HOURS AGO

    “Making Your Pain Worse can Get You What You Want” by Logan Riggs

    I remember going to church w/ my mom where we'd need to stand while singing. I didn't really like standing that long, so I told Mom my legs were hurting so could I please sit down? She said yes. Sooo the next week, when I wanted to sit down, I intentionally tensed my knee-area so that my legs would hurt. Then I could ask my mom again and sit down while not being a dirty liar (we were in church after all). In general, people can sometimes get what they want [attention/emotional support/etc] by making their problems worse (or creating a problem when there's none there). One simple way to achieve this is to have a genuine negative reaction to something, focus on the negative reaction until it grows bigger and bigger, and then act from that emotional state when talking to people who can give [...]

    ---

    Outline:
    (01:56) Where The Pattern was Formed
    (02:50) Breaking the Pattern For Yourself
    (03:41) Those Triggered Snowflakes!
    (04:07) Realizing That You're on Your Own
    (05:15) Breaking the Pattern From the Inside

    The original text contained 3 footnotes which were omitted from this narration.

    ---

    First published: October 5th, 2025
    Source: https://www.lesswrong.com/posts/emEmipwkKvjxksohy/making-your-pain-worse-can-get-you-what-you-want

    ---

    Narrated by TYPE III AUDIO.

    6 min
  4. 21 HOURS AGO

    “The Counterfactual Quiet AGI Timeline” by Davidmanheim

    Worldbuilding is critical for understanding the world and how the future could go, but it's also useful for understanding counterfactuals better. With that in mind, when people talk about counterfactuals in AI development, they seem to assume that safety would always have been a focus. That is, there's a thread of thought that blames Yudkowsky and/or Effective Altruists for bootstrapping AI development; 1, 2, 3. But I think this misses the actual impact of Deepmind, OpenAI, and the initial safety focus of the key firms, which accelerated progress, though that's not all they did. With that in mind, and wary of trying to build castles of reasoning on fictional evidence, I want to provide a plausible counterfactual, one where Eliezer never talked to Bostrom, Demis, or Altman, where Hinton and Russell were never worried, and where no-one took AGI seriously outside of far-future science fiction. Counterfactual: A [...]

    ---

    Outline:
    (01:04) Counterfactual: A Quiet AGI Timeline
    (02:04) Pre-2020: APIs Without Press Releases
    (03:29) 2021: Language Parroting Systems
    (05:15) 2023: The Two Markets
    (07:15) 2025: First Bad Fridays
    (11:17) 2026: Regulation by Anecdote Meets Scaling
    (15:38) 2027: The Plateau That Isn't
    (17:20) 2028: The Future
    (17:41) Learning from Fictional Evidence?

    ---

    First published: October 5th, 2025
    Source: https://www.lesswrong.com/posts/wdddpMjLCC67LsCnD/the-counterfactual-quiet-agi-timeline

    ---

    Narrated by TYPE III AUDIO.

    19 min
  5. 1 DAY AGO

    “How the NanoGPT Speedrun WR dropped by 20% in 3 months” by larry-dial

    In early 2024 Andrej Karpathy stood up an llm.c repo to train GPT-2 (124M), which took the equivalent of 45 minutes on 8xH100 GPUs to reach 3.28 cross-entropy loss. By Jan 2025, collaborators on modded-nanogpt had brought that time down to 3 minutes. It sat near 3 minutes until July 2025, with a large swath of optimizations already applied: RoPE, value embeddings, reduce-scatter grad updates, Muon, QK Norm, ReLU^2, a custom FP8 head, skip connections, flex attention, short-long windows, attention window warmup, linear lr cooldown, and more. Yet in the last 3 months the record has fallen by another 20%, to 2 minutes and 20 seconds. Many of the improvements behind that last 20% have not yet been published outside of the modded-nanogpt repo. This post summarizes those improvements. Not everything will generalize to larger scales, but there are some core concepts that I believe are promising. Improvements [...]

    ---

    Outline:
    (02:02) ML Improvements
    (02:06) #1: Document Alignment
    (03:26) #2: Dynamic Attention Window Management by Layer
    (04:50) #3: Heterogeneous Batch Sizes
    (05:31) #4: Backout: Enabling a model to back out context for predictions
    (06:18) #5: Polar Express
    (06:36) #6: Smear Module
    (07:20) #7: Sparse Attention Gate
    (07:58) #8: More Bfloat16
    (08:36) #9: Softmax Skip Gate
    (09:04) #10: Drop MLP Layer
    (09:19) Engineering Improvements
    (09:23) #1: Flash Attention 3
    (09:50) #2: Parameter reshaping for shared reduce scatter
    (11:15) #3: Async Data Fetch and Index
    (11:44) #4: Vectorized Optimizer Step
    (12:03) #5: Triton Kernel for Symmetric Matmul
    (12:41) #6: Resize Lambda Parameters
    (13:36) Takeaways from the Journey

    The original text contained 4 footnotes which were omitted from this narration.

    ---

    First published: October 5th, 2025
    Source: https://www.lesswrong.com/posts/j3gp8tebQiFJqzBgg/how-the-nanogpt-speedrun-wr-dropped-by-20-in-3-months

    ---

    Narrated by TYPE III AUDIO.
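    To make two of the named ingredients concrete, here is a minimal PyTorch sketch of QK Norm (RMS-normalising queries and keys before attention) and the ReLU^2 MLP activation, as those techniques are generally described in the literature; it's an illustrative reconstruction under those assumptions, not code from the modded-nanogpt repo.

    ```python
    import torch
    import torch.nn.functional as F

    def rms_norm(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
        """Root-mean-square normalisation along the last dimension."""
        return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)

    def qk_norm_attention(q, k, v):
        """QK Norm: RMS-normalise queries and keys before the dot product,
        which bounds the attention logits and stabilises training.
        Shapes: (batch, heads, seq, head_dim)."""
        q, k = rms_norm(q), rms_norm(k)
        return F.scaled_dot_product_attention(q, k, v, is_causal=True)

    def relu_squared_mlp(x, w_in, w_out):
        """ReLU^2 MLP: relu(x)**2 in place of GELU; cheap to compute and
        empirically strong at speedrun scale."""
        return F.relu(x @ w_in).square() @ w_out
    ```

    For example, with q = k = v = torch.randn(1, 4, 128, 64), qk_norm_attention(q, k, v) returns a (1, 4, 128, 64) tensor of attention outputs.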

    17 min
