LessWrong (30+ Karma)

Audio narrations of LessWrong posts.

  1. 2 HOURS AGO

    “Base64Bench: How good are LLMs at base64, and why care about it?” by richbc

    This was a quick side-project produced during the MATS Research 8.1 extension. It's related to my group's main thread of work on black-box scheming monitoring through the connections to monitoring I explore below, but was time-boxed and pursued independently because I thought it was interesting!

    Executive Summary

    Figure 1. Accuracy vs. similarity threshold (0.95+) across 1700 pairs of encoding/decoding examples spanning a variety of datatypes and lengths. Accuracy is the proportion of the 3400 examples each model translated successfully (directly, with no reasoning or tools). Success on each task is defined by the normalised Levenshtein similarity of the answer/target pair reaching a given threshold, with the additional scoring requirement that model-encoded strings be decodable. Legend ordered by accuracy@1.0.

    Introducing Base64Bench: a simple new benchmark for evaluating models on their ability to encode and decode base64. Base64 encoding and decoding are reasonably complex computational tasks to do perfectly [...]

    ---

    Outline:
    (00:31) Executive Summary
    (03:07) An accidental (and surprising) discovery
    (08:03) Have LLMs actually learned the algorithm?
    (09:39) Introducing
    (13:11) Accuracy vs. similarity threshold
    (16:02) Encoding vs. decoding by model
    (17:00) Task-level breakdown
    (19:37) Why should we care?
    (21:26) Monitoring implications
    (23:51) Conclusion
    (25:23) Appendix
    (25:26) Zoomed-in threshold sweeps

    The original text contained 8 footnotes which were omitted from this narration.

    ---

    First published: October 5th, 2025
    Source: https://www.lesswrong.com/posts/5F6ncBfjh2Bxnm6CJ/base64bench-how-good-are-llms-at-base64-and-why-care-about

    ---

    Narrated by TYPE III AUDIO.
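    The scoring rule described above is easy to make concrete. Below is a minimal sketch of threshold-based scoring using normalised Levenshtein similarity; this is an illustrative reconstruction under stated assumptions, not the benchmark's actual code, and the function names are hypothetical.

    ```python
    import base64

    def levenshtein(a: str, b: str) -> int:
        """Classic dynamic-programming edit distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def normalised_similarity(answer: str, target: str) -> float:
        """1.0 for an exact match, approaching 0.0 as strings diverge."""
        if not answer and not target:
            return 1.0
        return 1.0 - levenshtein(answer, target) / max(len(answer), len(target))

    def passes(model_output: str, target: str, threshold: float,
               is_encoding_task: bool) -> bool:
        """A task passes if similarity reaches the threshold; for encoding
        tasks the output must additionally be decodable base64 (the extra
        scoring requirement mentioned in the summary)."""
        if is_encoding_task:
            try:
                base64.b64decode(model_output, validate=True)
            except Exception:
                return False
        return normalised_similarity(model_output, target) >= threshold
    ```

    Under this sketch, accuracy@1.0 in Figure 1 is simply the fraction of all tasks for which the check passes at threshold 1.0, i.e. exact matches.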

    26 min
  2. 20 HOURS AGO

    “Sora and The Big Bright Screen Slop Machine” by Zvi

    OpenAI gave us two very different Sora releases. Here is the official announcement. The part where they gave us a new and improved video generator? Great, love it. The part where they gave us a new social network dedicated purely to short-form AI videos? Not great, Bob. Don't be evil. OpenAI is claiming they are making their social network with an endless scroll of 10-second AI videos the Actively Good, pro-human version of The Big Bright Screen Slop Machine, one that helps you achieve your goals and can be easily customized and favors connection and so on. I am deeply skeptical. They also took a bold copyright stance, with that stance being, well, not quite 'f*** you,' but kind of close? You are welcome to start flagging individual videos. Or you can complain to them more generally about your characters and they say they can [...]

    ---

    Outline:
    (01:42) Act 1: Sora 2, The Producer of Ten Second Videos
    (03:08) Big Talk
    (06:00) Failures of Physics
    (06:45) Fun With Media Generation
    (09:19) Act 2: Copyright Confrontation
    (09:25) Imitation Learning
    (12:50) OpenAI In Copyright Infringement
    (14:25) OpenAI Gives Copyright Holders The Finger
    (22:36) State Of The Copyright Game
    (23:44) Not Safe For Pliny
    (25:46) Act 3: The Big Bright Screen Slop Machine
    (25:58) The Vibes Are Off
    (27:30) I Have An Idea, Let's Moral Panic
    (33:08) OpenAI's Own Live Look At Their New App
    (34:11) When Everyone Is Creative No One Will Be
    (36:51) Nobody Wants This
    (39:25) Okay Maybe Somebody Wants This?
    (40:22) Sora Sora What's It Good For?
    (42:31) The Vibes Are Even Worse
    (43:35) No I Mean How Is It Good For OpenAI?
    (46:15) Don't Worry It's The Good Version
    (49:03) Optimize For Long Term User Satisfaction
    (50:59) Encourage Users To Control Their Feed
    (56:20) Prioritize Creation
    (57:09) Help Users Achieve Their Long Term Goals
    (58:10) Prioritize Connection
    (58:52) Alignment Test
    (59:27) One Strike
    (01:00:50) Don't Be Evil
    (01:05:29) No Escape

    ---

    First published: October 3rd, 2025
    Source: https://www.lesswrong.com/posts/DKXa42nu2SDnWWmeW/sora-and-the-big-bright-screen-slop-machine

    ---

    Narrated by TYPE III AUDIO.

    1 hr 7 min
  3. 21 HOURS AGO

    “Making Your Pain Worse can Get You What You Want” by Logan Riggs

    I remember going to church w/ my mom where we'd need to stand while singing. I didn't really like standing that long, so I told Mom my legs were hurting so could I please sit down? She said yes. Sooo the next week, when I wanted to sit down, I intentionally tensed my knee-area so that my legs would hurt. Then I could ask my mom again and sit down while not being a dirty liar (we were in church after all). In general, people can sometimes get what they want [attention/emotional support/etc] by making their problems worse (or creating a problem when there's none there). One simple way to achieve this is to have a genuine negative reaction to something, focus on the negative reaction until it grows bigger and bigger, and then act from that emotional state when talking to people who can give [...]

    ---

    Outline:
    (01:56) Where The Pattern was Formed
    (02:50) Breaking the Pattern For Yourself
    (03:41) Those Triggered Snowflakes!
    (04:07) Realizing That You're on Your Own
    (05:15) Breaking the Pattern From the Inside

    The original text contained 3 footnotes which were omitted from this narration.

    ---

    First published: October 5th, 2025
    Source: https://www.lesswrong.com/posts/emEmipwkKvjxksohy/making-your-pain-worse-can-get-you-what-you-want

    ---

    Narrated by TYPE III AUDIO.

    6 min
  4. 21 HOURS AGO

    “The Counterfactual Quiet AGI Timeline” by Davidmanheim

    Worldbuilding is critical for understanding the world and how the future could go, but it's also useful for understanding counterfactuals better. With that in mind, when people talk about counterfactuals in AI development, they seem to assume that safety would always have been a focus. That is, there's a thread of thought that blames Yudkowsky and/or Effective Altruists for bootstrapping AI development; 1, 2, 3. But I think this misses the actual impact of Deepmind, OpenAI, and the initial safety focus of the key firms, which accelerated progress, though that's not all they did. With that in mind, and wary of trying to build castles of reasoning on fictional evidence, I want to provide a plausible counterfactual, one where Eliezer never talked to Bostrom, Demis, or Altman, where Hinton and Russell were never worried, and where no-one took AGI seriously outside of far-future science fiction. Counterfactual: A [...]

    ---

    Outline:
    (01:04) Counterfactual: A Quiet AGI Timeline
    (02:04) Pre-2020: APIs Without Press Releases
    (03:29) 2021: Language Parroting Systems
    (05:15) 2023: The Two Markets
    (07:15) 2025: First Bad Fridays
    (11:17) 2026: Regulation by Anecdote Meets Scaling
    (15:38) 2027: The Plateau That Isn't
    (17:20) 2028: The Future
    (17:41) Learning from Fictional Evidence?

    ---

    First published: October 5th, 2025
    Source: https://www.lesswrong.com/posts/wdddpMjLCC67LsCnD/the-counterfactual-quiet-agi-timeline

    ---

    Narrated by TYPE III AUDIO.

    19 min
  5. 1 DAY AGO

    “How the NanoGPT Speedrun WR dropped by 20% in 3 months” by larry-dial

    In early 2024 Andrej Karpathy stood up an llm.c repo to train GPT-2 (124M), which took the equivalent of 45 minutes on 8xH100 GPUs to reach 3.28 cross-entropy loss. By Jan 2025, collaborators on modded-nanogpt had brought that time down to 3 minutes. It sat near 3 minutes until July 2025, with a large swath of optimizations already applied: RoPE, value embeddings, reduce-scatter grad updates, Muon, QK Norm, ReLU^2, a custom FP8 head, skip connections, flex attention, short-long windows, attention window warmup, linear lr cooldown, and more. Yet in the last 3 months the record has fallen by another 20%, to 2 minutes and 20 seconds. Many of the improvements behind that last 20% have not yet been published outside of the modded-nanogpt repo. This post summarizes those improvements. Not everything will generalize to larger scales, but there are some core concepts that I believe are promising. Improvements [...]

    ---

    Outline:
    (02:02) ML Improvements
    (02:06) #1: Document Alignment
    (03:26) #2: Dynamic Attention Window Management by Layer
    (04:50) #3: Heterogeneous Batch Sizes
    (05:31) #4: Backout: Enabling a model to back out context for predictions
    (06:18) #5: Polar Express
    (06:36) #6: Smear Module
    (07:20) #7: Sparse Attention Gate
    (07:58) #8: More Bfloat16
    (08:36) #9: Softmax Skip Gate
    (09:04) #10: Drop MLP Layer
    (09:19) Engineering Improvements
    (09:23) #1: Flash Attention 3
    (09:50) #2: Parameter reshaping for shared reduce scatter
    (11:15) #3: Async Data Fetch and Index
    (11:44) #4: Vectorized Optimizer Step
    (12:03) #5: Triton Kernel for Symmetric Matmul
    (12:41) #6: Resize Lambda Parameters
    (13:36) Takeaways from the Journey

    The original text contained 4 footnotes which were omitted from this narration.

    ---

    First published: October 5th, 2025
    Source: https://www.lesswrong.com/posts/j3gp8tebQiFJqzBgg/how-the-nanogpt-speedrun-wr-dropped-by-20-in-3-months

    ---

    Narrated by TYPE III AUDIO.
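    To make two of the named ingredients concrete, here is a minimal PyTorch sketch of QK Norm (RMS-normalising queries and keys before attention) and the ReLU^2 MLP activation, as those techniques are generally described in the literature; it's an illustrative reconstruction under those assumptions, not code from the modded-nanogpt repo.

    ```python
    import torch
    import torch.nn.functional as F

    def rms_norm(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
        """Root-mean-square normalisation along the last dimension."""
        return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)

    def qk_norm_attention(q, k, v):
        """QK Norm: RMS-normalise queries and keys before the dot product,
        which bounds the attention logits and stabilises training.
        Shapes: (batch, heads, seq, head_dim)."""
        q, k = rms_norm(q), rms_norm(k)
        return F.scaled_dot_product_attention(q, k, v, is_causal=True)

    def relu_squared_mlp(x, w_in, w_out):
        """ReLU^2 MLP: relu(x)**2 in place of GELU; cheap to compute and
        empirically strong at speedrun scale."""
        return F.relu(x @ w_in).square() @ w_out
    ```

    For example, with q = k = v = torch.randn(1, 4, 128, 64), qk_norm_attention(q, k, v) returns a (1, 4, 128, 64) tensor of attention outputs.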

    17 min
