LessWrong (30+ Karma)

Audio narrations of LessWrong posts.

  1. 1 h ago

    “Adding Typos Made Haiku’s Accuracy Go Up” by bira

    We are curious whether large language models behave consistently when user prompts contain typos. To explore this, we ran a small experiment injecting typos into BigCodeBench and evaluated several Claude models under increasing noise levels. As the typo rate rose to 16%, Opus’ accuracy dropped by 9%. Surprisingly, Haiku’s accuracy increased by 22%. This post examines this unexpected “typo uplift” phenomenon and explores why noise appears to help certain models.

    Do Typos Make Haiku Try Harder?
    We first hypothesize that Haiku’s accuracy increased because harder-to-read text makes Haiku think harder. This aligns with findings in humans that hard-to-read fonts improve students’ retention, because the difficulty forces them to expend more effort. As a proxy for effort, we plotted the number of output tokens generated by both models[1]. Contrary to our hypothesis, the number of output tokens decreased as the typo rate increased. Typos don’t make models think harder. As typo rates increase, the output lengths of Haiku and Opus go down.

    The Anomaly is Haiku-Specific
    We then tested whether other small models have this typo-uplift anomaly. We found that both Haiku 3.5 and Haiku 4.5 show increased accuracy as typos increase, while other smaller models from [...]

    Outline:
    (00:54) Do Typos Make Haiku Try Harder?
    (01:34) The Anomaly is Haiku-Specific
    (02:08) The Anomaly is Benchmark-Specific
    (02:42) The Culprit
    (04:02) Takeaways for the Eval Engineer
    (04:06) Not all grading harnesses are created equal
    (04:48) Scores are lower bounds
    (05:15) Aligning the model to the eval
    (05:43) Appendix

    The original text contained 2 footnotes which were omitted from this narration.

    First published: March 16th, 2026
    Source: https://www.lesswrong.com/posts/tcic5c3BJuh3PybDZ/adding-typos-made-haiku-s-accuracy-go-up-1

    Narrated by TYPE III AUDIO.
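    For a sense of what the experiment’s noise-injection step might look like, here is a minimal sketch. The post does not specify its corruption scheme, so the transpose/substitute noise model and the function name below are assumptions for illustration only:

        import random

        def inject_typos(text: str, typo_rate: float, seed: int = 0) -> str:
            """Corrupt roughly `typo_rate` of alphabetic characters.

            Hypothetical noise model (adjacent-character transpositions and
            random substitutions); the post does not specify its exact scheme.
            """
            rng = random.Random(seed)
            chars = list(text)
            for i, c in enumerate(chars):
                if c.isalpha() and rng.random() < typo_rate:
                    if rng.random() < 0.5 and i + 1 < len(chars):
                        chars[i], chars[i + 1] = chars[i + 1], chars[i]  # transpose
                    else:
                        chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")  # substitute
            return "".join(chars)

        # Sweep noise levels as in the experiment (0% up to 16%).
        for rate in (0.0, 0.04, 0.08, 0.16):
            print(rate, inject_typos("Write a function that sorts a list.", rate))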

    7 min
  2. 2 h ago

    “LLMs as Giant Lookup-Tables of Shallow Circuits” by niplav, Claude+

    Early-2026 LLMs in scaffolds, from simple ones such as giving the model access to a scratchpad/"chain of thought" up to MCP servers, skills, and context compaction &c, are quite capable. (Obligatory meme link to the METR graph.) Yet: if someone had told me in 2019 that systems with such capability would exist in 2026, I would have strongly predicted that they would be almost uncontrollable optimizers, ruthlessly & tirelessly pursuing their goals and finding edge instantiations in everything. But they don't seem to be doing that. Current-day LLMs are just not that optimizer-y; they appear to have capable behavior without apparent agent structure.

    Discussions from the time either ruled out giant lookup-tables (Altair 2024):

        One obvious problem is that there could be a policy which is the equivalent of a giant look-up table: it's just a list of key-value pairs where the previous observation sequence is the look-up key, and it returns a next action. For any well-performing policy, there could exist a table version of it. These are clearly not of interest, and in some sense they have no "structure" at all, let alone agent structure. A way to filter out the look-up tables is [...]

    The original text contained 3 footnotes which were omitted from this narration.

    First published: March 17th, 2026
    Source: https://www.lesswrong.com/posts/a9KqqgjN8gc3Mzzkh/llms-as-giant-lookup-tables-of-shallow-circuits

    Narrated by TYPE III AUDIO.
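    A toy rendering of the look-up-table policy the quoted passage describes; the traffic-light observations and the helper's name are invented for illustration, not taken from the post:

        from typing import Dict, Tuple

        # A "policy" that is literally a table: the full observation history is
        # the key, the next action is the value. It has no internal structure,
        # no goals, no optimizer, yet for any well-performing policy a table
        # like this could reproduce its behavior on the histories it covers.
        table: Dict[Tuple[str, ...], str] = {
            ("light=red",): "wait",
            ("light=red", "light=green"): "go",
        }

        def act(history: Tuple[str, ...]) -> str:
            # Pure key-value lookup; nothing here is pursuing anything.
            return table.get(history, "noop")

        print(act(("light=red", "light=green")))  # -> go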

    13 min
  3. 11 h ago

    “Medical Roundup #7” by Zvi

    Things are relatively quiet on the AI front, so I figured it's time to check in on some other things that have been going on, including various developments at the FDA.

    Table of Contents: FDA Reformandum Est. FDA Delenda Est. IN MICE. Doctor, Doctor. Trust The Process. Cancer Screening. Autism Everywhere All At Once. Other Mental Problems Everywhere All At Once. Source Data Verification. External Review Board. Walk It Off. An Unhealthy Weight Can Be Worse Than You Realize. Our GLP-1 Price Cheap. Right To Die Should Include Right To Try.

    FDA Reformandum Est
    In lieu of plan A, how about plan B? Senator Bill Cassidy released a new report on modernizing the FDA. Alex Tabarrok approves, which means it's probably good. The FDA chief has an even better idea.

    Matthew Herper: FDA chief Marty Makary says ‘everything should be over the counter’ unless drug is unsafe or addictive [or requires monitoring].

    Annika Kim Constantino: Makary said the FDA is looking at “basic, safe” prescription drugs like nausea medications and vaginal estrogen, which is used to [...]

    Outline:
    (00:19) FDA Reformandum Est
    (01:17) FDA Delenda Est
    (14:11) IN MICE
    (15:09) Doctor, Doctor
    (15:38) Trust The Process
    (16:51) Cancer Screening
    (18:18) Autism Everywhere All At Once
    (19:25) Other Mental Problems Everywhere All At Once
    (21:26) Source Data Verification
    (26:18) External Review Board
    (26:57) Walk It Off
    (28:16) An Unhealthy Weight Can Be Worse Than You Realize
    (29:04) Our GLP-1 Price Cheap
    (30:55) Right To Die Should Include Right To Try

    First published: March 17th, 2026
    Source: https://www.lesswrong.com/posts/ypnYfPmn6FqAyxCpJ/medical-roundup-7

    Narrated by TYPE III AUDIO.

    32 min
  4. 1 day ago

    “You can’t imitation-learn how to continual-learn” by Steven Byrnes

    In this post, I’m trying to put forward a narrow, pedagogical point, one that comes up mainly when I’m arguing in favor of LLMs having limitations that human learning does not. (E.g. here, here, here.) See the bottom of the post for a list of subtexts that you should NOT read into this post, including “…therefore LLMs are dumb”, or “…therefore LLMs can’t possibly scale to superintelligence”.

    Some intuitions on how to think about “real” continual learning
    Consider an algorithm for training a Reinforcement Learning (RL) agent, like the Atari-playing Deep Q network (2013) or AlphaZero (2017), or think of within-lifetime learning in the human brain, which (I claim) is in the general class of “model-based reinforcement learning”, broadly construed. These are all real-deal, full-fledged learning algorithms: there's an algorithm for choosing the next action right now, and there's one or more update rules for permanently changing some adjustable parameters (a.k.a. weights) in the model such that its actions and/or predictions will be better in the future. And indeed, the longer you run them, the more competent they get. When we think of “continual learning”, I suggest that those are good central examples to keep in mind. Here are [...]

    Outline:
    (00:35) Some intuitions on how to think about “real” continual learning
    (04:57) Why “real” continual learning can’t be copied by an imitation learner
    (09:53) Some things that are off-topic for this post

    The original text contained 3 footnotes which were omitted from this narration.

    First published: March 16th, 2026
    Source: https://www.lesswrong.com/posts/9rCTjbJpZB4KzqhiQ/you-can-t-imitation-learn-how-to-continual-learn

    Narrated by TYPE III AUDIO.
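    The two ingredients the excerpt names (a rule for choosing the next action right now, plus an update rule that permanently changes weights) can be made concrete with tabular Q-learning. This is a generic sketch with an invented one-state toy environment and made-up constants, not code from the post:

        import random

        ACTIONS = [0, 1]
        Q = {}  # (state, action) -> value: the permanently adjustable parameters

        def choose_action(state, epsilon=0.1):
            # The "choose the next action right now" piece (epsilon-greedy).
            if random.random() < epsilon:
                return random.choice(ACTIONS)
            return max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))

        def update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
            # The update rule: permanently change the weights so that future
            # actions/predictions are better.
            best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

        # Toy loop: action 1 pays reward 1, action 0 pays 0; the longer it
        # runs, the more competent the greedy policy becomes.
        for _ in range(1000):
            a = choose_action("s")
            update("s", a, float(a == 1), "done")
        print(Q)  # Q[("s", 1)] ends near 1.0, Q[("s", 0)] near 0.0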

    11 min
  5. 1 day ago

    “PSA: Predictions markets often have very low liquidity; be careful citing them.” by Eye You

    I see people repeatedly make the mistake of referencing a very low-liquidity prediction market and using it to make a nontrivial point. Usually the implication when a market is cited is that its number should be taken somewhat seriously, that it's giving us a highly informed probability. Sometimes a market is used to analyze some event that recently occurred; reasoning here looks like "the market on outcome O was trading at X%, then event E happened and the market quickly moved to Y%, thus event E made O less/more likely."

    Who do I see make this mistake? Rationalists, both casually and, gasp, in blog posts. Scott Alexander and Zvi (and I really appreciate their work, seriously!) are guilty of this. I'll give a recent example from each of them. From Scott's Mantic Monday post on March 2:

        Having Your Own Government Try To Destroy You Is (At Least Temporarily) Good For Business

        On Friday, the Pentagon declared AI company Anthropic a “supply chain risk”, a designation never before given to an American firm. This unprecedented move was seen as an attempt to punish, maybe destroy the company. How effective was it? Anthropic isn’t publicly traded, so we [...]

    First published: March 16th, 2026
    Source: https://www.lesswrong.com/posts/SrtoF6PcbHpzcT82T/psa-predictions-markets-often-have-very-low-liquidity-be

    Narrated by TYPE III AUDIO.
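    To see why a thin market's number moves so easily, here is a sketch using the logarithmic market scoring rule (LMSR), one common automated-market-maker design. The post does not name a specific mechanism; the liquidity parameter b and the trade size below are invented for illustration:

        import math

        def lmsr_cost(q_yes: float, q_no: float, b: float) -> float:
            # LMSR cost function; b sets how much money it takes to move the price.
            return b * math.log(math.exp(q_yes / b) + math.exp(q_no / b))

        def lmsr_price(q_yes: float, q_no: float, b: float) -> float:
            # Instantaneous probability ("price") of YES.
            e_yes, e_no = math.exp(q_yes / b), math.exp(q_no / b)
            return e_yes / (e_yes + e_no)

        def buy_yes(q_yes: float, q_no: float, b: float, shares: float):
            cost = lmsr_cost(q_yes + shares, q_no, b) - lmsr_cost(q_yes, q_no, b)
            return cost, lmsr_price(q_yes + shares, q_no, b)

        # The same 10-share YES purchase against a thin (b=5) and a deep (b=100) market.
        for b in (5.0, 100.0):
            cost, new_price = buy_yes(0.0, 0.0, b, 10.0)
            print(f"b={b:g}: 50.0% -> {100 * new_price:.1f}% for a cost of {cost:.2f}")
        # The thin market jumps to ~88% while the deep one barely moves to ~52%:
        # the same small trade produces a very different "signal".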

    9 min
  6. 1 day ago

    “AICRAFT: DARPA-Funded AI Alignment Researchers — Applications Open” by Mike Vaiana, Diogo de Lucena, Judd Rosenblatt

    AICRAFT: DARPA-Funded AI Alignment Researchers — Applications Open

    TL;DR: We hypothesize that most alignment researchers have more ideas than they have engineering bandwidth to test. AICRAFT is a DARPA-funded project that pairs researchers with a fully managed professional engineering team for two-week pilot sprints, designed specifically for high-risk ideas that might otherwise go untested. We will select 6 applicants and execute a 2-week pilot with each; the most promising pilot may be given a 3-month extension. To our knowledge, this is the first MVP for engaging DARPA directly with the alignment community, and if successful it can catalyze government-scale investment in alignment R&D. Apply here. Applications close March 27, 2026 at 11 PM PST.

    What is AICRAFT?
    AICRAFT (Artificial Intelligence Control Research Amplification & Framework for Talent) is a DARPA-funded seedling project executed by AE Studio. The premise is straightforward: we hypothesize that alignment research could progress faster if the best researchers had more leverage. We believe that researchers are currently bottlenecked on either execution (i.e. they are doing the hands-on experiments themselves) or management (i.e. they are managing teams that are executing the work). Management is higher leverage, but what if we could push that much [...]

    Outline:
    (00:15) AICRAFT: DARPA-Funded AI Alignment Researchers — Applications Open
    (01:08) What is AICRAFT?
    (02:49) The Bigger Picture
    (03:56) Who should apply?
    (04:26) How it works
    (05:21) The application
    (06:11) FAQ

    First published: March 16th, 2026
    Source: https://www.lesswrong.com/posts/nmMdtZveC38atLnDm/aicraft-darpa-funded-ai-alignment-researchers-applications

    Narrated by TYPE III AUDIO.

    8 min
