LessWrong (30+ Karma)

LessWrong

Audio narrations of LessWrong posts.

  1. 1 HOUR AGO

    “Kimi K2 Thinking” by Zvi

    I previously covered Kimi K2, which now has a new thinking version. As I said at the time back in July, price in that the thinking version is coming. Is it the real deal? That depends on what level counts as the real deal. It's a good model, sir, by all accounts. But there have been fewer accounts than we would expect if it was a big deal, and it doesn’t fall into any of my use cases.

    Introducing K2 Thinking

    Kimi.ai: Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
    SOTA on HLE (44.9%) and BrowseComp (60.2%)
    Executes up to 200 – 300 sequential tool calls without human interference
    Excels in reasoning, agentic search, and coding
    256K context window

    Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling — scaling both thinking tokens and tool-calling turns. K2 Thinking is now live on http://kimi.com in chat mode, with full agentic mode coming soon. It is also accessible via API.

    API here, Tech blog here, Weights and code here. (Pliny jailbreak here.) It's got 1T parameters, and Kimi and [...]

    ---

    Outline:
    (00:34) Introducing K2 Thinking
    (02:15) Writing Quality
    (03:07) Agentic Tool Use
    (04:06) Overall
    (05:08) Are Benchmarks Being Targeted?
    (06:23) Just As Good Syndrome
    (07:02) Reactions
    (09:59) Otherwise It Has Been Strangely Quiet

    ---

    First published: November 11th, 2025
    Source: https://www.lesswrong.com/posts/SLrWSyS3FypLKyRL6/kimi-k2-thinking

    ---

    Narrated by TYPE III AUDIO.

    ---

    Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    11 min
  2. 2 HOURS AGO

    “Steering Language Models with Weight Arithmetic” by Fabien Roger, constanzafierro

    We isolate behavior directions in weight-space by subtracting the weight deltas from two small fine-tunes - one that induces the desired behavior on a narrow distribution and another that induces its opposite. We show that this direction can be used to steer model behaviors, modifying traits like sycophancy, and that it often generalizes further than activation steering. Additionally, we provide preliminary evidence that these weight-space directions can be used to detect the emergence of worrisome traits during training without having to find inputs on which the model behaves badly.

    Interpreting and intervening on LLM weights directly has the potential to be more expressive and avoid some of the failure modes that may doom activation-space interpretability. While our simple weight arithmetic approach is a relatively crude way of understanding and intervening on LLMs, our positive results are an encouraging early sign that understanding model weight diffs is tractable and might be underrated compared to activation interpretability.

    📄 Paper, 💻 Code

    Research done as part of MATS.

    Methods

    We study situations where we have access to only a very narrow distribution of positive and negative examples of the target behavior, similar to how in the future we might only be able [...]

    (A minimal illustrative sketch of the weight-difference idea appears after the episode list below.)

    ---

    Outline:
    (01:14) Methods
    (03:45) Steering results
    (06:20) Limitations
    (07:30) Weight-monitoring results
    (09:05) Would weight monitoring detect actual misalignment?
    (10:19) Future work

    ---

    First published: November 11th, 2025
    Source: https://www.lesswrong.com/posts/HYTbakdHpxfaCowYp/steering-language-models-with-weight-arithmetic

    ---

    Narrated by TYPE III AUDIO.

    ---

    Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    12 min
  3. 5 HOURS AGO

    “The problem of graceful deference” by TsviBT

    Crosspost from my blog.

    Moral deference

    Sometimes when I bring up the subject of reprogenetics, people get uncomfortable. "So you want to do eugenics?", "This is going to lead to inequality.", "Parents are going to pressure their kids.". Each of these statements does point at legitimate concerns. But also, the person is uncomfortable, and they don't necessarily engage with counterpoints. And, even if they acknowledge that their stated concern doesn't make sense, they'll still be uncomfortable—until they think of another concern to state.

    This behavior is ambiguous—I don't know what underlies the behavior in any given case. E.g. it could be that they're intent on pushing against reprogenetics regardless of the arguments they say, or it could be that they have good and true intuitions that they haven't yet explicitized. And in any case, argument and explanation is usually best. Still, I often get the impression that, fundamentally, what's actually happening in their mind is like this:

    Reprogenetics... that's genetic engineering... Other people are against that... I don't know about it / haven't thought about it / am not going to stick my neck out about it... So I'm going to [...]

    ---

    Outline:
    (00:13) Moral deference
    (02:30) Correlated failures
    (05:04) The open problem

    ---

    First published: November 11th, 2025
    Source: https://www.lesswrong.com/posts/jzy5qqRuqA9iY7Jxu/the-problem-of-graceful-deference-1

    ---

    Narrated by TYPE III AUDIO.

    8 min
  4. 13 HOURS AGO

    “How likely is dangerous AI in the short term?” by Nikola Jurkovic

    How large of a breakthrough is necessary for dangerous AI?

    In order to cause a catastrophe, an AI system would need to be very competent at agentic tasks[1]. The best metric of general agentic capabilities is METR's time horizon. The time horizon measures the length of well-specified software tasks AI systems can do, and is grounded in human baselines, which means AI performance can be closely compared to human performance.

    Causing a catastrophe[2] is very difficult. It would likely take many decades, or even centuries, of skilled human labor. Let's use one year of human labor as a lower bound on how difficult it is. This means that AI systems will need to have a time horizon of at least one work-year (2000 hours) in order to cause a catastrophe.

    Current AIs have a time horizon of 2 hours, which is 1000x lower than the time horizon necessary to cause a catastrophe. This presents a pretty large buffer. Currently, the time horizon is doubling roughly every half-year. That means that a 1000x increase would take roughly 5 years at the current rate of progress. So, in order for AI to reach a time horizon of 1 work-year within [...]

    (A back-of-the-envelope check of this extrapolation appears after the episode list below.)

    ---

    Outline:
    (00:11) How large of a breakthrough is necessary for dangerous AI?
    (02:04) AI breakthroughs of the recent past
    (02:27) Case 1: Transformers
    (03:54) Case 2: AlphaFold
    (04:30) What is the probability of 1-year time horizons in the next 6 months?
    (05:10) Narrowly superhuman AI leading to generally competent AI
    (06:27) Would we notice a massive capabilities increase?
    (07:44) Conclusion

    The original text contained 3 footnotes which were omitted from this narration.

    ---

    First published: November 11th, 2025
    Source: https://www.lesswrong.com/posts/B5xQwkmWL5wmFNZkX/how-likely-is-dangerous-ai-in-the-short-term

    ---

    Narrated by TYPE III AUDIO.

    ---

    Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    9 min
  5. 15 HOURS AGO

    “Questioning the Requirements” by habryka

    Context: Every Sunday I write a mini-essay about an operating principle of Lightcone Infrastructure that I want to remind my team about. I've been doing this for about 3 months, so we have about 12 mini essays. This is the first in a sequence I will add to daily with slightly polished versions of these essays.

    The first principle, and the one that stands before everything else, is to question the requirements. Here's how Musk describes that principle:

    Question every requirement. Each should come with the name of the person who made it. You should never accept a requirement that came from a department, such as legal ... you need to know the name of the real person who made the requirement. Then, you should question it, no matter how smart that person is. Requirements from smart people are the most dangerous, because people are less likely to question them. Always do so, even if the requirement comes from me [Musk]. Then make the requirements less dumb.

    Here's some of how I think about it: plans are made of smaller plans, inside their steps to achieve 'em. And smaller plans have lesser plans, and so ad infinitum. [...]

    ---

    First published: November 11th, 2025
    Source: https://www.lesswrong.com/posts/BECDxh5jKjcmxs7hw/questioning-the-requirements

    ---

    Narrated by TYPE III AUDIO.

    ---

    Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

    6 min
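Illustrative note for episode 2 above ("Steering Language Models with Weight Arithmetic"): the excerpt describes isolating a behavior direction in weight space by subtracting the weight deltas of two small fine-tunes and then using that direction to steer the model. The sketch below shows the idea in minimal form, assuming plain PyTorch state dicts; the function names and the steering coefficient alpha are illustrative assumptions, not the paper's actual code.

```python
import torch


def weight_direction(pos_sd, neg_sd, base_sd):
    """Behavior direction in weight space: difference of the two fine-tune deltas.

    pos_sd / neg_sd: state dicts of fine-tunes inducing the behavior and its opposite.
    base_sd: state dict of the original model. Names and structure are illustrative.
    """
    return {
        name: (pos_sd[name] - base_sd[name]) - (neg_sd[name] - base_sd[name])
        for name in base_sd
    }


def steer(base_sd, direction, alpha=1.0):
    """Add the behavior direction to the base weights, scaled by alpha."""
    return {name: base_sd[name] + alpha * direction[name] for name in base_sd}


# Hypothetical usage, assuming three architecture-compatible checkpoints exist:
# base = model.state_dict()
# d = weight_direction(pos_finetune.state_dict(), neg_finetune.state_dict(), base)
# model.load_state_dict(steer(base, d, alpha=-0.5))  # e.g. push against the trait
```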
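Illustrative note for episode 4 above ("How likely is dangerous AI in the short term?"): the excerpt extrapolates from a 2-hour time horizon to a 2000-hour (one work-year) horizon under a roughly half-year doubling time. The numbers below come from the excerpt; the script is only a back-of-the-envelope check of that arithmetic.

```python
import math

# Figures taken from the excerpt (illustrative extrapolation, not a forecast).
current_horizon_hours = 2.0    # current METR-style time horizon
target_horizon_hours = 2000.0  # ~1 work-year of skilled labor
doubling_time_years = 0.5      # horizon doubles roughly every half-year

doublings_needed = math.log2(target_horizon_hours / current_horizon_hours)
years_needed = doublings_needed * doubling_time_years

print(f"Factor needed: {target_horizon_hours / current_horizon_hours:.0f}x")
print(f"Doublings needed: {doublings_needed:.1f}")
print(f"Years at current trend: {years_needed:.1f}")
# -> about 10 doublings, i.e. roughly 5 years at the current rate, as the excerpt states.
```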

About

Audio narrations of LessWrong posts.

You Might Also Like