LessWrong (30+ Karma)

LessWrong

Technology
Daily

Audio narrations of LessWrong posts.

51 min ago

“$1M AI x-risk grant round is live on grantmaking.ai - apply for funding, review applicants, or fund projects” by mbrooks, Mckiev

TLDR: what is the grant round?
grantmaking.ai is launching a $1 million grant round, distributing $5,000 to $50,000 per successful application to people and projects working to reduce x-risk from AI. Applications will be reviewed by Gavin Leech, Ryan Kidd, and Marcus Abramovitch. We aim to make all funding decisions by July 28th. Applications submitted by July 13th are guaranteed a priority review. You can still apply after July 13th, and we will make our best effort to review late submissions as long as funding remains. Grant applications will be mostly public, though we allow certain sensitive details to be kept private.
Even if you are not applying, we invite you to join the platform to review and comment. We have set aside $100,000 of the budget to be given to top commenters as regranting budgets, so please share your thoughts and help us pick out awesome projects!
Who are we?
grantmaking.ai was initialized by Anton Makiievskyi, who is funding this round and brought the team together, built by Matt Brooks (lead dev) and Melissa Samworth (ui/ux), and advised by Austin Chen with Manifund handling grant distribution.
Why we’re building this platform [...]
---
Outline:
(00:16) TLDR: what is the grant round?
(01:16) Who are we?
(01:35) Why we're building this platform & launching a grant round
(02:53) What is grantmaking.ai, and who is it for?
(04:13) Grant round details
---
First published: June 29th, 2026
Source: https://www.lesswrong.com/posts/hDQZZzYkcipgaZfxy/usd1m-ai-x-risk-grant-round-is-live-on-grantmaking-ai-apply
---
Narrated by TYPE III AUDIO.

6 min
1 h ago

“Third-parties should focus on scrutinising systems cards” by Cleo Nardo

By default, I expect system cards will get worse, which would be bad. Some mechanisms could improve system cards, but I expect they will be outweighed. In any case, I think third-parties should focus on scrutinising system cards; this seems like a great activity for outsiders in the current strategic landscape. I'll sketch what that could look like, and offer some recommendations.
It would be bad if system cards degraded.
It's good for the outside community to have an accurate sense of the risks, so they can respond appropriately. For example: investing more resources into cyber-hardening, or other activities for making things go well.
If labs felt pressure to evaluate the risks accurately, they'd be better incentivised to reduce them.
If the risks were high enough, and a lab communicated that, then this might prompt drastic government action.
It's very plausible that, if labs build misaligned AIs that take over, then most of the employees had a genuine but incorrect belief that the AIs wouldn't take over, based on evidence that was actually flimsy and misleading. So it's important that third-parties provide epistemic checks on the labs, and scrutinising system cards seems like a great mechanism for that.
[...]
---
Outline:
(00:36) It would be bad if system cards degraded.
(01:30) By default, I expect system cards to get worse, because...
(04:30) Some mechanisms could improve system cards.
(05:08) Third-parties should focus on scrutinising system cards.
(05:46) I'll sketch what this might look like.
(09:27) Shoddy system cards are better than no system cards.
---
First published: June 29th, 2026
Source: https://www.lesswrong.com/posts/wixbZq4zTTtEWqtfe/third-parties-should-focus-on-scrutinising-systems-cards
---
Narrated by TYPE III AUDIO.

11 min
2 h ago

“WSJ Article Claiming China Has Matched Anthropic Is Obvious Nonsense” by Zvi

The Wall Street Journal printed an outright false headline and heavily misleading story claiming this, which of course was uncritically amplified by the usual suspects. I post this now on its own so that we have a place to link to, to explain the situation.
Headline News
WSJ Headline (Obvious Nonsense): China Has Matched Anthropic in Cybersecurity, Resetting AI Race.
That. Did. Not. Happen.
The post even claims, explicitly, that Claude Opus 4.8 similarly ‘matches’ Claude Mythos, a claim which is even more obviously false. Shame upon the Wall Street Journal. I fear Gell-Mann Amnesia. If they can get something as important as this so completely wrong, what about everything else?
I am skipping over the parts that involve accurate reporting, or minor quibbles. It seems important to focus on clearly debunking the central false claims. Alas, the mistakes made here very much rhyme with mistakes being made throughout all this by the White House, and that get latched onto by certain bad actors, who have played a large part in leaving us unprepared for the Mythos Moment.
For a full understanding of GLM-5.2, which is indeed an impressive [...]
---
Outline:
(00:27) Headline News
(02:09) What Makes Mythos Special
(03:16) Going Over The Detailed Claims
(07:38) One Helpful Note
(08:18) The Overall Impression Is Extremely Wrong
(08:48) All Of This Has Happened Before And Will Happen Again
---
First published: June 29th, 2026
Source: https://www.lesswrong.com/posts/bpBYm5jiS4tpyzuDS/wsj-article-claiming-china-has-matched-anthropic-is-obvious
---
Narrated by TYPE III AUDIO.
---
Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

10 min
2 h ago

“P(doom) is a Dumb Meme” by Max Harms

Look, I'm as much of a Rationalist with a special interest in AI x-risk as anyone. But oh my god do I hate talking about "P(doom)". When it first started showing up in the wake of ChatGPT, I assumed that it was floating around variously adjacent circles of faux-intellectuals, but surely everyone in my circles could see how braindead it was... right? (This post was partially inspired by a recent conversation with Liron about Doom Debates.[1])
I guess it's time for me to focus on a place where I'm shocked that everyone else is dropping the ball.[2]
P(doom) is Hopelessly Vague
Let's start with the ambiguity. Does "doom" mean... extinction? A lot of people think so! I have personally encountered people who think catastrophic harms from AI are likely, but the risks of all humans dying are low. They're like "Sure, 99.999% of humans might die from AI, but the AI will obviously want to keep thousands of humans alive for science and potential trade with aliens and stuff, so my P(doom) is approximately 0%."
That might sound crazy. Surely you, dear reader, know exactly what "doom" means. You know, for example, which of these count as doom and [...]
---
Outline:
(00:45) P(doom) is Hopelessly Vague
(04:09) Inside Views, Outside Views, and Likelihood Ratios
(08:31) P(doom) is Fatalistic
(13:03) Counterarguments
(16:25) A Sense That More Is (Memetically) Possible
The original text contained 8 footnotes which were omitted from this narration.
---
First published: June 29th, 2026
Source: https://www.lesswrong.com/posts/6h7aAd4aw8YgCAbF6/p-doom-is-a-dumb-meme
---
Narrated by TYPE III AUDIO.

18 min
5 h ago

“GPT-5.6: The System Card” by Zvi

While we wait for a general release, the system card is the best hint as to what is going on with the new candidate for America's Next Top Model, GPT-5.6. This is only an OpenAI model card, so by my standards it's a light read. There's a lot of things that you get in an Anthropic card, that are missing in an OpenAI card.
Overall, the card gives a clear and consistent impression that GPT-5.6-Sol is a substantial improvement over GPT-5.5, but still short of Mythos. OpenAI calls it a ‘step function better’ than GPT-5.5. That seems accurate.
OpenAI: Sol is our new flagship and a step function better than GPT-5.5. Terra delivers performance competitive to GPT-5.5 at 2x lower cost. Luna is our most cost-efficient model, delivering strong capability at our lowest cost. Together, the GPT-5.6 family gives people and developers more choice in how they balance intelligence, speed, and cost.
Once available, pricing for GPT-5.6-Sol will be $5/$30, the same as GPT-5.5. Terra is $2.5/$15, Luna is $1/$6. They claim it will be on Cerebras at 750 TPS, which is insanely fast. Capacity will be limited, at least at first.
[...]
---
Outline:
(03:49) What's In A Name?
(04:26) Fix This Code
(07:08) Crossover Event Requested
(07:43) Disallowed Content (3)
(09:03) Avoiding Accidental Data-Destructive Actions (3.3)
(09:29) Are You Sure? (3.4)
(09:58) Jailbreaks (4.1)
(10:14) Prompt Injection (4.2)
(10:40) HealthBench (5.1)
(11:00) Dynamic Mental Health Adversarial User Simulations (5.2)
(12:21) Hallucinations (6)
(12:50) Isolated Misaligned Actions (7.1)
(13:10) Going Overboard (7.2)
(18:11) Chain of Thought Evaluations (7.3)
(19:18) Bias (8)
(19:27) Preparedness (9)
(20:15) Biological Risks (9.1.1)
(22:15) Cybersecurity (9.1.2)
(28:40) External Cyber Evaluation FrontierCyber from Irregular (9.1.2.5)
(30:32) Cyber Conclusions
(31:07) Recursive Self-Improvement (9.1.3)
(32:22) METR Warns Us (9.1.3.6)
(35:04) Everything Is Under Control
(37:44) Metagaming (7.4)
(40:17) Apollo Research and Sandbagging
(43:09) Safeguards (9.3)
(50:01) Better Not Call Sol Yet
The original text contained 2 footnotes which were omitted from this narration.
---
First published: June 28th, 2026
Source: https://www.lesswrong.com/posts/JFjNmPTbH8kL6xtp6/gpt-5-6-the-system-card
---
Narrated by TYPE III AUDIO.

54 min
5 h ago

“A reading list for generalists” by Dylan Bowman

I, along with many others in AI safety, believe there is a shortage of generalists in the community and that there exist many projects and efforts that by default will not happen unless they are owned by a strong generalist[1][2][3]. As someone who is a reasonably good generalist, I decided to assemble a reading list of the essays and blog posts that have personally helped me the most. I would love others to comment with pieces they think should be on this list.
The crux of this reading list is the idea that if you’re working hard as a generalist on a project you care a lot about, then by rigorously applying the lessons from these documents you will improve more quickly than you otherwise would.
By the numbers:
I’ve attached 18 documents to start this reading list.
The authors cited more than once are Paul Graham (5), Ben Kuhn (4), Ethan Perez (2), and Greg Brockman (2). Sam Altman and Eliezer Yudkowsky also have their fingerprints over a lot of the content.
The items are 15 blog posts, 1 blog comment, 1 interview transcript in blog post form, and 1 book.
Dispositional
What characteristics should you [...]
---
Outline:
(01:15) Dispositional
(01:41) Strategy
(03:09) Project leadership
(04:10) Interpersonal/organizational
The original text contained 3 footnotes which were omitted from this narration.
---
First published: June 28th, 2026
Source: https://www.lesswrong.com/posts/sH4cFDDjRdGrn3p2o/a-reading-list-for-generalists
---
Narrated by TYPE III AUDIO.

5 min
16 h ago

“What comes with cheap math?” by abramdemski

Thanks to conversations with Anson Berns, Gurkenglass, Roman Malov, Sahil, Sam Eisenstat, and others.
Over the past two months, I've been doing a lot of "vibe research" (like vibe coding, but for research). Anson Berns started coming to my office hours, and we've been collaborating on a project modeling trust between logical inductors. In addition to talking once a week, we've been exchanging raw AI chats as well as AI-generated summaries of what has been done (the raw chats are nice because they allow me to generate my own AI summaries focusing on what I'm most curious about).
I've been asking Claude to use Lean to verify everything, so there's a somewhat good chance there's real results of interest here, but I haven't (yet) been reading the Lean proofs (or even the theorem statements) -- instead I've just been chatting with AI about how the Lean proofs went and whether they really formalized what was claimed in english+latex, and focused on understanding the proofs myself in the same way I'd normally read a math paper. There have already been several times when this methodology has caught big gaps between what was claimed and what was verified in Lean, so [...]
---
First published: June 28th, 2026
Source: https://www.lesswrong.com/posts/gS5skwXeeQdStwsPu/what-comes-with-cheap-math
---
Narrated by TYPE III AUDIO.

7 min
1 d ago

“Do LLMs Have Desires?” by Christopher Ackerman

Work conducted with Yujun Zhou (yzhou25@nd.edu) and supported by SPAR
TL;DR:
In paired-choice paradigms, LLMs report consistent preferences over outcomes (e.g., types and number of lives saved, types of policies enacted)
Some have suggested that this indicates that LLMs have human-like value systems
We design an experimental framework where LLMs are able to modulate their output quality based on prompt context
We find that LLMs modulate their output quality in response to effort exhortations, role-play instructions, and harmfulness cues, but NOT to opportunities to achieve the outcomes they report preferring in the paired-choice experiments
We suggest that paired-choice paradigms do not provide evidence that LLMs have human-like (i.e., behavior-motivating) value systems, and that our paradigm offers a way to measure the degree to which LLMs have desires
Paper describing the work in detail here
LLMs report that they prefer some things to others. In paired-choice experiments, where they are repeatedly presented with two options and asked to select the one that they prefer, coherent utility structures emerge: LLMs consistently report preferring certain types of things, and their choices reveal the ability to make quantitative tradeoffs between things and exhibit transitivity (e.g., if they choose A over B and [...]
---
First published: June 28th, 2026
Source: https://www.lesswrong.com/posts/8GvYyqDuQDJnEAky3/do-llms-have-desires
---
Narrated by TYPE III AUDIO.

14 min


Created by: LessWrong
Years active: 2023 - 2026
Episodes: 250
Rating: Explicit
Podcast website: LessWrong (30+ Karma)
