LessWrong Curated Podcast
-
- Technology
Audio version of the posts shared in the LessWrong Curated newsletter.
-
AI companies aren’t really using external evaluators
New blog: AI Lab Watch. Subscribe on Substack. Many AI safety folks think that METR is close to the labs, with ongoing relationships that grant it access to models before they are deployed. This is incorrect. METR (then called ARC Evals) did pre-deployment evaluation for GPT-4 and Claude 2 in the first half of 2023, but it seems to have had no special access since then.[1] Other model evaluators also seem to have little access before deployment. Frontier AI labs' pre-deployment risk assessment ...
-
EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Part 13 of 12 in the Engineer's Interpretability Sequence. TL;DR: On May 5, 2024, I made a set of 10 predictions about what the next sparse autoencoder (SAE) paper from Anthropic would and wouldn't do. Today's new SAE paper from Anthropic was full of brilliant experiments and interesting insights, but it ultimately underperformed my expectations. I am beginning to be concerned that Anthropic's recent approach ...
-
What’s Going on With OpenAI’s Messaging?
This is a quickly written opinion piece on what I understand about OpenAI. I first posted it to Facebook, where it drew some discussion. Some arguments that OpenAI is making, simultaneously: OpenAI will likely reach and own transformative AI (useful for attracting talent to work there). OpenAI cares a lot about safety (good for public PR and government regulations). OpenAI isn't making anything dangerous and is unlikely to do so in the future (good for public PR and government regulation...
-
Language Models Model Us
Produced as part of the MATS Winter 2023-24 program, under the mentorship of @Jessica Rumbelow. One-sentence summary: On a dataset of human-written essays, we find that gpt-3.5-turbo can accurately infer demographic information about the authors from just the essay text, and we suspect it's inferring much more. Introduction: Every time we sit down in front of an LLM like GPT-4, it starts with a blank slate. It knows nothing[1] about who we are, other than what it knows about users in general. But ...
-
Jaan Tallinn’s 2023 Philanthropy Overview
This is a link post. To follow up my philanthropic pledge from 2020, I've updated my philanthropy page with 2023 results. In 2023 my donations funded $44M worth of endpoint grants ($43.2M excluding software development and admin costs), exceeding my commitment of $23.8M (20k times $1,190.03, the minimum price of ETH in 2023). --- First published: May 20th, 2024. Source: https://www.lesswrong.com/posts/bjqDQB92iBCahXTAj/jaan-tallinn-s-2023-philanthropy-over...
-
OpenAI: Exodus
Previously: OpenAI: Facts From a Weekend, OpenAI: The Battle of the Board, OpenAI: Leaks Confirm the Story, OpenAI: Altman Returns, OpenAI: The Board Expands. Ilya Sutskever and Jan Leike have left OpenAI. This is almost exactly six months after Altman's temporary firing and The Battle of the Board, the day after the release of GPT-4o, and soon after a number of other recent safety-related OpenAI departures. Many others working on safety have also left recently. This is part of a longstanding ...