LessWrong Curated Podcast
-
- Technology
Audio version of the posts shared in the LessWrong Curated newsletter.
-
AI companies aren’t really using external evaluators
New blog: AI Lab Watch. Subscribe on Substack. Many AI safety folks think that METR is close to the labs, with ongoing relationships that grant it access to models before they are deployed. This is incorrect. METR (then called ARC Evals) did pre-deployment evaluation for GPT-4 and Claude 2 in the first half of 2023, but it seems to have had no special access since then.[1] Other model evaluators also seem to have little access before deployment. Frontier AI labs' pre-deployment risk assessment ...
-
EIS XIII: Reflections on Anthropic’s SAE Research Circa May 2024
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. Part 13 of 12 in the Engineer's Interpretability Sequence. TL;DR: On May 5, 2024, I made a set of 10 predictions about what the next sparse autoencoder (SAE) paper from Anthropic would and wouldn't do. Today's new SAE paper from Anthropic was full of brilliant experiments and interesting insights, but it ultimately underperformed my expectations. I am beginning to be concerned that Anthropic's recent approach ...
-
What’s Going on With OpenAI’s Messaging?
This is a quickly written opinion piece on what I understand about OpenAI. I first posted it to Facebook, where it drew some discussion. Some arguments that OpenAI is making, simultaneously: OpenAI will likely reach and own transformative AI (useful for attracting talent to work there). OpenAI cares a lot about safety (good for public PR and government regulations). OpenAI isn't making anything dangerous and is unlikely to do so in the future (good for public PR and government regulation...
-
Language Models Model Us
Produced as part of the MATS Winter 2023-24 program, under the mentorship of @Jessica Rumbelow. One-sentence summary: On a dataset of human-written essays, we find that gpt-3.5-turbo can accurately infer demographic information about the authors from just the essay text, and we suspect it's inferring much more. Introduction: Every time we sit down in front of an LLM like GPT-4, it starts with a blank slate. It knows nothing[1] about who we are, other than what it knows about users in general. But ...
-
Jaan Tallinn’s 2023 Philanthropy Overview
This is a link post. To follow up my philanthropic pledge from 2020, I've updated my philanthropy page with 2023 results. In 2023 my donations funded $44M worth of endpoint grants ($43.2M excluding software development and admin costs), exceeding my commitment of $23.8M (20k times $1,190.03, the minimum price of ETH in 2023). --- First published: May 20th, 2024. Source: https://www.lesswrong.com/posts/bjqDQB92iBCahXTAj/jaan-tallinn-s-2023-philanthropy-over...
-
OpenAI: Exodus
Previously: OpenAI: Facts From a Weekend, OpenAI: The Battle of the Board, OpenAI: Leaks Confirm the Story, OpenAI: Altman Returns, OpenAI: The Board Expands. Ilya Sutskever and Jan Leike have left OpenAI. This is almost exactly six months after Altman's temporary firing and The Battle of the Board, the day after the release of GPT-4o, and soon after a number of other recent safety-related OpenAI departures. Many others working on safety have also left recently. This is part of a longstanding ...