“AI companies are unlikely to make high-assurance safety cases if timelines are short” by ryan_greenblatt
One hope for keeping existential risks low is to get AI companies to (successfully) make high-assurance safety cases: structured and auditable arguments that an AI system is very unlikely to result in existential risks given how it will be deployed.[1] Concretely, once AIs are quite powerful, high-assurance safety cases would require making a thorough argument that the level of (existential) risk caused by the company is very low; perhaps they would require that the total chance of existential risk over the lifetime of the AI company[2] is less than 0.25%[3][4].
The idea of making high-assurance safety cases (once AI systems are dangerously powerful) is popular in some parts of the AI safety community and a variety of work appears to focus on this. Further, Anthropic has expressed an intention (in their RSP) to "keep risks below acceptable levels"[5] and there is a common impression that Anthropic would pause [...]
---
Outline:
(03:19) Why are companies unlikely to succeed at making high-assurance safety cases in short timelines?
(04:14) Ensuring sufficient security is very difficult
(04:55) Sufficiently mitigating scheming risk is unlikely
(09:35) Accelerating safety and security with earlier AIs seems insufficient
(11:58) Other points
(14:07) Companies likely won't unilaterally slow down if they are unable to make high-assurance safety cases
(18:26) Could coordination or government action result in high-assurance safety cases?
(19:55) What about safety cases aiming at a higher risk threshold?
(21:57) Implications and conclusions
The original text contained 20 footnotes, which were omitted from this narration.
---
First published:
January 23rd, 2025
Source:
https://www.lesswrong.com/posts/neTbrpBziAsTH5Bn7/ai-companies-are-unlikely-to-make-high-assurance-safety
---
Narrated by TYPE III AUDIO.
Information
- Show
- Updated weekly
- Published: January 24th, 2025, 10:58 UTC
- Length: 25 minutes
- Rating: suitable for children