IAPS Podcast

Institute for AI Policy and Strategy

The Institute for AI Policy & Strategy (IAPS) works to reduce risks related to the development & deployment of frontier AI systems. We focus on AI regulations, compute governance, international governance & China, and lab governance. This feed contains audio versions of some of our outputs. Learn more and read our work at iaps.ai.

Episodes

  1. 11/06/2023

    International AI safety dialogues: Benefits, risks, and best practices

    Events that bring together stakeholders from a range of countries to talk about AI safety (henceforth "safety dialogues") are a promising way to reduce large-scale risks from advanced AI systems. The goal of this report is to help safety dialogue organizers make these events as effective as possible at reducing such risks. We first identify “best practices” for organizers, drawing on research about comparable past events, the literature on track II diplomacy, and our experience with international relations topics in AI governance. We then identify harmful outcomes that might result from safety dialogues, along with ideas for how organizers can avoid them. Finally, we provide an overview of promising AI safety interventions that have already been identified and that might be particularly fruitful to discuss during a safety dialogue.

    Outline:
    - (02:21) Best practices for organizers
    - (07:52) Harmful outcomes to avoid
    - (11:04) Interventions to discuss at safety dialogues
    - (13:03) 1. Introduction
    - (17:31) 2. Best practices for organizers
    - (17:58) Method for identifying recommendations
    - (21:43) “Best practice” recommendations
    - (22:26) Culture of the safety dialogues
    - (22:43) Make the dialogue non-partisan
    - (24:20) Promote a spirit of collaborative truth-seeking
    - (27:52) Create high-trust relationships between the participants
    - (29:32) Create high-trust relationships between the participants and facilitators
    - (30:30) Communicating about safety dialogues to outsiders
    - (30:35) Maintain confidentiality about what was said by whom
    - (31:26) Consider maintaining confidentiality about who is attending
    - (32:49) Consider publishing a readout after the dialogue
    - (34:55) Content of the event
    - (34:59) Provide inputs to encourage participants down a productive path
    - (36:32) Sometimes split participants into working groups
    - (37:20) Selecting participants to invite
    - (37:24) Choose participants who will engage constructively
    - (38:47) Consider including participants from a range of countries
    - (40:40) Consider the right level of participant “turnover” between dialogues
    - (41:30) Logistical details
    - (41:34) Choose a suitable location
    - (42:58) Reduce language barriers
    - (44:08) 3. Harmful outcomes to avoid
    - (44:45) Promoting interest in AI capabilities disproportionately, relative to AI safety
    - (47:16) Reducing the influence of safety concerns
    - (50:53) Diffusing AI capabilities insights
    - (54:12) 4. Interventions to discuss at safety dialogues
    - (56:07) Overarching AI safety plan
    - (56:25) Components of the plan
    - (59:20) Role for safety dialogues in the overarching plan
    - (01:01:11) Best practices for AI labs
    - (01:03:59) Best practices for other relevant actors
    - (01:05:51) Acknowledgements
    - (01:06:07) Appendix: Additional detail on the “strand 1” case studies
    - (01:06:13) Cases that we selected
    - (01:08:18) Cases that we did not select

    The original text contained 91 footnotes which were omitted from this narration.

    First published: October 31st, 2023
    Source: https://www.iaps.ai/research/international-ai-safety-dialogues

    1h 10m
  2. 10/19/2023

    Adapting cybersecurity frameworks to manage frontier AI risks: A defense-in-depth approach

    The complex and evolving threat landscape of frontier AI development requires a multi-layered approach to risk management (“defense-in-depth”). By reviewing cybersecurity and AI frameworks, we outline three approaches that can help identify gaps in the management of AI-related risks. First, a functional approach identifies essential categories of activities (“functions”) that a risk management approach should cover, as in the NIST Cybersecurity Framework (CSF) and AI Risk Management Framework (AI RMF). Second, a lifecycle approach instead assigns safety and security activities across the model development lifecycle, as in DevSecOps and the OECD AI lifecycle framework. Third, a threat-based approach identifies tactics, techniques, and procedures (TTPs) used by malicious actors, as in the MITRE ATT&CK and MITRE ATLAS databases. We recommend that frontier AI developers and policymakers begin by adopting the functional approach, given the existence of the NIST AI RMF and other supplementary guides, but also establish a detailed frontier AI lifecycle model and threat-based TTP databases for future use.

    Outline:
    - (00:18) Executive Summary
    - (09:23) 1 | Introduction
    - (11:34) 2 | Defense-in-depth for frontier AI systems
    - (12:07) 2.1 | Commonalities between domains implementing defense-in-depth
    - (16:30) 2.2 | Defense-in-depth in nuclear power
    - (20:20) 2.3 | Cybersecurity as a model for AI
    - (20:25) 2.3.1 | Cybersecurity defense-in-depth in the 2000s and beyond
    - (22:26) 2.3.2 | Complementary approaches to address evolving capabilities and threats
    - (27:59) 2.3.3 | Benchmarking measures to the appropriate level of risk
    - (30:55) 2.4 | Three approaches to AI defense-in-depth
    - (35:05) 3 | Functional approach
    - (37:44) 3.1 | What does this look like in cybersecurity?
    - (40:52) 3.2 | Why take a functional approach?
    - (42:00) 3.3 | Usage for frontier AI governance
    - (42:54) 3.3.1 | The NIST AI RMF
    - (44:30) 3.3.2 | Tailoring the AI RMF to frontier AI safety and security concerns
    - (48:36) 3.3.3 | Providing detailed controls
    - (51:06) 3.3.4 | Defense-in-depth using the NIST AI RMF
    - (54:00) 3.4 | Limitations and future work
    - (55:37) 4 | Lifecycle approach
    - (57:32) 4.1 | What does this look like in cybersecurity?
    - (58:24) 4.1.1 | Security Development Lifecycle (SDL) framework
    - (01:00:12) 4.1.2 | The DevSecOps framework
    - (01:02:02) 4.2 | Why take a lifecycle approach?
    - (01:04:40) 4.3 | Usage for frontier AI governance
    - (01:05:04) 4.3.1 | Existing descriptions of the AI development lifecycle
    - (01:08:55) 4.3.2 | Proposed lifecycle framework
    - (01:12:10) 4.3.3 | Discussion of proposed framework
    - (01:12:15) “Shifting left” on AI safety and security
    - (01:17:55) Deployment and post-deployment measures
    - (01:19:22) 4.4 | Limitations and future work
    - (01:21:29) 5 | Threat-based approach
    - (01:23:27) 5.1 | What does this look like in cybersecurity?
    - (01:26:11) 5.1.1 | An alternative threat-based approach: the kill chain
    - (01:27:41) 5.2 | Why take a threat-based approach?
    - (01:30:29) 5.3 | Usage for frontier AI governance
    - (01:30:34) 5.3.1 | Existing work
    - (01:34:05) 5.3.2 | Proposed threat-based approaches
    - (01:35:24) An “effect on model” approach
    - (01:37:21) An “effect on world” approach
    - (01:40:15) 5.3.3 | Application to national critical functions
    - (01:43:38) 5.4 | Limitations and future work
    - (01:46:21) 6 | Evaluating and applying the suggested frameworks
    - (01:46:34) 6.1 | Context for applying frameworks
    - (01:48:56) 6.2 | Application to existing measures
    - (01:51:59) 6.2.1 | Functional
    - (01:56:13) 6.2.2 | Lifecycle
    - (01:58:12) 7 | Conclusion
    - (01:58:37) 7.1 | Overview of Next Steps
    - (02:00:29) 7.2 | Recommendations
    - (02:01:15) Acknowledgments
    - (02:02:50) Appendix A: Relevant frameworks in nuclear reactor safety and cybersecurity
    - (02:03:14) Appendix A-1: Defense-in-depth levels in nuclear reactor safety
    - (02:04:18) Appendix A-2: Relevant cybersecurity frameworks
    - (02:04:24) Defense-in-depth frameworks
    - (02:07:11) NIST SP 800-172: Defense-in-depth against advanced persistent threats
    - (02:10:06) Appendix A-3: The NIST Cybersecurity Framework (CSF)
    - (02:12:42) Common uses of the NIST CSF
    - (02:14:26) Appendix B: NIST AI Risk Management Framework
    - (02:15:20) Appendix B-1: Govern
    - (02:20:19) Appendix B-2: Map
    - (02:25:35) Appendix B-3: Measure
    - (02:31:04) Appendix B-4: Manage

    The original text contained 123 footnotes which were omitted from this narration.

    First published: October 13th, 2023
    Source: https://www.iaps.ai/research/adapting-cybersecurity-frameworks
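    To make the threat-based approach described in this episode's summary more concrete, here is a minimal sketch (not taken from the report) of how a TTP entry for frontier AI might be structured, loosely in the spirit of the MITRE ATT&CK and MITRE ATLAS databases. The field names and example values are illustrative assumptions, not IAPS recommendations.

    ```python
    # Hypothetical sketch of a TTP (tactic, technique, procedure) record for a
    # frontier-AI threat database. All names and values are illustrative only.
    from dataclasses import dataclass, field

    @dataclass
    class TTPEntry:
        tactic: str                       # attacker goal, e.g. "model weight exfiltration"
        technique: str                    # how the goal is pursued
        procedures: list[str]             # concrete observed or hypothesized steps
        mitigations: list[str] = field(default_factory=list)  # defenses mapped to this TTP

    # Example entry a developer might record and map to layered defenses.
    example = TTPEntry(
        tactic="model weight exfiltration",
        technique="compromise of internal model registry credentials",
        procedures=["phish an ML engineer", "reuse the stolen token against the registry API"],
        mitigations=["hardware-backed MFA", "egress monitoring on weight storage"],
    )

    for m in example.mitigations:
        print(f"{example.tactic} -> mitigated by: {m}")
    ```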

    2h 35m
  3. 10/04/2023

    AI chip smuggling into China: Potential paths, quantities, and countermeasures

    This report examines the prospect of large-scale smuggling of AI chips into China. AI chip smuggling into China is already happening to a limited extent and may involve greater quantities in the future, because demand for AI chips in China is rising while the US has restricted exports of cutting-edge chips to the country. First, we describe paths such smuggling could take and estimate how many AI chips would be smuggled if China-linked actors were to aim for large-scale smuggling regimes. Second, we outline factors that affect whether and when China-linked actors would aim for large-scale smuggling. Third, we propose six measures for reducing the likelihood of large-scale smuggling.

    Outline:
    - (01:01) Short summary
    - (03:45) Longer summary
    - (17:51) How the US typically enforces export controls
    - (29:21) Pathways and feasibility of large-scale smuggling
    - (31:16) All-things-considered view
    - (34:29) Routes into China
    - (35:16) Summary table of potential reexport countries
    - (37:39) Feasibility of surreptitiously procuring AI chips for reexport
    - (38:27) Methods of obtaining AI chips
    - (43:53) Challenges of large-scale smuggling
    - (46:41) Four factors determining procurement feasibility
    - (47:25) Demand for AI chips
    - (53:01) Rule of law
    - (54:16) Geopolitical alignment
    - (58:12) Common language
    - (59:05) Feasibility of surreptitiously transporting AI chips to China
    - (01:00:20) Sea, land, and air transport
    - (01:02:40) Clearing customs
    - (01:04:27) Import/export volume
    - (01:06:21) China's sides of its borders
    - (01:07:47) Two possible smuggling regimes
    - (01:09:28) Summary tables of estimates
    - (01:12:45) Why the scenarios only concern Nvidia GPUs
    - (01:13:52) Regime 1: Many shell companies buy small quantities from distributors
    - (01:15:12) Enforcement of controls if this regime is attempted
    - (01:18:27) Estimate
    - (01:23:03) Regime 2: Few cloud provider fronts buy large quantities directly from Nvidia/OEMs
    - (01:25:50) Enforcement of controls if this regime is attempted
    - (01:29:31) Estimate
    - (01:33:42) Will China-linked actors aim for large-scale AI chip smuggling?
    - (01:35:08) AI chip smuggling today
    - (01:36:54) Drivers of AI chip smuggling
    - (01:44:24) Recommendations for US policymakers
    - (01:47:12) Chip registry
    - (01:53:13) Increasing BIS's budget
    - (01:56:45) Stronger due diligence requirements for chip exporters
    - (01:59:20) Licensing requirement for AI chip exports to key third countries
    - (02:01:48) Interagency program to secure the AI supply chain
    - (02:03:30) End-user verification programs in Southeast Asia
    - (02:05:36) Discussion
    - (02:06:25) Limitations
    - (02:11:56) Further research
    - (02:15:27) Acknowledgments

    First published: October 4th, 2023
    Source: https://www.iaps.ai/research/ai-chip-smuggling-into-china

    2h 16m
  4. 09/30/2023

    Deployment corrections: An incident response framework for frontier AI models

    A comprehensive approach to addressing catastrophic risks from AI models should cover the full model lifecycle. This paper explores contingency plans for cases where pre-deployment risk management falls short: where either very dangerous models are deployed, or deployed models become very dangerous. Informed by incident response practices from industries including cybersecurity, we describe a toolkit of deployment corrections that AI developers can use to respond to dangerous capabilities, behaviors, or use cases of AI models that emerge or are detected after deployment. We also provide a framework for AI developers to prepare and implement this toolkit. We conclude by recommending that frontier AI developers (1) maintain control over model access, (2) establish or grow dedicated teams to design and maintain processes for deployment corrections, including incident response plans, and (3) establish these deployment corrections as allowable actions with downstream users. We also recommend that frontier AI developers, standard-setting organizations, and regulators collaborate to define a standardized, industry-wide approach to the use of deployment corrections in incident response.

    Caveat: This work applies to frontier AI models that are made available through interfaces (e.g., APIs) that provide the AI developer or another upstream party a means of maintaining control over access (e.g., GPT-4 or Claude). It does not apply to management of catastrophic risk from open-source models (e.g., BLOOM or Llama-2), for which the restrictions we discuss are largely unenforceable.

    Outline:
    - (01:55) Executive Summary
    - (09:29) 1. Challenge: Some catastrophic risks may emerge post-deployment
    - (12:18) Case 1: Partial restrictions in response to user-discovered performance boost and misuse
    - (15:07) 2. Proposed intervention: Deployment corrections
    - (15:58) 2.1 Range of deployment corrections
    - (18:36) 2.2 Additional considerations on emergency shutdown
    - (20:19) Case 2: Full market removal due to improved prompt injection techniques
    - (23:48) 3. Deployment correction framework
    - (26:24) 3.0 Managing this process
    - (29:24) 3.1 Preparation
    - (39:22) 3.2 Monitoring and analysis
    - (44:55) 3.3 Execution
    - (49:15) 3.4 Recovery and follow-up
    - (53:56) Case 3: Emergency shutdown in response to hidden compute-boosting behavior by model
    - (58:05) 4. Challenges and mitigations to deployment corrections
    - (58:29) 4.1 Distinctive challenges to incident response for frontier AI
    - (58:50) 4.1.1 Threat identification
    - (01:01:46) 4.1.2 Monitoring
    - (01:06:35) 4.1.3 Incident response
    - (01:08:59) 4.2 Disincentives and shortfalls of deployment corrections
    - (01:09:23) 4.2.1 Potential harms to the AI company
    - (01:10:33) 4.2.2 Coordination problems
    - (01:13:02) 5. High-level recommendations
    - (01:15:40) 6. Future research questions
    - (01:20:23) 8. Acknowledgements
    - (01:21:27) Appendix One. Compute as a complementary node of deployment oversight
    - (01:22:24) Compute provider toolkit
    - (01:25:31) Cloud providers and open-source models

    First published: September 30th, 2023
    Source: https://www.iaps.ai/research/deployment-corrections
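    As an illustration of the core idea in this episode's summary, that API-gated deployment lets a developer apply graduated corrections after an incident, here is a minimal, hypothetical sketch. The correction levels and selection logic below are assumptions for illustration, not the framework defined in the report.

    ```python
    # Hypothetical sketch: graduated deployment corrections for an API-gated model.
    # Levels and thresholds are illustrative assumptions only.
    from enum import Enum

    class Correction(Enum):
        MONITOR_ONLY = 1         # heightened logging, no user-facing change
        RATE_LIMIT = 2           # throttle suspicious usage patterns
        RESTRICT_CAPABILITY = 3  # disable specific tools or output modes
        SUSPEND_USERS = 4        # revoke access for implicated accounts
        EMERGENCY_SHUTDOWN = 5   # withdraw the model from the API entirely

    def choose_correction(severity: int, reversible_harm: bool) -> Correction:
        """Pick a correction proportional to incident severity (1-5, illustrative scale)."""
        if severity >= 5:
            return Correction.EMERGENCY_SHUTDOWN
        if severity == 4:
            return Correction.SUSPEND_USERS if reversible_harm else Correction.EMERGENCY_SHUTDOWN
        if severity == 3:
            return Correction.RESTRICT_CAPABILITY
        if severity == 2:
            return Correction.RATE_LIMIT
        return Correction.MONITOR_ONLY

    print(choose_correction(severity=3, reversible_harm=True))  # Correction.RESTRICT_CAPABILITY
    ```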

    1h 27m
