52 мин.

74. Ethan Perez - Making AI safe through debate Towards Data Science

- Технологии

Most AI researchers are confident that we will one day create superintelligent systems — machines that can significantly outperform humans across a wide variety of tasks.

If this ends up happening, it will pose some potentially serious problems. Specifically: if a system is superintelligent, how can we maintain control over it? That’s the core of the AI alignment problem — the problem of aligning advanced AI systems with human values.

A full solution to the alignment problem will have to involve at least two things. First, we’ll have to know exactly what we want superintelligent systems to do, and make sure they don’t misinterpret us when we ask them to do it (the “outer alignment” problem). But second, we’ll have to make sure that those systems are genuinely trying to optimize for what we’ve asked them to do, and that they aren’t trying to deceive us (the “inner alignment” problem).

Creating systems that are inner-aligned and superintelligent might seem like different problems — and many think that they are. But in the last few years, AI researchers have been exploring a new family of strategies that some hope will allow us to achieve both superintelligence and inner alignment at the same time. Today’s guest, Ethan Perez, is using these approaches to build language models that he hopes will form an important part of the superintelligent systems of the future. Ethan has done frontier research at Google, Facebook, and MILA, and is now working full-time on developing learning systems with generalization abilities that could one day exceed those of human beings.