A Medium publication sharing concepts, ideas, and codes.
79. Ryan Carey - What does your AI want?
AI safety researchers are increasingly focused on understanding what AI systems want. That may sound like an odd thing to care about: after all, aren’t we just programming AIs to want certain things by providing them with a loss function, or a number to optimize?
Well, not necessarily. It turns out that AI systems can have incentives that aren’t necessarily obvious based on their initial programming. Twitter, for example, runs a recommender system whose job is nominally to figure out what tweets you’re most likely to engage with. And while that might make you think that it should be optimizing for matching tweets to people, another way Twitter can achieve its goal is by matching people to tweets — that is, making people easier to predict, by nudging them towards simplistic and partisan views of the world. Some have argued that’s a key reason that social media has had such a divisive impact on online political discourse.
So the incentives of many current AIs already deviate from those of their programmers in important and significant ways — ways that are literally shaping society. But there’s a bigger reason they matter: as AI systems continue to develop more capabilities, inconsistencies between their incentives and our own will become more and more important. That’s why my guest for this episode, Ryan Carey, has focused much of his research on identifying and controlling the incentives of AIs. Ryan is a former medical doctor, now pursuing a PhD in machine learning and doing research on AI safety at Oxford University’s Future of Humanity Institute.
78. Melanie Mitchell - Existential risk from AI: A skeptical perspective
As AI systems have become more powerful, an increasing number of people have been raising the alarm about its potential long-term risks. As we’ve covered on the podcast before, many now argue that those risks could even extend to the annihilation of our species by superhuman AI systems that are slightly misaligned with human values.
There’s no shortage of authors, researchers and technologists who take this risk seriously — and they include prominent figures like Eliezer Yudkowsky, Elon Musk, Bill Gates, Stuart Russell and Nick Bostrom. And while I think the arguments for existential risk from AI are sound, and aren’t widely enough understood, I also think that it’s important to explore more skeptical perspectives.
Melanie Mitchell is a prominent and important voice on the skeptical side of this argument, and she was kind enough to join me for this episode of the podcast. Melanie is the Davis Professor of complexity at the Santa Fe Institute, a Professor of computer science at Portland State University, and the author of Artificial Intelligence: a Guide for Thinking Humans — a book in which she explores arguments for AI existential risk through a critical lens. She’s an active player in the existential risk conversation, and recently participated in a high-profile debate with Stuart Russell, arguing against his AI risk position.
77. Josh Fairfield - AI advances, but can the law keep up?
Powered by Moore’s law, and a cluster of related trends, technology has been improving at an exponential pace across many sectors. AI capabilities in particular have been growing at a dizzying pace, and it seems like every year brings us new breakthroughs that would have been unimaginable just a decade ago. GPT-3, AlphaFold and DALL-E were developed in the last 12 months — and all of this in a context where the leading machine learning model has been increasing in size tenfold every year for the last decade.
To many, there’s a sharp contrast between the breakneck pace of these advances and the rate at which the laws that govern technologies like AI evolves. Our legal systems are chock full of outdated laws, and politicians and regulators often seem almost comically behind the technological curve. But while there’s no question that regulators face an uphill battle in trying to keep up with a rapidly changing tech landscape, my guest today thinks they have a good shot of doing so — as long as they start to think about the law a bit differently.
His name is Josh Fairfield, and he’s a law and technology scholar and former director of R&D at pioneering edtech company Rosetta Stone. Josh has consulted with U.S. government agencies, including the White House Office of Technology and the Homeland Security Privacy Office, and literally wrote a book about the strategies policymakers can use to keep up with tech like AI.
76. Stuart Armstrong - AI: Humanity's Endgame?
Paradoxically, it may be easier to predict the far future of humanity than to predict our near future.
The next fad, the next Netflix special, the next President — all are nearly impossible to anticipate. That’s because they depend on so many trivial factors: the next fad could be triggered by a viral video someone filmed on a whim, and well, the same could be true of the next Netflix special or President for that matter.
But when it comes to predicting the far future of humanity, we might oddly be on more solid ground. That’s not to say predictions can be made with confidence, but at least they can be made based on economic analysis and first principles reasoning. And most of that analysis and reasoning points to one of two scenarios: we either attain heights we’ve never imagined as a species, or everything we care about gets wiped out in a cosmic scale catastrophe.
Few people have spent more time thinking about the possible endgame of human civilization as my guest for this episode of the podcast, Stuart Armstrong. Stuart is a Research Fellow at Oxford University’s Future of Humanity Institute, where he studies the various existential risks that face our species, focusing most of his work specifically on risks from AI. Stuart is a fascinating and well-rounded thinker with a fresh perspective to share on just about everything you could imagine, and I highly recommend giving the episode a listen.
75. Georg Northoff - Consciousness and AI
For the past decade, progress in AI has mostly been driven by deep learning — a field of research that draws inspiration directly from the structure and function of the human brain. By drawing an analogy between brains and computers, we’ve been able to build computer vision, natural language and other predictive systems that would have been inconceivable just ten years ago.
But analogies work two ways. Now that we have self-driving cars and AI systems that regularly outperform humans at increasingly complex tasks, some are wondering whether reversing the usual approach — and drawing inspiration from AI to inform out approach to neuroscience — might be a promising strategy. This more mathematical approach to neuroscience is exactly what today’s guest, Georg Nortoff, is working on. Georg is a professor of neuroscience, psychiatry, and philosophy at the University of Ottawa, and as part of his work developing a more mathematical foundation for neuroscience, he’s explored a unique and intriguing theory of consciousness that he thinks might serve as a useful framework for developing more advanced AI systems that will benefit human beings.
74. Ethan Perez - Making AI safe through debate
Most AI researchers are confident that we will one day create superintelligent systems — machines that can significantly outperform humans across a wide variety of tasks.
If this ends up happening, it will pose some potentially serious problems. Specifically: if a system is superintelligent, how can we maintain control over it? That’s the core of the AI alignment problem — the problem of aligning advanced AI systems with human values.
A full solution to the alignment problem will have to involve at least two things. First, we’ll have to know exactly what we want superintelligent systems to do, and make sure they don’t misinterpret us when we ask them to do it (the “outer alignment” problem). But second, we’ll have to make sure that those systems are genuinely trying to optimize for what we’ve asked them to do, and that they aren’t trying to deceive us (the “inner alignment” problem).
Creating systems that are inner-aligned and superintelligent might seem like different problems — and many think that they are. But in the last few years, AI researchers have been exploring a new family of strategies that some hope will allow us to achieve both superintelligence and inner alignment at the same time. Today’s guest, Ethan Perez, is using these approaches to build language models that he hopes will form an important part of the superintelligent systems of the future. Ethan has done frontier research at Google, Facebook, and MILA, and is now working full-time on developing learning systems with generalization abilities that could one day exceed those of human beings.