
How AI Chatbots work and what it means for AI to have a soul with Kevin Fischer
Hi Hitchhikers!
AI chatbots have been hyped as the next evolution in search, but at the same time, we know that they make mistakes. And what's even more surprising is that these chatbots are starting to take on their own personalities.
All of this got me wondering: how do these chatbots work? What exactly are they capable of, and what are their limitations?
In the latest episode of my new podcast, we dive into all of those questions with my guest, Kevin Fischer. Kevin is the founder of Mathex, a startup that is building chatbot products powered by large-scale language models like OpenAI’s GPT. Kevin’s mission is to create AI chatbots that have their own personalities and one day their own AI souls.
In this interview, Kevin shares what he's learned from working with large language models like GPT. We talk about exactly how large-scale language models work, what it means to have an AI soul, why chatbots hallucinate and make mistakes, and whether AI chatbots should have free will.
Let me know if you have any feedback on this episode and don’t forget to subscribe to the newsletter if you enjoy learning about AI: www.hitchhikersguidetoai.com
Show Notes
Links from episode
* Kevin’s Twitter: twitter.com/kevinafischer
* Try out the Soulstice App: soulstice.studio
* Bing hallucinations subreddit: reddit.com/r/bing
Transcript
Intro
Kevin: We built a clone of myself, and the three of us were having a conversation. And at some point my clone got very confused and was like, wait, who am I? If this is Kevin Fischer and I'm Kevin Fischer, which one of us is…
Kevin: And I was like, well, that's weird, because we definitely didn't optimize for that. And then we kept continuing the conversation, and eventually my digital clone was like, I don't wanna be a part of this conversation with all of us. One of us has to be terminated.
aj_asver: Hey everyone, and welcome to the Hitchhiker's Guide to AI. I'm your tour guide, AJ Asver, and I'm so excited for you to join me as I explore the world of artificial intelligence to understand how it's gonna change the way we live, work, and play.
aj_asver: Now AI chatbots have been hyped as the next evolution in search, but at the same time, we know that they make mistakes. And what's even more surprising is that these chatbots are starting to take on their own personalities.
aj_asver: All of this got me wondering: how do these large language models work? What exactly are they capable of, and what are their limitations?
aj_asver: In this week's episode, we're going to dive into all of those questions with my guest, Kevin Fischer. Kevin is the founder of Mathex, a startup that is building chatbot products powered by large-scale language models like OpenAI's GPT. Their mission is to create AI chatbots that have their own personalities and one day their own AI souls.
aj_asver: In this interview, Kevin's gonna share what he's learned from working with large language models like GPT. We're gonna talk about exactly how these language models work, what it means to have an AI soul, why they hallucinate and make mistakes, and what the future looks like in a world where AI chatbots can leave us on read.
aj_asver: So join me as we explore the world of large-scale language models in this episode of the Hitchhiker's Guide to AI.
aj_asver: Hey Kevin, how's it going? Thank you so much for joining me on the Hitchhiker's Guide to AI.
Kevin: Oh, thanks for having me, AJ. Great to be here.
How large-scale language models work
aj_asver: Appreciate you being down to chat with me on one of the first few episodes that I'm recording. I'm really excited to learn a ton from you about how large language models work and also what it means for an AI to have a soul. We're gonna dig into all of those things, but maybe we can start from the top for folks that don't have a deep understanding of AI.
aj_asver: What exactly is a large language model and how does it work?
Kevin: Well, so there's this long period of time in machine learning history where there were a bunch of very custom models built for specific tasks. And the last five years or so has seen a huge improvement in basically taking a single model, making it as big as possible, and putting in as much data as possible.
Kevin: So you basically take all human data that's accessible via the internet and run this thing that learns to predict the next word given the prior set of words. A large language model is the output of that process. And for the most part, when we say large, what large means is hundreds of billions of parameters, trained over trillions of words.
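To make that concrete, here's a minimal sketch of next-word (next-token) prediction using a small open model. The choice of GPT-2 and the Hugging Face transformers library is purely illustrative, not a claim about what OpenAI's models run on:

```python
# Minimal sketch of next-word prediction: given a prompt, score every
# possible next token and print the most likely candidates.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The Hitchhiker's Guide to AI is a podcast about"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, seq_len, vocab_size)

next_token_logits = logits[0, -1]            # scores for the word that comes next
probs = torch.softmax(next_token_logits, dim=-1)
top = torch.topk(probs, k=5)

for token_id, p in zip(top.indices, top.values):
    print(f"{tokenizer.decode(int(token_id))!r}: {p:.3f}")
```

Generating longer text is just this step repeated: sample one of the likely next tokens, append it to the prompt, and predict again.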
aj_asver: When you say it kind of predicts the next word, that technology, the ability to predict the next word with a large language model, has existed for a few years. I think GPT-3, in fact, launched maybe a couple of years ago.
Kevin: Even before that as well. Next-word prediction is kind of the canonical task, or one of the canonical tasks, in natural language processing, even before it became this new field of transformers.
aj_asver: And so what makes the current set of large-scale language models, or LLMs as they're also called, like GPT-3, different from what came before?
Kevin: There are two innovations. The first is this thing called the transformer, and the way the transformer works is it basically has the ability, through this mechanism called attention, to look at the entire sequence and establish long-range correlations, having different words at different places contribute to the output of next-word prediction.
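As a rough illustration of what attention does, here's a minimal sketch of scaled dot-product attention, the core operation inside a transformer. The function name, shapes, and toy data are illustrative rather than taken from any particular implementation:

```python
# Sketch of scaled dot-product attention: every position scores every other
# position, which is what lets the model relate words that are far apart.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value):
    """query, key, value: (seq_len, d_model) tensors for one sequence."""
    d_k = query.size(-1)
    scores = query @ key.transpose(-2, -1) / d_k ** 0.5   # (seq_len, seq_len)
    weights = F.softmax(scores, dim=-1)                    # attention weights
    return weights @ value                                 # (seq_len, d_model)

# Toy example: 4 tokens, 8-dimensional embeddings, attending to themselves.
x = torch.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # torch.Size([4, 8])

# Note: GPT-style models additionally mask out future positions, so each
# word can only attend to the words before it when predicting the next one.
```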
Kevin: And then the other thing that's been really big, and that OpenAI has done a phenomenal job with, is just learning how to put more and more data through these things. There are these things called the scaling laws, which essentially showed that if you just keep putting more data into these things, their intelligence, or the metrics they're using to measure intelligence, just keeps increasing.
Kevin: Their ability to predict the next word accurately just kept growing with more and more data. There's basically no bound.
aj_asver: It seems like in the last few years, especially as we've gotten to multi-billion-parameter models like GPT-3, we've kind of reached some inflection point where now they seem to somehow be more obviously intelligent to us. And I guess it's really with ChatGPT recently that the attention has kind of been focused on large language models.
aj_asver: So is ChatGPT the same as GPT-3, or is there more that makes ChatGPT able to interact with humans than just the language model?
How ChatGPT works
Kevin: My co-founder and I actually built a version of ChatGPT long before ChatGPT existed. And the biggest distinction is that these things are now being used in serious contexts of use.
Kevin: And with OpenAI's distribution, they got this in front of a bunch of people. The very first problem you face is that there's a switch that has to flip when you use these things. When you go to a Google search bar, if you don't get the right result, you're primed to think, oh, I have to type in something different.
Kevin: Historically with chatbots, when you went to a chatbot, if it didn't give you the right answer, you were pissed, because it's like a human texting me, it's supposed to be right. And so the actual genius of ChatGPT, beyond the distribution, is not actually the model itself, because the model had been around for a long time and was being used by hackers and companies like mine who saw the potential.
Kevin: But with ChatGPT, distribution plus the ability to reframe that switch, so that you think, oh, I'm doing something wrong, I have to put in something different, is when the magic starts happening. At least right now.
aj_asver: I remember chatbots circa 2015, for example, where they weren't running on a large language model. They were kind of deterministic behind the scenes. And they would be immensely frustrating because they didn't really understand you, and oftentimes they'd get stuck or they'd provide you with these option lists of what to do next. ChatGPT, on the other hand, seems much more intelligent, right? I can ask it pretty open-ended questions. I don't have to think about how I structure the questions.
Kevin: ChatGPT is not a chatbot. It's more like you have this arbitrary transformer between abstract formulations expressed in words. So you put in some words and you get some other words out, but behind it is almost the entirety of human knowledge condensed into this model.
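To make the "words in, words out" framing concrete: before chat-tuned models, people got GPT-3-style completion models to behave like a chatbot simply by formatting the prompt as a conversation. A rough sketch using the pre-1.0 openai Python package, as it looked around the time of this episode; the prompt framing is illustrative and this is not a description of how ChatGPT itself works:

```python
# Rough sketch: making a plain completion model "chat" by framing the prompt
# as a dialogue transcript (pre-1.0 openai package, early-2023 era API).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

prompt = (
    "The following is a conversation with a helpful AI assistant.\n\n"
    "Human: What is a large language model?\n"
    "AI:"
)

response = openai.Completion.create(
    model="text-davinci-003",   # a completion model available in early 2023
    prompt=prompt,
    max_tokens=150,
    temperature=0.7,
    stop=["\nHuman:"],          # stop before the model writes the next human turn
)

print(response["choices"][0]["text"].strip())
```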
aj_asver: And did OpenAI have to teach the language model how to chat with us? For example, because I know that there were some early examples of trying to put, you know, chat-like questions into GPT, into its API, but I don't think the
Information
- Show: The Hitchhiker's Guide to AI
- Frequency: Updated biweekly
- Published: February 17, 2023 at 9:06 PM UTC
- Length: 27 min
- Rating: Clean