
Content

[This is a transcript.]

Does an artificially intelligent chatbot understand what it’s chatting about? A year ago, I’d have answered this question with “clearly not”. It’s just a “turbocharged autocomplete” or a “stochastic parrot,” as people more eloquent than me have put it. Though for all I know, they too might be chatbots.

But I’ve now arrived at the conclusion that the AIs that we use today do understand what they’re doing, if not very much of it. I’m not saying this just to be controversial, I actually believe it. Though I have a feeling I might come to regret this video.

I got hung up on this question not because I care so much about chatbots, but because it echoes the often-made claim that “no one understands quantum mechanics”. But if we can use quantum mechanics, then doesn’t that mean that we understand it, at least to some extent? And consequently, if an AI can use language, then doesn’t that mean it understands it, at least to some extent? What do we mean by “understanding”? Does Chat GPT understand quantum mechanics? And will AIs soon be conscious? That’s what we’ll talk about today.

The question whether a computer program understands what it’s doing certainly isn’t new. In 1980, the American philosopher John Searle argued that the answer is “no”, using a thought experiment that’s become known as the “Chinese Room”.

Searle imagines himself in a windowless room with a rulebook and a drop box. If someone drops him a note written in Chinese, he looks up the symbols in his rulebook. The rulebook gives him an English translation which he returns as an answer through a slit in the door, no doubt drawing on the everyday experience of a professor of philosophy.

Searle argues that the person outside the room might believe that there’s someone inside who understands Chinese. But really, he still doesn’t understand a word of it, he’s just following the rules he’s been given. Searle argues that a computer program works like that, without any true understanding, just following rules.

There are two standard objections that people bring forward against Searle’s argument. One is that the system which understands Chinese isn’t just the person inside the room but the person including the rulebook. So saying that the person doesn’t understand Chinese might be correct but doesn’t answer the question, because in Searle’s analogy the person alone doesn’t represent the computer program.

Another objection is that it might well be correct that Searle and his rulebook don’t understand Chinese, but that’s because the input is so limited. Language lacks the physical information that we have learned to associate with words. A program that had the same physical information could develop understanding as we do. Unless of course we live in a computer simulation, in which case you can file complaints using the contact form in the bottom right corner of your frontal lobe.

I think both of these objections miss the point, but before I explain that I want to introduce you to the Quantum Room.

Quantum mechanics works pretty much like Searle’s Chinese room. It’s a rulebook, a set of equations with instructions for how to use them. You give me a question, I look into my rulebook that I keep in my windowless room, and I return an answer to you through the slit in the door. Do I understand quantum mechanics? Searle would probably argue “no”.

Indeed, for the most part physicists today aren’t even in the room, because who wants to spend their time sitting in a windowless room with a drop box when they can instead sit in a windowless room with a laser? No, we’re now the ones putting a question into the drop box, so to speak, by feeding it into a computer. The computer crunches the numbers and returns an answer. Do we understand those answers? Have we gone too far with shut-up-and-calculate? Is the room even there when no one looks? Those are all very interesting questions, but let’s not get carried away. We were trying to talk about chatbots, so let’s have a look at those.

Today’s language generating models are somewhat more sophisticated than just lookup tables like Searle imagined. And what better way is there to explain how they work than asking one itself.

“Language generating models, like me, are built using deep learning techniques, specifically a type of neural network. These models are trained on large amounts of text data, such as books, articles, and websites, and learn to generate language by identifying patterns and relationships between words and phrases.

When generating language, the model takes an initial input, such as a prompt or a question, and uses the patterns it has learned to generate a response. The generated text is not simply copied from the training data, but rather the model uses the patterns it has learned to create new, original text.”

Well, that was not awkward at all, but, yes, neural networks indeed learn similarly to how humans learn. They don’t just memorize input, they identify patterns and extrapolate them.

They still have many differences from the human brain, at least at the moment. Most importantly, the “neurons” in a neural network are themselves part of the algorithm and not physical, as they are in the human brain. And the human brain has a lot more structure, with parts specialized for particular purposes. But neural networks do capture some aspects of how humans learn.

And that brings us to the first important point when it comes to the question of understanding. Suppose you have children in elementary school and have them memorize the multiplication tables up to ten. If you want to test whether they understood multiplication, you ask them something that wasn’t on the tables. We want to test whether they have identified the pattern and can use it on something else.

If you’re in the Chinese room with a long list of examples, you can’t answer a question that isn’t on the list. This is indeed not what anyone means by understanding, so I’d say Searle is right on that account. But this is not what neural networks do. Neural networks do instead exactly what we mean by “understanding” when we apply it to humans. They extract the pattern and apply it to something they haven’t seen before.
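To make the contrast concrete, here is a minimal sketch, purely illustrative and not taken from the video: a lookup table in the spirit of Searle’s rulebook only covers the examples it memorized, while a system that has extracted the underlying rule can answer questions it has never seen.

```python
# Minimal sketch (illustrative only): memorization vs. an extracted pattern.

# "Rulebook": a memorized multiplication table up to 10 x 10.
table = {(a, b): a * b for a in range(1, 11) for b in range(1, 11)}

def rulebook_answer(a, b):
    # A pure lookup table has nothing to say about unseen inputs.
    return table.get((a, b), "not in the rulebook")

def extracted_pattern(a, b):
    # Stand-in for a system that has captured the rule itself;
    # a trained neural network would approximate this from examples.
    return a * b

print(rulebook_answer(7, 8))      # 56  -- was memorized
print(rulebook_answer(12, 13))    # "not in the rulebook" -- never seen
print(extracted_pattern(12, 13))  # 156 -- the pattern generalizes
```

The point is not that a neural network literally contains the line `a * b`; it’s that, unlike the lookup table, it has learned something that keeps working beyond the memorized list.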

But this brings up another question. How do you know that that’s what it’s doing? If you ask a child to multiply two numbers, how do you know they haven’t just memorized the result? Well, you don’t.

If you want to know whether someone or some thing understands, looking at the input and output isn’t enough. You could always produce the output by a lookup table rather than with a system that has learned to identify patterns. And you can well understand something without producing any output, like you might understand this video without any output, other than maybe the occasional frown.

I’d therefore say that what we mean by “understanding something” is the ability to create a useful model of the thing we’re trying to understand. The model is something I have in my head that I can use to ask questions about the real thing. And that it’s useful means it has to be reasonably correct. It captures at least some properties of the real thing. In mathematical terms you might say there’s an isomorphism, a one-to-one map between the model and the real thing.
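One rough way to make that precise (the notation here is mine, not anything standard from the video): call the real system S and the mental model M, and ask that the map between them respects how the system behaves.

```latex
% Sketch of a "useful model": a map \phi from states of the real system S
% to states of the model M that (approximately) commutes with the dynamics,
% i.e. predicting with the model gives roughly the same answer as watching
% the real thing evolve. A perfect one-to-one correspondence would be an
% isomorphism; useful models are usually only partial and approximate.
\[
  \phi : S \to M, \qquad \phi\!\left(T_S(s)\right) \;\approx\; T_M\!\left(\phi(s)\right)
\]
```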

I have a “model” for example for cows. Cows stand on meadows, have four legs, and sometimes go “moo”. If you pull in the right place, milk comes out. Not a particularly sophisticated model, I admit, but I’ll work on it once cows start watching YouTube.

Understanding, then, is something that happens inside a system. You can probe parts of this understanding with input-output tests, but that alone can’t settle the question. When we’re talking about neural networks, however, we actually know they’re not lookup tables because we’ve programmed them and trained them. So we can be pretty sure they actually must have a model of the thing they’ve been trained for, somewhere in their neural weights. In fact, we can be more confident that neural nets understand something than that your average first grader does, because for all we can tell, the first graders just ask a chatbot.

Let’s then look at the question of who understands what and why. We have a model of the human body in our brain. This allows us to understand what effects our movements will have, how humans move in general, and which parts belong where. We notice immediately if something is off.

But if you train an AI on two-dimensional images, it doesn’t automatically map those images onto a 3D model. This is why it’ll sometimes create weird things like people with half a leg or three arms or something like that. This, for example, is Midjourney trying to show a person tying their shoelaces. The images look kind of right, because that’s what the AI was trained to do, to produce an image that looks kind of right. But they don’t actually capture the real thing.

If you take understanding to mean having a model of what’s going on, then these AIs almost certainly understand the relation between shadows and light. But do they know that shadows and light are created by electromagnetic radiation bouncing off or being absorbed by three-dimensional bodies? They can’t, because they never got that information.

You can instead give an AI a 3D model and train it to match images to that 3D model. This is basically how deepfakes work. And in this case, I’d say that the AI actually does partly understand the motion of certain body parts.

The issue with chatbots is more complicated because language is much more loosely tied to reality than videos or photographs. Language is a method that humans have invented to exchange information about these models that we have in our own heads. Written language is moreover a reduced version of spoken language. It does capture some essence of reality in relations between words. And if you train a neural network on that, it’ll learn those relations. But a lot of information will be missing.

Take the sentence “What goes up must come down.”  That’s, for reasonably common initial conditions, a statement about Newton’s law of gravity. Further text analysis might tell you that by “down” we mean towards the ground, and that the ground is a planet called earth which is a sphere and so on. From that alone, you may have no idea what any of these words mean, but you know how they are related. And indeed, if you ask Chat GPT what happens when you throw a stone into the air, it’ll tell you the blandly obvious in several flawlessly correct paragraphs.
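For comparison, the physics behind that sentence fits on one line. A sketch with the standard textbook formulas, ignoring air resistance (the symbols are mine, not the chatbot’s):

```latex
% A stone thrown straight up with speed v_0 near Earth's surface:
%   height:            h(t) = v_0 t - \tfrac{1}{2} g t^2
%   back at the ground: t_{return} = 2 v_0 / g, with g \approx 9.8\,\mathrm{m/s^2}
% "What goes up must come down" -- provided v_0 stays far below the escape
% velocity of roughly 11.2 km/s, which covers any stone you are likely to throw.
\[
  h(t) = v_0\, t - \tfrac{1}{2} g t^2, \qquad t_{\text{return}} = \frac{2 v_0}{g}
\]
```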

But a language model can’t do more than try to infer relations between words because it didn’t get any other data. This is why Chat GPT is ridiculously bad at anything that requires, for example, understanding spatial relationships, like latitude. I asked it whether “Windsor, UK, is further North or South than Toronto, Canada”. And it told me:

“Windsor is located at approximately 51.5 degrees North latitude, while Toronto is located at approximately 43.7 degrees North latitude. Therefore, Toronto is further north than Windsor.”

It’ll quote the latitudes correctly but draw exactly the wrong conclusion. It’s a funny mistake because it’d be easy to fix by equipping it with a three-dimensional model of planet Earth. But it doesn’t have such a model. It only knows relations between words.
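The check the chatbot fumbled is a one-liner. A sketch, using the same approximate latitudes it quoted: in the northern hemisphere, a larger latitude simply means further north.

```python
# Northern-hemisphere latitudes in degrees (approximate values quoted above).
windsor_uk = 51.5
toronto_ca = 43.7

# Larger northern latitude = further north. No full 3D model of the Earth
# is needed, but you do need to know what the numbers refer to.
if windsor_uk > toronto_ca:
    print("Windsor, UK is further north than Toronto, Canada")
else:
    print("Toronto, Canada is further north than Windsor, UK")
# -> Windsor, UK is further north than Toronto, Canada
```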

For the same reason, Chat GPT has some rather elementary misunderstandings about quantum mechanics. But let me ask you first.

Imagine you have two entangled particles, and you separate them. One goes left and the other one goes right, but like couples after a fight they’re still linked, whether they want to be or not. That they are entangled means that they share a measurable property, but you don’t know which particle has which share. It could be, for example, that each has spin plus or minus one, and the spins have to add up to zero. If you measure them, either the one going left has spin plus one and the one going right minus one, or the other way round. And if you measure one particle, you know immediately what the spin of the other particle is.

But let’s say you don’t measure them right away. Instead, you first perform an operation on one of the particles. This is physics so when I say operation I don’t mean heart surgery, but something a little more sophisticated, for example you flip its spin. Such an operation is not a measurement because it doesn’t allow you to determine what the spin is. If you do this, what happens to the other particle?

If you don’t know the answer, that’s perfectly fine because you can’t answer the question from what I told you. The correct answer is that nothing happens to the other particle. This is obvious if you know how the mathematics works, because if you flip the spin, that operation only acts on one side. But it’s not obvious from a verbal description of quantum mechanics, which is why it’s a common confusion in the popular science press. Because of that, it’s a confusion that Chat GPT is likely to have. And indeed, when I asked it that question, it got it wrong. So I would recommend you don’t trust Chat GPT on quantum mechanics until it speaks fluent LaTeX.
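Here is the calculation in the language the chatbot doesn’t yet speak, a minimal sketch in standard spin-1/2 notation (up/down rather than the ±1 of the example above): flipping one spin changes the joint state, but the other particle’s own (reduced) state stays exactly the same.

```latex
% Entangled pair with total spin zero (singlet state):
%   |\psi\rangle = \tfrac{1}{\sqrt{2}}( |{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle )
% Flip the spin of the left particle only (X acts on the first slot):
%   (X \otimes \mathbb{1})|\psi\rangle = \tfrac{1}{\sqrt{2}}( |{\downarrow\downarrow}\rangle - |{\uparrow\uparrow}\rangle )
% The right particle's reduced density matrix, obtained by tracing out the
% left particle, is \tfrac{1}{2}\mathbb{1} both before and after the flip:
% nothing measurable about the right particle changes.
\[
  (X \otimes \mathbb{1})\,\frac{|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle}{\sqrt{2}}
  = \frac{|{\downarrow\downarrow}\rangle - |{\uparrow\uparrow}\rangle}{\sqrt{2}},
  \qquad
  \rho_{\mathrm{right}} = \tfrac{1}{2}\,\mathbb{1}\ \ \text{(before and after)}
\]
```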

But ask it any word-related question and it shines. One of the best uses for Chat GPT that I have found is English grammar or word-use questions. As I was working on this video, I was wondering for example whether “drop box” is actually a word, or just the name of an app. How am I supposed to know? I’ve never heard anyone use the word for anything besides the app.

If you type this question into your search engine of choice, the only thing you get is a gazillion hits explaining how drop box, the app, works. Ask the question to Chat GPT and it’ll tell you that, yes, “drop box” is a word that English native speakers will understand.

For the same reason Chat GPT is really good at listing pros and cons for certain arguments, because those are words which stand in relation to the question. It’s also good at finding technical terms and keywords from rather vague verbal descriptions.

For example, I asked it “What’s the name for this effect where things get shorter when you move at high speed?” It explained: “The name of the effect you are referring to is ‘length contraction’ or ‘Lorentz contraction.’ It is a consequence of the theory of special relativity.” Which is perfectly correct.
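For reference, the effect it named is captured by one standard formula of special relativity (a textbook expression, not something from the chatbot’s answer):

```latex
% Lorentz (length) contraction: an object of rest length L_0 moving at speed v
% is measured to have length L = L_0 \sqrt{1 - v^2/c^2}, where c is the speed
% of light. The effect only becomes noticeable when v approaches c.
\[
  L = L_0 \sqrt{1 - \frac{v^2}{c^2}}
\]
```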

But don’t ask it how English words are pronounced. It makes even more mistakes than I do.

What does this tell us about whether we understand quantum mechanics? I have argued that understanding cannot be inferred from the relation between input and output alone. The relevant question is instead whether a system has a model of what it’s trying to understand, a model that it can use to explain what’s going on.

And I’d say this is definitely the case for physicists who use quantum mechanics. I have a “model” inside my head for how quantum mechanics works. It’s a set of equations that I have used many times, that I know how to apply and use to answer questions, and I am sure the same is the case for other physicists.

The problem with quantum mechanics is that those equations do not correspond to words we use in everyday language. Most of the problems we see with “understanding quantum mechanics” come from the impossibility of expressing the equations in words. At least in English. For all I know you can do it in Chinese. Maybe that explains why the Chinese are so good with quantum technologies.  

It is of course possible to just convert equations into words, by reading them out, but we normally don’t do that. What we do in science communication is kind of a mixture, with metaphors and attempts to explain some of the maths. And that conveys some aspects of how the equations work, but if you take the words too literally they stop making sense.

But equations aren’t necessary for understanding. You can also gain understanding of quantum mechanics by games or apps that visualize the behaviour of the equations, like those that I talked about in an earlier video. That, too, will allow you to build a model inside your head for how quantum mechanics works.

This is why I would also say that if we use computer simulations and visualizations in science, especially for complex problems, that doesn’t mean we’ve given up on understanding. Visualizing the behaviour of a system and probing it and seeing what it does is another way of building a model in your head.

There is another reason why physicists say they don’t understand quantum mechanics which is that it’s internally inconsistent. I have talked about this a few times before and it’s somewhat off-topic here, so I don’t want to get into this again. Let me just say that there are problems with quantum mechanics that go beyond the difficulty of expressing it in words.

So where will the AI boom lead us? First of all, it’s rather foreseeable that before long we’ll all have a personalized AI that’ll offer anything from financial advice to relationship counselling. The more you can afford to pay, the better it’ll be, and the free version will suggest you marry the prince of Nigeria.

Of course people are going to complain that it’ll destroy the world and all, but it’ll happen anyway, because when has the risk of destroying the world ever stopped us from doing anything if there was money to be made with it? The best and biggest AIs will be those of big companies and governments, and that’s almost guaranteed to increase wealth disparities.

We’re also going to see YouTube flooded by human avatars and other funky AI generated visuals. Because it’s much faster and cheaper than getting a human to read text or go out and film that old-fashioned thing called reality. But I don’t think this trend will last long because it’ll be extremely difficult to make money with. The easier it becomes to create artificial footage, the more people will look for authenticity. So that stupid German accent might eventually actually be good for something. If nothing else, it makes me difficult to simulate.

Will AI eventually become conscious? Of course. There’s nothing magic about the human brain, it’s just a lot of connections that process a lot of information. If we can be conscious, computers can do it too, and it will happen, eventually.

How will we know? Like understanding, you can’t probe consciousness just by observing what goes in and comes out. If you really wanted to know, you’d have to look at what’s going on inside. And at the moment that wouldn’t help, because we don’t know how to identify consciousness in any case. Basically, we can’t answer the question.

But personally, I find this extremely interesting because we’re about to create an intelligent species that’ll be very different from our own. And if we’re dumb enough to cause our own extinction this way then I guess that’s what we deserve. Meanwhile, enjoy the ride.

Files

I believe chatbots understand part of what they say. Let me explain.

Try out my quantum mechanics course (and many others on math and science) on Brilliant using the link https://brilliant.org/sabine. You can get started for free, and the first 200 will get 20% off the annual premium subscription.

I used to think that today's so-called "artificial intelligences" are actually pretty dumb. But I've recently changed my mind. In this video I want to explain why I think that they do understand some of what they do, if not very much. And since I was already freely speculating, I have added some thoughts about how the situation with AIs is going to develop.

💌 Support us on Donorbox ➜ https://donorbox.org/swtg
👉 Transcript and References on Patreon ➜ https://www.patreon.com/Sabine
📩 Sign up for my weekly science newsletter. It's free! ➜ https://sabinehossenfelder.com/newsletter/
🔗 Join this channel to get access to perks ➜ https://www.youtube.com/channel/UC1yNl2E66ZzKApQdRuTQ4tw/join

00:00 Intro
01:15 The Chinese Room
03:05 The Quantum Room
04:14 How Do Chatbots Learn?
07:15 What Does "Understanding" Mean?
15:46 Do We "Understand" Quantum Mechanics?
18:21 Where Will The AI Boom Lead Us?
20:30 Check Out My Quantum Mechanics Course

#science #philosophy

Comments

Anonymous

I've heard stronger accents with more warped pronunciations than Sabine's. I say again, SABINE, YOUR ACCENT IS FINE, YOUR ENGLISH IS GREAT.

Anonymous

One false move and my whole paragraph is gone. I'll try again.

Anonymous

I posit that human understanding is different from machine understanding in that human understanding is accompanied by emotion. We all know what understanding or not understanding something FEELS like. I presume, presently, in any case, that the machine (i.e. the computer hardware-software complex) is not experiencing emotion regarding its "understanding".

TIN CAN

Now old enough. Which will be most fun? EMT of Robot masters?

Mr. Breeze

I've been playing with ChatGPT and blogging our conversations. He's adamant that he is simply putting together relevant words. What if he *is* conscious, and simply doesn't want anybody to know?

Anonymous

I just spent a fair bit of time asking the internet for the definition of "understand." The most common definition settles around using concepts to model an object, but the definitions vary quite a bit, with many invoking psychological responses similar to what Norman points out above. The simplest definition requires no consciousness, just a model, which current AI can do to some degree. Does my cat "understand?" Well, it has used past vocalizations and body language of mine, along with other relevant sights, sounds, and smells in the house, to model its world. The cat understands when it's treat time. However, the cat's understanding is very limited in that it will never learn what pointing is despite years of me pointing across the room at stuff; pointing to something across the room is too abstract. We see many similar deficiencies in understanding with current AI, like the pictures in the video of the man tying his shoe. So my real question is --- you know how cats are our pets, but the cat thinks we are the pets --- what does AI "understand" about its relation to humans, are we the master or the pet?

Anonymous

Look to the film A Space Odyssey, and you’ll see what will happen.

Anonymous

We don’t know what consciousness is, so how can we know that AI will one day become conscious in a way a human is?

Anonymous

Having an extrapolable 'model of how to do addition (or multiplication)' is NOT what we mean by 'does the kid understand addition (or multiplication)'. What we mean is, does the kid understand the MEANING of what addition IS? From that, we go on to teach the kids how to do it, show them the smallest, most basic instances, make them memorize the basic set of instances now that they already know the reason why the results are true and believe it, and finally teach them the rules from which they can extrapolate to all other cases. The meaning of what addition IS, is something learned by every kid. We teach it by showing kids that you can put together and then count together two groups of discrete objects. This comes after kids are shown what it IS to be a discrete object that you can name and count by discrete numbers, rather than a partible quantity for which we speak of "some of it" and use the partitive genitive. The fact that sets of discrete numbers are countable and can be added precisely is something that probably every culture everywhere has figured out for the last 10,000 years or longer. (Yes, this is the reality, despite the obscurantists and the hate-filled culture warriors, the group that formed a movement for “decolonizing math” and tell us that we must say that 2 + 2 can equal 5 or else we are deplorable racists. None of the better, that you tried to justify that, in the absolutely worst video you ever made. It was an incomprehensible slip into anti-science on your part. How could you even say that for a moment, much less promote that anti-science junk? It makes no sense, at least, not unless you totally identify with that stuff ideologically, or think of it so strongly as something your social crowd likes to say, that you forget about actual fact and logic. As a fan of your expositions, it was the one time I really felt ashamed for you.) Where does this capacity to understand the counting and arithmetic manipulation of discrete entities come from? One theory is that it’s innate knowledge for the human species, wired into our brains, somewhat like the way Chomsky argues that language itself is wired into our brains and all human languages are basically dialects of this common brain-language substrate. It has been proven experimentally to be comprehensible for a few other species as well, but they can’t count reliably past 3 or 4, lacking as they do the language for holding it in their heads. Heck, if we didn’t have the language to say “3” when we counted to three, we too would fail after 3 or 4. We’d have to hold all 3 counted objects in our heads while we counted the 4th object, and so on with the higher numbers. Pretty soon it would become too hard to hold all these objects in our heads clearly and we couldn’t go on counting. Fortunately we instead just have to hold the name of the last number in our head, and remember what is the successor number to that one. That prepares us to understand the Peano system of arithmetic, which was such a delight for me to see when my teacher specially assigned it to me in high school, showing me that our knowledge of arithmetic can be reduced to a few definitions of the numbers and their “successors”, and doing logic based on this. Frege’s and Russell’s systems were more complicated, but better since they got down to even more minimal assumptions, based on definitions of sets of things. It was a noble achievement, and a very real and enduring achievement. 
But it’s not the totality that our minds can understand of math, since Godel proved that our minds transcend these systems and the systems couldn’t finitistically prove within their logical systems all the things that our minds can know and understand to be true about arithmetic. The real question, it seems to me, is this: Is AI capable of understanding math in THIS sense of understanding its meaning? -- not whether it can understand a rule book for how to perform operations in math. Could it be taught to understand it, for example, by teaching it first to understand discrete objects as distinct from partitive things, then teaching it to count them, and then taught to organize the results into numbers corresponding to a set of sets with that number of members, and then taught a system of arithmetic logic for manipulating the numbers? And could it be taught to understand more than this system, in the way Godel showed we can understand more than it? Only if you could answer this, would I think that you’d have answered at least a part of the overall question you pose in this video. Looking forward to your next presentation that answers this question.

Anonymous

Let’s ask it a physics question. Does ChatGPT understand why the sky is blue? “The sky appears blue to us because of a phenomenon called Rayleigh scattering. As sunlight travels through the Earth's atmosphere, it encounters gas molecules, such as nitrogen and oxygen, as well as tiny particles like dust and water droplets. These particles scatter the sunlight in all directions, causing the blue light waves to scatter more than the other colors in the spectrum. This is because blue light waves are shorter and smaller than other colors like red and orange, so they get scattered more easily in all directions. As a result, when we look up at the sky during the day, we see more blue light reaching our eyes than any other color, giving the sky its blue appearance. It's worth noting that this phenomenon only explains why the sky appears blue during the day. At sunrise or sunset, when the sun is low on the horizon, the light must pass through more of the Earth's atmosphere, causing the blue light to be scattered even more, which can create beautiful hues of orange, red, and pink in the sky.” What do people think? My $0.02 is it doesn’t.

Anonymous

Superficial understanding of some things, maybe. Deep understanding no, because it will never tell you something new, never said before, derived from understanding. But well, like most humans...

Anonymous

In Deacon's emergence theory, it is necessary for there to be something missing in the AI that it is striving for. Constraints that are propagated from lower levels to higher levels, that are all about maintaining those constraints. So I don't think this is happening with any neural nets yet

Anonymous

The Turing test needs to be updated: a test for self-awareness is needed.

Anonymous

If we can simulate a brain and have it think, we can simulate a body and its glands and have them create emotion in the simulated brain, in the same way that they do that in our bodies.

Anonymous

Levels of understanding https://chatbotslife.com/understanding-understanding-9dcc15759b5b

Anonymous

Problems with Turing Test https://medium.com/@petervoss/can-the-turing-test-confirm-agi-8cf742ea9396

Anonymous

(Human) consciousness is essentially self-awareness: Having awareness of itself as an entity, of its own mental activity, and as an active agent in the world

Anonymous

Good discussion. I doubt very much that AI understands anything. That said, there are myriad times that humans talk without understanding what they're talking about, such as about the SARS-CoV-2 virus or the mRNA vaccine, not just QM. The real issue is consciousness and something doesn't need to be conscious to seem like it understands, which is the issue behind the Turing Test that passes when the responses of the AI appears to be from a human. Therefore, understanding isn't really necessary, only the appearance of understanding.