What's next for generative AI? Three pioneers on their Eureka moments

Podcast transcript

This transcript has been generated using speech recognition software and may contain errors. Please check its accuracy against the audio.

Mustafa Suleyman, Co-Founder and CEO, Inflection AI: This is going to be the most transformational moment, not just in technology, but in culture and politics of all of our lifetimes.

Robin Pomeroy, host Radio Davos: Welcome to Radio Davos, the podcast from the World Economic Forum that looks at the biggest challenges and how we might solve them. This week, three pioneers of AI on where this technology will take us next.

Yann LeCun, Chief AI Scientist, Meta: Those systems will understand the physical world, which is not the case for current systems. They will be able to remember, which is not the case for current systems, and they will be able to reason and plan, which is also not the case for current systems. We'll have systems that are much smarter than they currently are and will have far ranging implications whose consequences are really hard to predict.

Robin Pomeroy: This AI pioneer describes how it feels to have a eureka moment, when you achieve a real breakthrough in technology.

Aidan Gomez, Co-Founder and CEO, Cohere: It could actually speak language, which nothing could really do before that. That was the eureka moment.

Robin Pomeroy: These experts working on the cutting edge can help the rest of us understand what generative AI really is.

Aidan Gomez: It's difficult to explain, I think. There's a lot of pieces of it that are counterintuitive, how simple it is, the setup.

Robin Pomeroy: And they tell us their hopes and fears.

Mustafa Suleyman: When we look back at history at all of the major general purpose technologies that have transformed our world, there's a very consistent characteristic, which is, to the extent that things get useful, they get cheaper, they get easier to use, and they spread far and wide, to everybody, good and quote unquote 'bad'.

Robin Pomeroy: Subscribe to Radio Davos wherever you get your podcasts, or visit wef.ch/podcasts where you will also find our sister programmes, Meet the Leader and Agenda Dialogues.

I’m Robin Pomeroy at the World Economic Forum, and with three AI pioneers…

Yann LeCun: This may be a new renaissance for humanity.

Robin Pomeroy: This is Radio Davos.

Aidan Gomez was just 20 years old when he co-authored a research paper that proposed a novel neural network technique called the transformer that would learn relationships between long strings of data.

Transformer is the T in ChatGPT - which stands for Generative Pre-trained Transformer.

Mustafa Suleyman co-founded DeepMind, an AI company acquired by Google, and he also wrote the bestseller The Coming Wave on the future of humanity shaped by AI and bioengineering.

Yann LeCun is Chief AI Scientist at Meta and a professor at New York University. He is one of a handful of people referred to as the "Godfathers of AI".

These three - all of them in Time magazine’s top-100 most powerful people in AI - were among the many AI luminaries at this year’s Annual Meeting in Davos. And in this episode they will tell us about the past, present and future of AI.

Let's start with Aidan Gomez.

Aidan Gomez: I'm Aidan Gomez. I'm one of the co-founders of Cohere, and we build large language models for enterprise.

Robin Pomeroy: You must come across people who really don't know how generative AI works. Do you have a set elevator pitch, this is how it works? Or is that impossible to do?

Aidan Gomez: It's difficult to explain, I think. There's a lot of pieces of it that are counterintuitive, how simple it is, the setup.

You basically collect a ton of data, we're talking billions of web pages from the internet, and then you teach the model to rewrite that data or to predict the next word in a sentence. And just by doing that simple task, you end up learning to do incredible things, reasoning, the ability to translate, the ability to write code. All of that data is out there on the web. And so just by training on it, getting the model to regenerate that data, to learn to create it, it's able to learn to do very complex tasks.
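As a rough illustration of the "predict the next word" setup described above, here is a toy sketch in Python. It is not how Cohere or any large language model is actually trained: real systems use neural networks (transformers) trained on billions of pages, whereas this sketch only counts which word follows which in a tiny made-up corpus. The names `train_bigram`, `predict_next` and the sample corpus are hypothetical, chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical miniature "training data" (an assumption for this sketch).
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept ."
model = train_bigram(corpus)

print(predict_next(model, "the"))
print(predict_next(model, "sat"))
```

The point of the sketch is only the shape of the task: the model sees text, and its sole job is to guess what comes next. The capabilities Gomez describes emerge when the same objective is applied at vastly larger scale with a far more expressive model.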

Robin Pomeroy: Did you have a eureka moment when you were getting into this? Was there one particular moment where you went, right, I've really found something here.

Aidan Gomez: The past, I guess it would be eight years for me, it's been a series of eureka moments. There's no one particular one.

There was a first one, and that might have been around the success of deep learning in neural networks. So this architecture that's inspired by the brain, that takes inspiration from the thing that we know works inside our head. That was probably the first moment that I felt inspired, and I felt like we were on the critical path towards something.

Robin Pomeroy: How did you know it was working? You think of the invention of TV, and they got something to appear on the screen from a distance. You're inputting something into this neural network and you're getting something out. What was it that came out the other end that you thought, oh that's working.

Aidan Gomez: Well I think, you know, early days, it was much simpler than what we have today. So it was classification. It was the ability to look at an image and know what's inside the image and tell you that.

And at some point, we had machines which were better at recognizing objects than humans were - humans would make more mistakes than the machine. And so that was a very strong proof point that we can get to the level of human performance in a specific domain. That's very narrow, right? Classifying things that are in images.

I think for this more general AI that we're seeing today, the moment came when I got this email from my collaborator who was a coauthor, Lukasz Kaiser, on the transformer paper. And it was like, look at this. It was a Wikipedia article. The title was The Transformer. And then it started talking about this Japanese punk rock band that had gotten together, broken up, the bassist had gone over here. And at the very bottom, Lukasz had written: "I just wrote 'The Transformer'. The machine wrote the rest."

And so that was like a surreal moment where you saw a machine speaking so fluently that it convinced me I was reading an article about a Japanese punk rock band. And so that was the eureka moment for large language models.

Robin Pomeroy: That band didn't exist?

Aidan Gomez: No. It had made it all up. But it was so fluent. It could actually speak language, which nothing could really do before that.

Robin Pomeroy: I think that's probably the experience that the 'civilians' have had over the last year or so with ChatGPT and other applications. Like, wow, we have not had this experience of being talked to by a machine.

Aidan Gomez: Yes. And it's way better than what I saw back then. It's much more correct, accurate, factual. And it's able to do more.

So those models that we were building, they were just good at writing fake Wikipedia pages. Now these models can do a lot of, a lot of really interesting tough stuff.

Robin Pomeroy: What is, either a tangible thing or something way in the future, what's the thing that really excites you, the thing that you'll be able to do in using gen AI? Is that something that, oh, when I can do that, it will be amazing?

Aidan Gomez: You know, for me, I think transforming productivity. It's not super hyped up and people don't talk about it a lot, but it's so impactful.

Take one example - doctors. I think they spend something like 40% of their working hours writing notes, and if we could give them 40% of their days back to spend with patients, focus on patient outcomes, it means we basically double the number of doctors effectively overnight. And humanity is so supply-constrained in these sorts of professions. If we're able to ease that supply constraint, humans live longer. They live better lives. We save lives. So I think - it's not spoken about as this hypey, exciting thing.

Robin Pomeroy: It's not science fiction is it! Productivity gains. But it's a great answer. And what's the converse of that? Is there something that really scares you about the development of GenAI?

Aidan Gomez: Yes, I think this year, you know, the world is going to see a lot of elections and I'm very concerned about what this technology can do in terms of shifting the public conversation. You can very scalably insert a bunch of bots which are completely indistinguishable from humans, but either make it look like an idea is more popular than it actually is, or skew conversations in a desired direction.

So I think that's something we all need to be vocally calling to build technology to stop.

There is technology that can help prevent this. For instance, the social media platforms, they're almost all implementing human verification. So you can now see that the person at the other end of the screen has 'verified as a human'. You know you're talking to another voting citizen of the Earth.

I think there needs to be more of that and that needs to be more aggressively pursued, because this technology does make astroturfing dramatically easier.

Robin Pomeroy: Aidan Gomez, co-founder of Cohere. We’ll hear more from him in a later episode of Radio Davos focusing on the governance of AI.

Astroturfing, by the way, is creating a fake campaign to make it look like something has widespread grass-roots support.

Now, if you are looking for a primer in AI, you could do worse than get a copy of The Coming Wave by Mustafa Suleyman. He spoke on a panel session at Davos and, as in his book, makes a strong case for the transformational power of AI.

Mustafa Suleyman: This is going to be the most transformational moment, not just in technology, but in culture and politics of all of our lifetimes.

We're going to witness the plummeting cost of power.

AI is really the ability to absorb vast amounts of information, generate new kinds of information, and take actions on that information. Just like any organisation, whether it's a government or a company or any individual, that's how we all interact in the world.

And we're commoditizing, that is like reducing the cost of producing and distributing that tool. Ultimately, it'll be widely available to everybody, potentially in open source and in other forms, and that is going to be massively destabilising.

So whichever way you look at it, there are incredible upsides. And there's also the potential to empower everybody to be able to essentially conflict in the way that we otherwise might because we have different views and opinions in the world.

One of the obvious characteristics of this new wave is that these tools are omni use - dual use doesn't really cut it anymore. They're inherently useful in so many different settings.

And actually, when we look back at history at all of the major general purpose technologies that have transformed our world, there's a very consistent characteristic, which is to the extent that things get useful, they get cheaper, they get easier to use, and they spread far and wide.

So we have to assume that that's going to be the continued destiny over the next couple of decades and manage the consequences of power, the ability to take actions, becoming cheaper and widely available to everybody, good and quote-unquote 'bad'.

Robin Pomeroy: In his book, The Coming Wave, Mustafa Suleyman talks about the Turing test - a long-standing way of testing a machine’s ability to appear to have human-level intelligence. Named after the British computer scientist Alan Turing who proposed it in 1950, the Turing test looks at whether a machine can answer questions in a way that is indistinguishable from the answers a human would give.

As that test is pretty much met by the AI chat bots now available to us all, Suleyman suggests in his book an updated Turing test would test the capability of an AI system not just to appear to be intelligent, but to achieve a tangible complex task. He suggests the following prompt: “Go make $1 million on Amazon in a few months with just a $100,000 investment”.

Here he is again, speaking on that session at Davos that was called Hard Power of AI:

Mustafa Suleyman: We've had the Turing test for over 70 years, and the goal was try and imitate human conversation and persuade a human that you are, in fact, human, not an AI.

And it turns out we're pretty close to passing that. Maybe we have in some settings, it's unclear, but it's definitely no longer a useful test.

So I think the right test, a modern Turing test, would be to try to evaluate whether an AI was capable of acting like an entrepreneur, like a mini project manager and inventor of a new product, to go and market it, manufacture it, sell it, and so on, to make a profit.

I'm pretty sure that within the next five years, certainly before the end of the decade, we are going to have not just those capabilities, but those capabilities widely available for very cheap, potentially even in open source. I think that completely changes the economy.

Robin Pomeroy: Mustafa Suleyman co-founder of Inflection AI and author of The Coming Wave. You can hear that whole conversation, which includes many other interesting people on AI, on a session at Davos on our sister programme which is called Agenda Dialogues and the session was called Hard Power of AI.

Our third wise man is Yann LeCun, a fascinating figure who often speaks out against the hype surrounding AI. So when my colleague Colm Quin sat down with him in Davos, he gave a measured response to the question: how will AI reshape industry over the next decade.

Yann LeCun: My name is Yann LeCun, since we're on this side of the pond, I am chief AI scientist at Meta and a professor at New York University.

Colm Quin: How will generative AI reshape industries in the next decade?

Yann LeCun: There's going to be a lot of reshaping over the next decade, and it's going to be due to, in part to generative AI and in part to not generative AI.

The current thing that everybody is talking about is generative AI for text or images and soon for video as well as music. This is going to make people more creative, more people more creative, because it's a new way of creating things. And it will help people who don't necessarily have the technique to be creative in various ways.

Of course it will help business a lot. A lot of professions where you need to write things will be helped by generative systems that are fluent in language.

So that's kind of the short term. It will make a lot of people more efficient and more creative.

But then there is the longer term, and the longer term is those systems are going to become smarter and smarter, and the architecture is going to be different. They are not going to be the type of autoregressive large language models that we have become used to over the last year. They will probably be on a different blueprint. And those systems will understand the physical world, which is not the case for current systems. They will be able to remember, which is not the case for current systems, and they will be able to reason and plan, which is also not the case for current systems.

So once we get to these breakthroughs which will occur over the next few years, we'll have systems that are much, much smarter than they currently are and will have far reaching implications whose consequences are really hard to predict.

Colm Quin: AI that understands the physical world, speculating for a second, what do you think that might be?

Yann LeCun: So first, I need to tell you why is it our current systems don't understand the physical world and why, currently, AI systems are very, very far from matching the type of intelligence and learning abilities that we observe in humans and animals.

And the reason for this is that language is not a complete representation of the world. Most of what we learn as babies is from interaction with the physical world. It is not from text.

We think text really contains all of human knowledge, but it's not true. Most of our knowledge in our minds is actually from our interaction with the physical world.

And the reason is very simple. If you take a large language model, it is trained on the entire text of the public internet, which is roughly 10 trillion words. I mean, it's tokens, but it's like words. Each token is about two bytes, so that's two with 13 zeros behind it, bytes, that you train those systems on. And you imagine, that's an incredible amount of information because it would take you or me somewhere between 150,000 and 200,000 years to just read through it. So it's unfathomable how big it is.

But then you compute how much information gets into the visual cortex of a four-year-old by the time that four-year-old is four. That four-year-old has been awake for 16,000 hours, and you multiply by 3,600 seconds per hour, and 20MB per second, roughly, going through your optical nerve. And that's ten to the 15 bytes. That's 50 times more.

By the time a child is four years old, he or she has seen 50 times more information through vision than the biggest of our LLMs. And so the amount of knowledge that's baked into this data is enormous. It's much bigger, actually, than what's in text. And we don't know how to do this in machines. We don't know how to get them to watch the world go by and learn how the world works and get common sense and intuitive physics and things like this.
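The back-of-envelope comparison above can be checked with a few lines of Python. This is only a sketch of the arithmetic using the round figures quoted in the conversation; the variable names are illustrative, and it assumes 1 MB means 10^6 bytes.

```python
# Figures as quoted in the conversation (all rough estimates).
TEXT_TOKENS = 10e12                # about 10 trillion tokens of public text
BYTES_PER_TOKEN = 2                # about two bytes per token
HOURS_AWAKE = 16_000               # waking hours of a four-year-old
SECONDS_PER_HOUR = 3_600
OPTIC_NERVE_BYTES_PER_SEC = 20e6   # about 20 MB per second (assumes 1 MB = 10^6 bytes)

# Text: 2 followed by 13 zeros, in bytes.
text_bytes = TEXT_TOKENS * BYTES_PER_TOKEN

# Vision: hours awake x seconds per hour x bytes per second.
visual_bytes = HOURS_AWAKE * SECONDS_PER_HOUR * OPTIC_NERVE_BYTES_PER_SEC

ratio = visual_bytes / text_bytes

print(f"text:   {text_bytes:.2e} bytes")
print(f"vision: {visual_bytes:.2e} bytes")
print(f"ratio:  {ratio:.1f}x")
```

With these inputs the visual total comes to roughly 1.15 x 10^15 bytes against 2 x 10^13 bytes of text, a ratio of about 58, which matches the "ten to the 15 bytes" and "50 times more" quoted above as order-of-magnitude figures.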

That's the challenge in the next few years, and we're working on this, we don't know when the next breakthrough will occur. We're making progress. But before we have something that can exploit this, it might take a few years, perhaps longer.

But then we'll have systems that learn how the world works, perhaps understand people and what drives them. And so those systems will be able to plan sequences of actions so as to satisfy particular goals.

So you give them a goal and then they can think about it and plan a sequence of actions that would satisfy this goal and also satisfy, perhaps, a number of guardrails that make the system safe.

I call this objective-driven AI. I can't show you an example of it because it doesn't work yet. But that's probably the future.

And those systems would be controllable, safe. They will understand the world, they will remember. This planning ability will allow them to reason, all things that current systems cannot do.

So we're still far from having really intelligent systems, but hopefully we'll have them and then people will interact with those things on a daily basis. They'll be in their smart glasses, and your interface with the digital world would be an AI system.

Colm Quin: You've said before that everyone's interactions are going to be mediated by AI interaction. Could you give us a sense of what that looks like in a tangible sense?

Yann LeCun: Okay. So currently, when we want to access information, we have to choose which service we use. We go to a search engine, we go on various social media, or, you know, Wikipedia or whatever.

But you'll just have an assistant with you at all times. An assistant that knows you, is your best digital friend, if you want, your personal assistant, which at some point will have the intelligence level of a human, so will basically behave like a human assistant or something close to this. This is not for tomorrow. It's going to take a while.

But then, you know, you'll have your smart glasses. You'll be able to talk to it. It will be able to either answer by voice or through a display. And you will also be able to interact with it through typing and everything.

If you've seen the movie Her, the ten-year-old movie from Spike Jonze, kind of not a bad depiction of what this idea of an assistant will be.

And then eventually those systems will become smarter than us. So it won't be like having an assistant, it would be like having a staff of really smart people working with you, and sort of empower you and, you know, amplify human intelligence.

So this may cause everyone to become smarter as a whole. You know, the combination of them and their staff of virtual assistants, if you want. And this may be a new renaissance for humanity.

Colm Quin: What would you say are the most pressing ethical concerns surrounding generative AI? And how should companies be looking at this?

Yann LeCun: Okay, a number of different things.

Obvious benefits in all corners of industry, right? Things are going to change over the next few years. So the idea that LLMs of the type we know now are the things that are going to continue to exist and bring all the other features of AI, that's wrong.

That's why within a few years there will be different things. We'll interact with them in a similar way, but they'll be much more powerful. That's the first thing.

Second thing is the platform for AI will need to be open source. Of course there is space for proprietary systems as well, but they'll need to be open source because they'll have to eventually constitute the repository of all human knowledge. And you cannot have a single private company basically building those systems so that they represent the entire spectrum of languages, cultures, value systems, and centres of interest.

So we're going to need to have a lot of relatively, not specialized, but diverse, AI platforms to provide people with diverse sources of information. We can't have a single source of information. Because if all of our digital diet is mediated by a single AI system, what does that mean about cultural diversity and political opinions and stuff like that?

So it's going to have to be diverse, which means the platforms, because they are so expensive to train, are going to have to be open source.

So that's the stance that Meta and pretty much the entire academic world and a lot of the startup world, and, of course the VC world, has employed or has adopted, and also a lot of countries are embracing because they see open source AI platforms as the way of preserving their culture, basically, and have some level of AI sovereignty, if you want.

So my prediction is that it is going to be like the software infrastructure of the internet. It will have to be open source, because that's the most efficient way for it to disseminate everywhere in the world.

Colm Quin: Do you think that open source approach poses regulatory challenges? I guess if it was to fall into the hands of a bad actor, let's say?

Yann LeCun: That's a very important debate that has been going on. The consensus that's emerging, which was not the case six months ago, is that everybody agrees that open source is a good thing and should not be regulated out of existence.

Sure, the bad guys can put their hands on it, but the way you protect against this is that you keep moving so the bad guys are not ahead of you, right?

If you regulate research and development, research and development will slow down. And the bad guys will just catch up, they get access to it one way or another.

If you stay ahead, you have countermeasures that are better than their attacks.

At Meta we're very familiar with using AI as countermeasures against things like disinformation, hate speech, child exploitation, terrorist propaganda, you know, all the horrible things that people want to use... you know, corruption of elections, things like that.

I think the big question, which is both a technological question, but more of an industry and standards question, is how do you prevent deepfakes.

So there's big elections coming up in the three largest democracies in the world, the US, India, Indonesia. And a lot of political candidates are scared of the fact that we can produce fake videos of them saying things they never said.

Now, they are somewhat detectable, but not completely. And so the industry has been talking about watermarking standards and things like that, but nothing has really emerged as a technical solution to this. And there is the question of whether what should be watermarked is AI-generated content or, on the other hand, authentic content. If you want to watermark authentic content, and so indicate whether it's been doctored in a heavy way or not, you need camera manufacturers on board, you need all the video manipulation software on board, and things like that. And it's very difficult. It's going to take a while before those things are put in place.

So that's probably the short-term biggest question, of how you solve that problem.

I think also the public is going to learn that there's certain sources of information you can trust and others that you really should not.

I think what's going to happen is it's going to be a bit like what happened with email. In the first days of email there was spam and there were spam filters. Then people learned to ignore spam, you can tell it's spam just from the subject. And now modern mail systems just get rid of it before you even see it. And then there were scams. We all know what the standard scams in email were. And people kind of learned to not click on everything.

So I think it's going to be kind of a similar phenomenon, but how long is it going to take? How many people are going to be captured by this fake content? What technical systems will be put in place, and how fast, to protect against this? That's the big question.

Robin Pomeroy: NYU Professor Yann LeCun, Chief AI Scientist at Meta. Before him you heard Mustafa Suleyman of Inflection AI and Aidan Gomez, co-founder of Cohere.

There’s lots more on AI from Davos - go to wef.ch/wef24 and have a look around.

Be sure to check out the World Economic Forum’s AI Governance Alliance - find that at wef.ch/AIGA, all in capitals.

We also have several podcast episodes on the subject, find them all at wef.ch/podcasts or look on your podcast app for our shows Radio Davos, Meet the Leader and Agenda Dialogues.

This episode of Radio Davos was presented by me, Robin Pomeroy, with additional reporting by Colm Quin. Studio engineering in Davos was by Juan Toran and Edward Bally. Sound production was by Taz Kelleher.

We will be back next week, but for now thanks to you for listening and goodbye.

Three AI pioneers, all of them in Time's Top-100 most influential people in AI, share their views on the past, present and future of this transformational technology.