What is generative AI and how does it work? – The Turing Lectures with Mirella Lapata

Brief Summary

This lecture provides an overview of generative AI, focusing on text-based models like ChatGPT. It explains the core technology behind these models, their development, capabilities, and limitations. The lecture also touches on the ethical and societal implications of large language models, including potential job displacement, the spread of misinformation, and environmental concerns. It concludes by emphasizing the importance of regulation and responsible development to mitigate risks.

  • Generative AI is not new; examples include Google Translate and Siri.
  • Large language models (LLMs) are based on language modeling, predicting the next word in a sequence.
  • Training LLMs involves using vast amounts of data from the internet and fine-tuning the models for specific tasks.
  • Scaling up LLMs improves their performance but also increases costs and energy consumption.
  • Ethical considerations include ensuring LLMs are helpful, honest, and harmless.

Intro

The lecture begins by defining generative artificial intelligence as computer programs that can perform tasks typically done by humans and create new content. This content can include audio, computer code, images, text, or video. The lecture will primarily focus on text generation due to the speaker's expertise in natural language processing. The lecture is structured into three parts: the past, present, and future of AI.

Generative AI isn’t new – so what’s changed?

Generative AI has been around for a while, with examples like Google Translate (launched in 2006) and Siri (launched in 2011). These tools use AI to generate new content, such as translating text or responding to voice commands. Sentence completion in email and search suggestions are also examples of language modeling, where the system predicts the next words. What has changed is the sophistication of new models like GPT-4, which can pass standardized tests, write code, and create website content. ChatGPT's rapid adoption, reaching 100 million users within two months of launch, highlights the significant shift in AI capabilities and public interest.
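
To make the idea of language modeling concrete, here is a minimal sketch (not from the lecture) of next-word prediction using simple bigram counts in Python. The tiny corpus and the resulting probabilities are illustrative assumptions; real systems use vastly larger data and neural networks.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the web-scale text real models are trained on.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each other word (bigram counts).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most likely next word and its estimated probability."""
    counts = follows[word]
    best, n = counts.most_common(1)[0]
    return best, n / sum(counts.values())

print(predict_next("the"))  # ('cat', 0.5): 'cat' follows 'the' twice out of four
```

Email sentence completion and search suggestions work on the same principle, just with far richer models of context than a single preceding word.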

How did we get to ChatGPT?

The lecture addresses how single-purpose systems like Google Translate evolved into more sophisticated models like ChatGPT. The core technology behind ChatGPT is language modeling, which involves predicting the next word in a sequence. The lecture also touches on the risks associated with these models and offers a glimpse into the future of AI, reassuring the audience that there is no need to worry.

How are Large Language Models created?

Building a language model involves several steps. First, a large corpus of text is collected from sources such as Wikipedia, Stack Overflow, and social media. A neural network is then trained to predict the next word in a sentence: parts of sentences are removed, the model guesses the missing words, and its parameters are adjusted according to how accurate its predictions were.

Simple language models consist of input nodes, hidden layers, and output nodes, with the connections between nodes represented by weights. More complex models, such as the transformer used in ChatGPT, stack many such blocks of neural networks on top of each other. Because the training signal comes from the data itself, via predicting the missing words, this is known as self-supervised learning. The overall process involves pre-training the model on a large dataset and then fine-tuning it for specific tasks.
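
The PyTorch sketch below illustrates this training loop in miniature: a tiny network learns to predict the next token and adjusts its weights from its prediction errors. The vocabulary, model size, and data here are illustrative assumptions, not the lecture's actual setup.

```python
import torch
import torch.nn as nn

# Toy setup: a vocabulary of 6 token ids and a tiny training sequence.
vocab_size, embed_dim, hidden_dim = 6, 16, 32
tokens = torch.tensor([0, 1, 2, 3, 0, 4, 3, 0, 1, 5])  # stand-in for web-scale text

# Minimal next-token predictor: embedding -> hidden layer -> output scores.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, hidden_dim),
    nn.ReLU(),
    nn.Linear(hidden_dim, vocab_size),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Self-supervised training: inputs are the tokens, targets are the *next* tokens.
inputs, targets = tokens[:-1], tokens[1:]
for step in range(200):
    logits = model(inputs)           # predicted scores for every possible next token
    loss = loss_fn(logits, targets)  # penalise wrong next-word guesses
    optimizer.zero_grad()
    loss.backward()                  # work out how to adjust each weight
    optimizer.step()                 # update the weights (the model's "parameters")

print(f"final loss: {loss.item():.3f}")
```

A transformer replaces the single hidden layer with many stacked attention blocks, but the learning signal, predicting held-out words, is exactly the same.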

How good can an LLM become?

The size of a language model is a critical factor in its performance. Since 2018 model sizes have grown dramatically, with GPT-4 reportedly having around one trillion parameters. The number of words processed during training also matters, but the focus has been more on parameter count. GPT-4 has been trained on so much text that it approaches the total amount of human-written text available. Training GPT-4 reportedly cost around $100 million, making it a significant investment.
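
A little back-of-envelope arithmetic shows what this scale means in practice. The byte size and parameter count below are common assumptions (16-bit weights, the reported one-trillion-parameter figure), not numbers from the lecture.

```python
# Rough arithmetic on the scale of a GPT-4-sized model.
params = 1_000_000_000_000        # reported figure: one trillion parameters
bytes_per_param = 2               # assumption: 16-bit floating-point weights
training_cost_usd = 100_000_000   # reported training cost

print(f"weights alone: ~{params * bytes_per_param / 1e12:.0f} TB")   # ~2 TB of storage
print(f"~${training_cost_usd / params:.4f} per parameter trained")   # ~$0.0001 each
```

Two terabytes just to hold the weights, before any training compute, is part of why only a handful of organisations can build models at this scale.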

Unexpected effects of scaling up LLMs

Scaling up language models allows them to perform a wider range of tasks. As the number of parameters increases, new capabilities emerge: smaller models manage simple tasks such as code completion, while larger ones can handle reading comprehension, broader language understanding, summarization, question answering, and translation.
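
One way to see this "one model, many tasks" behaviour yourself is to steer a single pretrained model toward different jobs purely through the prompt. The sketch below uses the small, openly available GPT-2 via Hugging Face's transformers library as a stand-in for the far larger models in the lecture; expect much weaker results at this scale.

```python
from transformers import pipeline

# GPT-2 is a small, freely downloadable stand-in for the huge models discussed.
generator = pipeline("text-generation", model="gpt2")

# The same model, pointed at different tasks by nothing but the prompt.
prompts = [
    "def fibonacci(n):",                        # code completion
    "Translate English to French: cheese =>",   # translation-style prompt
    "Q: What is the capital of France? A:",     # question answering
]
for p in prompts:
    out = generator(p, max_new_tokens=20, do_sample=False)
    print(out[0]["generated_text"], "\n---")
```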

How can ChatGPT meet the needs of humans?

To align language models with human needs, fine-tuning is essential. This involves collecting instructions and examples of what people want the model to do, such as answering questions step by step. The model is then trained on these examples to generalize to unseen tasks. Ensuring that AI systems behave in accordance with human desires is crucial, with the goal of creating models that are helpful, honest, and harmless (HHH). Achieving this requires humans to provide preferences and feedback to train the model, making the process more expensive.
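
A core ingredient of this feedback stage is a reward model trained on human preferences: shown two candidate answers, it should score the one humans preferred more highly. Below is a minimal PyTorch sketch of the standard pairwise ranking loss used for this; the random "response embeddings" are placeholders for representations that would really come from a language model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy reward model: maps a response representation to a single scalar score.
reward_model = nn.Linear(8, 1)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

# Made-up embeddings for pairs of answers to the same prompt (assumption:
# in a real system these come from the LLM, labelled by human annotators).
chosen = torch.randn(32, 8)    # the responses humans preferred
rejected = torch.randn(32, 8)  # the responses humans rejected

for step in range(100):
    # Ranking loss: push the preferred answer's score above the other's.
    margin = reward_model(chosen) - reward_model(rejected)
    loss = -F.logsigmoid(margin).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The need for humans to produce these preference labels, answer by answer, is what makes this alignment stage so much more expensive than self-supervised pre-training.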

ChatGPT demo

The lecture includes a live demonstration of ChatGPT, where the speaker poses various questions and requests. The model's responses are analyzed for accuracy, length, and helpfulness. The demonstration highlights the model's ability to generate poems, answer questions, and explain jokes, but also reveals its limitations, such as providing outdated information and not always following instructions.

Are Language Models always right or fair?

It is virtually impossible to vet all the content that language models are exposed to during training, which leads to biases and undesirable behavior. For example, Google's Bard gave inaccurate information about the James Webb Space Telescope, an error that wiped roughly $100 billion off the market value of Google's parent company, Alphabet. ChatGPT has also exhibited biases, such as refusing to tell jokes about women while readily providing jokes about men.

The impact of LLMs on society

The societal impact of large language models includes environmental concerns, job displacement, and the creation of fake content. A ChatGPT query requires significantly more energy than a Google search query, and training large models like Llama 2 produces substantial amounts of CO2. Jobs that involve repetitive text writing are particularly at risk. Language models can also be used to create fake news and deepfakes, such as a song falsely attributed to Drake and The Weeknd, and a deepfake image appearing to show Donald Trump being arrested.

Is AI going to kill us all?

The lecture concludes by addressing concerns about the future of AI. It emphasizes that we cannot predict what a super-intelligent AI would look like. While many intelligent AI systems will be beneficial, some may be used for harm; mitigating the risks associated with these tools is more feasible than preventing their existence altogether. The lecture references an evaluation by the Alignment Research Center, which found that GPT-4 cannot autonomously replicate itself or acquire resources. The speaker argues that climate change is a more immediate threat to mankind than AI, and notes that regulation is coming to manage AI's risks.
