"Intro to Large Language Models" - Andrej Karpathy's Tech Talk Learning

Large Language Model (LLM) Talk

Andrej Karpathy's talk, "Intro to Large Language Models," demystifies LLMs by portraying them as systems with two key components: a parameters file (the weights of the neural network) and a run file (the code that runs the network). The parameters file is produced by a computationally intensive training process, in which a large amount of internet text is compressed into the model's parameters. Scaling laws show that LLM performance depends predictably on two quantities: the number of parameters and the amount of training data. Karpathy reviews how LLMs are evolving to incorporate external tools and multiple modalities. He presents his view of LLMs as the kernel process of an emerging operating system, and he discusses the security challenges of LLMs, including jailbreak attacks, prompt injection attacks, and data poisoning.
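The two-file picture can be made concrete with a toy sketch. This is not code from the talk: it swaps the neural network for a simple bigram word-count table, and the function names (train, save_parameters, run) and the JSON file format are assumptions made purely for illustration. What it keeps is the split described above: training compresses text into a parameters file, and a small piece of run code loads that file to predict the next word.

```python
import json
from collections import Counter, defaultdict


def train(text: str) -> dict:
    """'Compress' text into parameters: counts of which word follows which.

    Stand-in for neural network training; a real LLM learns weights instead.
    """
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return {prev: dict(followers) for prev, followers in counts.items()}


def save_parameters(params: dict, path: str) -> None:
    """Write the 'parameters file' (hypothetical JSON format)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f)


def run(path: str, prompt: str, max_new_words: int = 5) -> str:
    """The 'run file': load parameters and greedily predict next words."""
    with open(path, encoding="utf-8") as f:
        params = json.load(f)
    words = prompt.split()
    if not words:
        return ""
    for _ in range(max_new_words):
        followers = params.get(words[-1])
        if not followers:
            break  # no known continuation for the last word
        # Pick the most frequent follower; break ties alphabetically.
        words.append(min(followers, key=lambda w: (-followers[w], w)))
    return " ".join(words)
```

Calling save_parameters(train(corpus), "params.json") and then run("params.json", "the") mirrors the workflow in miniature: the expensive step happens once, and afterwards only the parameters file and the run code are needed.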
