Stanford CS25: V4 I Jason Wei & Hyung Won Chung of OpenAI

April 11, 2024
Speakers: Jason Wei & Hyung Won Chung, OpenAI

Intuitions on Language Models (Jason)
Jason will talk about some basic intuitions on language models, inspired by manual examination of data. First, he will discuss how one can view next-word prediction as massive multi-task learning. Then, he will discuss how this framing reconciles scaling laws with emergent individual tasks. Finally, he will talk about the broader implications of these findings. Slides here: https://docs.google.com/presentation/...

Shaping the Future of AI from the History of Transformer (Hyung Won)
Hyung Won: AI is developing at such an overwhelming pace that it is hard to keep up. Instead of spending all our energy catching up with the latest developments, I argue that we should study the change itself. The first step is to identify and understand the driving force behind the change. For AI, that force is exponentially cheaper compute and the scaling it enables. I will provide a highly opinionated view of the early history of Transformer architectures, focusing on what motivated each development and how each became less relevant with more compute. This analysis will help us connect the past and present in a unified perspective, which in turn makes it more manageable to project where the field is heading. Slides here: https://docs.google.com/presentation/...

About the speakers:
Jason Wei is an AI researcher based in San Francisco. He is currently working at OpenAI. He was previously a research scientist at Google Brain, where he popularized key ideas in large language models such as chain-of-thought prompting, instruction tuning, and emergent phenomena.

Hyung Won Chung is a research scientist on the ChatGPT team at OpenAI. He has worked on various aspects of large language models: pre-training, instruction fine-tuning, reinforcement learning from human feedback, reasoning, multilinguality, parallelism strategies, etc. His notable work includes the scaling Flan papers (Flan-T5, Flan-PaLM) and T5X, the training framework used to train the PaLM language model. Before OpenAI, he was at Google Brain, and before that he received a PhD from MIT.

More about the course can be found here: https://web.stanford.edu/class/cs25/

View the entire CS25 Transformers United playlist: • Stanford CS25 - Transformers United
