Machine Learning Lunch Meeting
Do LLMs solve novel tasks? An empirical investigation of out-of-distribution generalization
Event Details
Everyone is invited to the weekly Machine Learning Lunch Meetings, held on Fridays, 12:30-1:30 pm. Faculty members from Computer Sciences, Statistics, ECE, and other departments will discuss their latest groundbreaking research in machine learning. This is an opportunity to network with faculty and fellow researchers, and to learn about the cutting-edge research being conducted at our university. Please see https://sites.google.com/view/wiscmllm/home for more information.
Speaker: Yiqiao Zhong (STAT)
Abstract: Large language models (LLMs) such as GPT-4 sometimes appear to be creative, solving novel tasks from just a few demonstrations in the prompt. These tasks require the pre-trained model to generalize to distributions that differ from the training distribution, which is known as out-of-distribution (OOD) generalization. For example, in symbolized language reasoning, names/labels are replaced by arbitrary symbols, yet the model can infer the correct names/labels without any finetuning.
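To make the example concrete, here is a minimal sketch of a symbolized few-shot prompt in which sentiment labels are replaced by arbitrary symbols; the task, symbols, and sentences are invented for illustration and are not from the talk:

```python
# Hypothetical illustration of "symbolized language reasoning": class
# labels in a few-shot prompt are replaced with arbitrary symbols, and
# the model must infer the label mapping purely in context.

LABEL_TO_SYMBOL = {"positive": "&", "negative": "#"}

demonstrations = [
    ("The movie was wonderful.", "positive"),
    ("I hated every minute.", "negative"),
    ("A delightful surprise.", "positive"),
]

def build_symbolized_prompt(demos, query):
    """Few-shot prompt where every label is swapped for its symbol."""
    lines = [
        f"Input: {text}\nLabel: {LABEL_TO_SYMBOL[label]}"
        for text, label in demos
    ]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)

print(build_symbolized_prompt(demonstrations, "What a waste of time."))
# A model that generalizes OOD should continue with "#", the symbol
# standing in for "negative", even though "&" and "#" never served as
# sentiment labels in its training data.
```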
In this talk, I will offer some new angles for understanding emergent phenomena in LLMs, which will hopefully provide empirical foundations for a statistical theory of LLMs. Focusing on induction heads, a type of component pervasive in LLMs, I will show that learning the right compositional structure is key to OOD generalization, and that this learning process exhibits sharp transitions in the training dynamics. Further, I will propose the "common bridge representation hypothesis" as a compositional mechanism in Transformers: a latent subspace of the embedding space acts as a bridge that aligns multiple attention heads across early and late layers.
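Background for the abstract's key object: on a sequence that repeats, an induction head attends from each token back to the token that followed the previous occurrence of the same token, which is what lets it copy patterns forward. Below is a minimal sketch of the standard diagnostic for this behavior (written here for illustration, not code from the talk): score a head's attention matrix by the weight it places at that offset on a periodic sequence.

```python
# Minimal induction-head diagnostic: on a token sequence that repeats
# with a known period T, an induction head concentrates its attention
# from position t onto position t - T + 1 (the token right after the
# previous occurrence of the current token).
import numpy as np

def induction_score(attn, period):
    """Average attention weight at the induction offset.

    attn: (seq_len, seq_len) attention matrix for one head,
          rows = query positions, columns = key positions.
    """
    seq_len = attn.shape[0]
    weights = [attn[t, t - period + 1] for t in range(period, seq_len)]
    return float(np.mean(weights))

# Toy check: a "perfect" induction head on a length-32 sequence of
# period 8 scores 1.0; a uniform-attention head scores 1 / seq_len.
T, L = 8, 32
perfect = np.zeros((L, L))
for t in range(T, L):
    perfect[t, t - T + 1] = 1.0
uniform = np.full((L, L), 1.0 / L)

print(induction_score(perfect, period=T))  # -> 1.0
print(induction_score(uniform, period=T))  # -> 0.03125
```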
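One way to read the "common bridge representation hypothesis" is that an early-layer head writes into, and a later-layer head reads from, the same low-dimensional subspace. The following is a hedged sketch of how such alignment could be quantified, using random stand-in matrices rather than real model weights, and an overlap measure chosen for illustration rather than taken from the talk:

```python
# Sketch: compare the top singular subspace of an early head's output
# ("write") map with that of a later head's query-key ("read") map.
# A shared bridge subspace would show overlap well above chance.
import numpy as np

rng = np.random.default_rng(1)
d_model, k = 64, 8

def top_subspace(W, k):
    """Orthonormal basis for the top-k left singular directions of W."""
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :k]

def subspace_overlap(U, V):
    """Mean squared cosine of the principal angles between span(U) and
    span(V): 1.0 for identical subspaces, about k / d_model by chance."""
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.mean(cosines**2))

W_OV_early = rng.normal(size=(d_model, d_model))  # stand-in write map
W_QK_late = rng.normal(size=(d_model, d_model))   # stand-in read map

overlap = subspace_overlap(top_subspace(W_OV_early, k),
                           top_subspace(W_QK_late, k))
print(f"overlap = {overlap:.3f}")  # near k / d_model = 0.125 for random
# weights; evidence for a bridge would be an overlap far above this.
```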