Machine Learning Lunch Meeting
A Geometric Journey into the World of Large Language Models
Event Details
Everyone is invited to the weekly machine learning lunch meetings, where faculty members from Computer Science, Statistics, ECE, and other departments discuss their latest research in machine learning. This is an opportunity to network with faculty and fellow researchers and to learn about the cutting-edge research being conducted at our university.
Speaker: Yiqiao Zhong
Abstract: Transformers are neural networks that underpin the recent success of large language models. They are often used as black-box models and as building blocks of complex AI systems. Yet it is unclear what information is processed through the layers of a transformer, which raises the issue of interpretability. In this talk, I will present an empirical study of transformers by examining various pretrained transformer models. A surprisingly consistent geometric pattern emerges in the hidden states (or intermediate-layer embeddings) across layers, models, and datasets. Our study (1) provides a structural characterization of the learned weight matrices and the self-attention mechanism, and (2) suggests that smoothness of the hidden states is essential for the success of transformers.
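For attendees who would like to explore these intermediate-layer embeddings themselves before the talk, here is a minimal sketch, not the speaker's code: it assumes the Hugging Face transformers library, uses GPT-2 as a stand-in pretrained model, and uses mean cosine similarity between consecutive layers as one simple, illustrative proxy for cross-layer geometry.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: GPT-2 as an example pretrained model; any similar model works.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

text = "Transformers process tokens through many layers."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of (num_layers + 1) tensors,
# each of shape (batch, seq_len, hidden_dim): the embedding
# output followed by every layer's output.
hidden = outputs.hidden_states
for layer, (h_prev, h_next) in enumerate(zip(hidden[:-1], hidden[1:]), start=1):
    # Illustrative geometric measure: how much token embeddings
    # rotate between consecutive layers.
    sim = torch.nn.functional.cosine_similarity(h_prev, h_next, dim=-1).mean()
    print(f"layer {layer:2d}: mean cosine similarity to previous layer = {sim:.3f}")
```

Running this prints one similarity value per layer; high values across most layers would be consistent with the kind of smooth, gradual evolution of hidden states that the abstract alludes to, though the talk's actual analysis is considerably richer.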