
Talk: Hidden Capabilities and Counterintuitive Limits in Large Language Models

Peter West: PhD Candidate, Paul G. Allen School of Computer Science & Engineering, University of Washington

Event Details

Date
Thursday, March 21, 2024
Time
12-1 p.m.
Location
LIVE STREAM: https://uwmadison.zoom.us/j/93731028763?pwd=L0NPV2V1d3MzaXAvSWZuaFpqZmJ4dz09

Description

Abstract: Massive scale has been a recent winning recipe in natural language processing and AI, with extreme-scale language models like GPT-4 receiving most of the attention. This is in spite of staggering energy and monetary costs, and despite the fact that even the largest models continue to struggle with concepts such as compositional problem solving and linguistic ambiguity. In this talk, I will propose my vision for a research landscape in which compact language models share the forefront with extreme-scale models, working in concert with ingredients beyond scale, such as algorithms, knowledge, and information theory.

The first part of my talk will cover alternative ingredients to scale, including (1) an inference-time algorithm that combines language models with elements of discrete search and information theory, and (2) a method for transferring useful knowledge from extreme-scale to compact language models via synthetically generated data. Next, I will discuss counterintuitive disparities in the capabilities of even extreme-scale models, which can meet or exceed human performance on some complex tasks while trailing humans on seemingly much simpler ones. Finally, I will discuss implications and next steps for scale-alternative methods.

Bio: Peter West is a PhD candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, working with Yejin Choi. His research focuses on natural language processing and language models, particularly combining language models with elements of knowledge, search algorithms, and information theory to equip compact models with new capabilities. In parallel, he studies the limitations that even extreme-scale models have yet to overcome. His work has received multiple awards, including a best methods paper award at NAACL 2022 and outstanding paper awards at ACL and EMNLP in 2023. His work has been supported in part by an NSERC PGS-D fellowship. Previously, Peter received a BSc in computer science from the University of British Columbia.


Cost
Free
