Looped Transformers as Programmable Computers

Machine Learning Lunch Meeting: Dimitris Papailiopoulos, Tuesday March 7, 12:30pm CS 1240

Event Details

Date
Tuesday, March 7, 2023
Time
12:30 p.m.
Location
CS 1240
Description

You are cordially invited to the weekly CS Machine Learning Lunch Meetings. This is a chance to get to know machine learning professors and talk to your fellow researchers. Our next meeting will be on Tuesday, March 7, 12:30-1:30pm in CS 1240. Professor Dimitris Papailiopoulos will tell us about programming GPT; see the abstract below.

If you would like to be informed of future CS Machine Learning Lunch Meetings, please sign up for our mailing list at https://lists.cs.wisc.edu/mailman/listinfo/mllm using your cs or wisc email. After you enter your email, the system will send you a confirmation email. Only after you respond to that email will you be on the mailing list.

Abstract: We demonstrate the potential of transformer networks to function as universal computers by placing them in a loop. The input sequence acts as a punchcard consisting of instructions and memory for data read/writes. We show how 13 layers of encoders can be pieced together to emulate a numerical linear algebra library, non-linear learning algorithms, and finally a general-purpose, instruction-set-based computer. Although these results are possible by designing the weight matrices of these networks, we show how one can implement a general-purpose computer by appropriately interconnecting a bunch of GPT3 models. Our results shed light on the mechanics of attention, the importance of loops, and the potential for using transformers as programmable compute units.

Cost
Free
