Looped Transformers as Programmable Computers

Machine Learning Lunch Meeting: Dimitris Papailiopoulos, Tuesday March 7, 12:30pm CS 1240

Tuesday, March 7, 2023
12:30 p.m.

Professor Dimitris Papailiopoulos will tell us about programming GPT.

Abstract: We demonstrate the potential of transformer networks to function as universal computers by placing them in a loop. The input sequence acts as a punchcard consisting of instructions and memory for data read/writes. We show how 13 layers of encoders can be pieced together to emulate a numerical linear algebra library, non-linear learning algorithms, and finally a general purpose, instruction-set based computer. Although these results are possible by designing the weight matrices of these networks, we show how one can implement a general purpose computer by appropriately interconnecting a bunch of GPT-3 models. Our results shed light on the mechanics of attention, the importance of loops, and the potential for using transformers as programmable compute units.
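To make the "punchcard in a loop" idea concrete, here is a minimal sketch in plain Python. It is not a transformer and not the speaker's construction: the function `step` is a hypothetical stand-in for one pass through a fixed network, and the single-instruction SUBLEQ-style machine (subtract, then branch if the result is at most zero) is an assumption chosen for illustration, since the abstract does not name an instruction set. The memory layout, the halting convention (a negative program counter), and all names are likewise assumptions.

```python
from typing import List, Tuple

# State = (punchcard, program counter). The punchcard is one flat list that
# holds both the instructions and the data they read and write.
State = Tuple[List[int], int]


def step(state: State) -> State:
    """One pass of the fixed block (hypothetical stand-in for the network).

    Executes the SUBLEQ-style instruction (a, b, c) stored at the program
    counter: mem[b] -= mem[a], then jump to c if the result is <= 0,
    otherwise fall through to the next instruction.
    """
    mem, pc = state
    a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
    new_mem = list(mem)
    new_mem[b] = mem[b] - mem[a]
    next_pc = c if new_mem[b] <= 0 else pc + 3
    return new_mem, next_pc


def run_loop(punchcard: List[int], max_iters: int = 100) -> List[int]:
    """Apply the same fixed step repeatedly; only the punchcard changes."""
    state: State = (list(punchcard), 0)
    for _ in range(max_iters):
        if state[1] < 0:  # assumed convention: negative counter means halt
            break
        state = step(state)
    return state[0]


# Program: B += A, using a scratch cell Z.
# Cells 0-8 hold three instructions; cells 9-11 hold A, B, Z.
program = [
    9, 11, 3,    # Z = Z - A   (Z becomes -A)
    11, 10, 6,   # B = B - Z   (B becomes B + A)
    11, 11, -1,  # Z = Z - Z   (Z becomes 0, then halt)
    7, 5, 0,     # data: A = 7, B = 5, Z = 0
]
result = run_loop(program)
print(result[10])  # value of B after the run
```

The point of the sketch is that `step` never changes between iterations. All of the "programming" lives in the input sequence, which mirrors the abstract's claim that a fixed network placed in a loop can act as an instruction-set computer.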