Talk: The Lottery Ticket Hypothesis: On Sparse, Trainable Neural Networks
Jonathan Frankle: Ph.D. Candidate, MIT
Also offered online
Abstract: I recently proposed the lottery ticket hypothesis: that the dense neural networks we typically train contain much smaller subnetworks capable of training to full accuracy from early in training. This hypothesis raises (1) scientific questions about the nature of overparameterization in neural network optimization and (2) practical questions about our ability to accelerate training. In this talk, I will discuss established results and the latest developments in my line of work on the lottery ticket hypothesis, including the empirical evidence for these claims on small vision tasks, the changes necessary to scale these ideas to practical settings, and the relationship between these subnetworks and their "stability" to the noise of stochastic gradient descent. I will also describe my vision for the future of research on this topic.
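The subnetworks the abstract refers to are typically found by iterative magnitude pruning with weight rewinding: train the network, remove the smallest-magnitude weights, and reset the survivors to their values from early in training. A minimal NumPy sketch of those two steps, with all names and the pruning fraction chosen purely for illustration (this code is not from the talk):

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Return a boolean mask keeping the largest-magnitude
    fraction (1 - sparsity) of the weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of weights to remove
    if k == 0:
        return np.ones(weights.shape, dtype=bool)
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.abs(weights) > threshold

def rewind(weights_early, mask):
    """Form the candidate 'winning ticket': surviving weights are
    rewound to their early-in-training values; the rest are zeroed."""
    return np.where(mask, weights_early, 0.0)
```

In the full procedure these steps are repeated over several rounds, pruning a fixed fraction of the remaining weights each time, and the rewound subnetwork is then retrained in isolation.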
Bio: Jonathan Frankle is a fifth-year Ph.D. student at MIT, where he empirically studies deep learning with Prof. Michael Carbin. His current research focus is on the properties of sparse networks that allow them to train effectively, as embodied by his "Lottery Ticket Hypothesis" (ICLR 2019 Best Paper Award). Jonathan also has an interest in technology policy: he has worked closely with lawyers, journalists, and policymakers on topics in AI policy and has taught at the Georgetown University Law Center. He earned his BSE and MSE in computer science at Princeton and has previously spent time at Google, Facebook, and Microsoft.