Colloquium: Why do neural networks learn?
Thursday, February 14, 2019
Neural networks used in practice have millions of parameters, yet they generalize well even when trained on small datasets. Although networks with zero training error and large test error exist, the optimization algorithms used in practice almost magically find networks that generalize well to test data. How can we characterize such networks? What are the properties of networks that generalize well, and how do these properties ensure generalization? In this talk, I will develop techniques for understanding generalization in neural networks. Towards the end, I will show how this understanding can help us design architectures and optimization algorithms with better generalization performance.
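The phenomenon the abstract alludes to, that a model can fit its training data perfectly while failing on test data, can be illustrated with a minimal sketch (not from the talk itself): fitting an overparameterized polynomial model, with more parameters than data points, via minimum-norm least squares. The model interpolates the noisy training labels, so its training error is essentially zero, while its test error on the underlying function is typically far larger.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny training set: 10 noisy samples of a simple underlying function y = x^2.
x_train = np.linspace(-1, 1, 10)
y_train = x_train**2 + 0.1 * rng.standard_normal(10)

# Test set: noiseless samples of the same underlying function.
x_test = np.linspace(-1, 1, 200)
y_test = x_test**2

def features(x, degree):
    # Polynomial feature map: [1, x, x^2, ..., x^degree].
    return np.vander(x, degree + 1, increasing=True)

# Overparameterized model: 30 parameters for only 10 data points.
# lstsq returns the minimum-norm solution of this underdetermined system,
# which interpolates the training labels exactly (up to numerical error).
degree = 29
w = np.linalg.lstsq(features(x_train, degree), y_train, rcond=None)[0]

train_err = np.mean((features(x_train, degree) @ w - y_train) ** 2)
test_err = np.mean((features(x_test, degree) @ w - y_test) ** 2)

print(f"train MSE: {train_err:.2e}")  # essentially zero
print(f"test MSE:  {test_err:.2e}")   # typically much larger
```

Which interpolating solutions generalize and which do not, and why practical optimizers tend to land on the former, is exactly the kind of question the talk addresses.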