Norman P. Jouppi (Google): A Domain-Specific TPU Supercomputer for Training Deep Neural Networks
Virtual Computer Architecture Seminar
The codesign of an ML-specific programming system (TensorFlow), compiler (XLA), architecture (TPU), floating-point arithmetic (Brain float16, or bfloat16), interconnect (ICI), chip (TPUv2/v3), and datacenter together enables production ML applications to scale at 96%–99% of perfect linear speedup on a 1024-chip system and provides 10x gains in performance per watt over the most efficient general-purpose supercomputers. This codesign builds on more than 50 years of research and practice in computer systems, interpreted for an era in which lithographic scaling continues but Dennard scaling has effectively ended.
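The Brain float16 (bfloat16) format mentioned above keeps float32's sign bit and full 8-bit exponent but truncates the mantissa to 7 bits, trading precision for range and hardware simplicity. As a rough illustration (not Google's implementation), a Python sketch of the bit-level conversion, without rounding:

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Truncate an IEEE-754 float32 to a 16-bit bfloat16 bit pattern.

    bfloat16 keeps float32's sign bit and 8-bit exponent but only the
    top 7 mantissa bits, so conversion is simply dropping the low
    16 bits (truncation; hardware typically rounds instead).
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 16

def bfloat16_bits_to_float32(b: int) -> float:
    """Re-expand a bfloat16 bit pattern to float32 by zero-padding."""
    return struct.unpack(">f", struct.pack(">I", b << 16))[0]

# Round-tripping shows the reduced precision: the value keeps float32's
# dynamic range but only about 2-3 significant decimal digits.
print(bfloat16_bits_to_float32(float32_to_bfloat16_bits(3.14159265)))
```

Because bfloat16 preserves float32's exponent width, conversion needs no range checks, which is part of why it suits training workloads better than IEEE float16.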
Norman P. Jouppi is a Distinguished Hardware Engineer at Google. Norm received his Ph.D. in electrical engineering from Stanford University in 1984. He has been the principal architect and lead designer of multiple microprocessors, and is the lead architect of Google’s TPU systems. He holds more than 125 U.S. patents and has published over 125 technical papers. He is a recipient of the IEEE Harry H. Goode Award and the ACM/IEEE Eckert-Mauchly Award. He is a Fellow of the ACM, IEEE, and AAAS, and a member of the National Academy of Engineering.