Whenever possible, the university recommends that events and meetings continue to be held virtually; it is highly recommended that in-person events also allow for virtual participation by attendees who choose not to or are unable to participate in person. Any in-person events must follow campus policy for schools/colleges/divisions and student organizations.
Rob A. Rutenbar (U. Pittsburgh): Experiments with Bayesian Inference Accelerators
Virtual Computing, Data, & Information Sciences Seminar and Virtual Computer Architecture Seminar
Full Title: Experiments with Bayesian Inference Accelerators (or, Why AI Algorithms that are NOT Deep Neural Nets Also Want to be Silicon)
The unexpected coincidence of Moore’s Law’s last gasp and the emergence of deep neural networks (DNNs), with their breakthrough levels of recognizer/classifier performance, has created a remarkable opportunity for new custom hardware accelerators. In hindsight, the transition of DNNs into hardware (enhanced GPU, FPGA, custom) seems inevitable: the algorithms are highly structured, the gains to be had are large, and the resulting architectures often leverage decades of DSP experience (albeit with a new focus on the unprecedented scale of the networks and their parameter sets). But there are other interesting methods in the AI toolbox. My group has spent the last few years working on Bayesian inference accelerators for problems that can be modeled as Markov Random Fields (MRFs). MRFs are large graphs whose nodes represent what we know and believe, and whose edges represent joint probability relationships. “Inference” in this context answers questions like “what are the most likely labels on these nodes?” MRFs appear in a wide range of AI tasks, notably in vision and audio processing. I’ll show three very different classes of accelerators for MRF inference, implementing belief propagation, graph cut, and random sampling algorithms. I’ll show both FPGA and custom silicon results, including a recent 16nm programmable parallel Gibbs sampler chip, done with colleagues at Harvard, which benchmarks as 1380x faster and 1965x more energy-efficient than an Arm Cortex-A53 CPU on the same SoC, and 6.3x faster than an embedded FPGA in the same technology. This is joint work with my students Jungwook Choi (now at Hanyang University), Tianqi Gao (now at Apple), and Glenn Ko (now at Harvard).
Biography: Rob Rutenbar is Senior Vice Chancellor for Research at the University of Pittsburgh, where he has responsibility for Pitt’s research and innovation infrastructure, and also holds appointments in CS and ECE. He was previously Head of the Department of Computer Science at the University of Illinois at Urbana-Champaign, and before that, on the faculty in Electrical and Computer Engineering at Carnegie Mellon. He holds a PhD from the University of Michigan.
His research focuses on tools for chip design and hardware for AI. He has mentored 50+ graduate students, launched two successful tech startups, and received numerous awards for his work, including the IEEE CAS Industrial Pioneer Award and the Semiconductor Research Corporation Aristotle Award. His Coursera courses on VLSI CAD have reached roughly 90,000 registered learners. He is a Fellow of the IEEE, ACM, AAAS, and National Academy of Inventors (NAI).