Seminar: Accelerating Machine Learning Computations with Automated Discovery of System Optimizations

Zhihao Jia: Ph.D. candidate in the Computer Science department at Stanford University

Event Details

Thursday, January 30, 2020
4-5 p.m.

Abstract: The best strategy for deploying a machine learning (ML) application depends on both the ML model and the hardware platform. In this talk, I will describe my work on accelerating ML computations by automatically discovering model- and hardware-specific optimizations over large search spaces.

TASO, the Tensor Algebra SuperOptimizer, optimizes the computation graphs of deep neural networks (DNNs) by automatically generating potential graph optimizations and formally verifying their correctness. TASO outperforms rule-based graph optimizers in existing ML systems (e.g., TensorFlow, TensorRT, and TVM) by up to 3x by automatically discovering novel graph optimizations, while also requiring significantly less human effort.
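The core loop TASO automates can be illustrated with a toy example: propose a candidate substitution on a tensor expression and accept it only if it is equivalent to the original. The sketch below is hypothetical code, not TASO's API, and it substitutes random testing where TASO uses formal verification against operator axioms:

```python
# Toy sketch of superoptimizer-style graph substitution (illustrative only).
import random

def matmul(a, b):
    # Square-matrix product for lists of lists.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

# Candidate substitution: A@B + A@C  ->  A@(B + C)
# (one matrix multiplication instead of two, so the rewrite is cheaper).
original = lambda a, b, c: matadd(matmul(a, b), matmul(a, c))
rewrite = lambda a, b, c: matmul(a, matadd(b, c))

def rand_mat(n):
    return [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

def equivalent(f, g, trials=20, n=3, tol=1e-9):
    # Stand-in for TASO's formal verification: the two graphs must agree
    # on random inputs before the substitution is accepted.
    for _ in range(trials):
        a, b, c = rand_mat(n), rand_mat(n), rand_mat(n)
        x, y = f(a, b, c), g(a, b, c)
        if any(abs(x[i][j] - y[i][j]) > tol
               for i in range(n) for j in range(n)):
            return False
    return True
```

Here the distributivity rewrite halves the number of matrix multiplications, the kind of cost-reducing substitution a superoptimizer searches a large space for.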

FlexFlow is a system for accelerating distributed DNN training. FlexFlow identifies parallelization dimensions not considered in existing ML systems (e.g., TensorFlow and PyTorch) and automatically discovers fast parallelization strategies for a specific parallel machine. Companies and national labs are using FlexFlow to train production ML models that do not scale well in current ML systems, achieving over 10x performance improvement.
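FlexFlow's search can be caricatured the same way: enumerate a parallelization dimension per layer and score each combination with a cost model. The sketch below is a hypothetical toy, not FlexFlow's actual search space or execution simulator; its cost model (per-device FLOPs plus communication volume) is entirely made up for illustration:

```python
# Toy search over per-layer parallelization dimensions (illustrative only).
from itertools import product

def strategy_cost(batch, d_in, d_out, devices, dim):
    # Compute work is split evenly across devices either way.
    compute = batch * d_in * d_out / devices
    if dim == "sample":
        # Data parallelism: synchronize the weight gradients.
        comm = d_in * d_out
    else:
        # "param": model parallelism: gather the output activations.
        comm = batch * d_out
    return compute + comm

def best_strategy(layers, devices):
    # layers: list of (batch, d_in, d_out); pick a split dimension per layer
    # by exhaustively scoring every combination.
    best, best_cost = None, float("inf")
    for dims in product(["sample", "param"], repeat=len(layers)):
        cost = sum(strategy_cost(b, i, o, devices, d)
                   for (b, i, o), d in zip(layers, dims))
        if cost < best_cost:
            best, best_cost = dims, cost
    return best, best_cost
```

Even this toy shows the point of the search: the best dimension depends on layer shape (large weight matrices favor splitting parameters; large batches favor splitting samples), so a fixed hand-written rule leaves performance on the table.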

I will also outline future research directions for building fully automated ML systems, such as co-designing ML models, software systems, and hardware backends for end-to-end ML deployment.

Bio: Zhihao Jia is a Ph.D. candidate in the Computer Science department at Stanford University working with Alex Aiken and Matei Zaharia. His research interests lie at the intersection of computer systems and machine learning, with a focus on building efficient, scalable, and high-performance systems for machine learning. His research is in active use by multiple companies and national labs to accelerate production ML applications. He has also designed new execution models for graph analytics applications and a path-based distributed data transfer system for the Legion programming system.