Low-Rank Fine Tuning: Learning Dynamics and a (Moderately) Improved Algorithm
Professor Yudong Chen (Computer Sciences) at Machine Learning Lunch Meetings
Event Details
We consider Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning of large language models. We characterize the learning dynamics of LoRA and, in particular, prove that its iterates align with the singular subspace of the one-step gradient from full fine-tuning. Inspired by our theory, we propose an improved algorithm, LoRA-One, which uses a spectral initialization strategy to achieve optimal subspace alignment from the outset by computing the full gradient only once. LoRA-One can be further accelerated on ill-conditioned problems by incorporating a preconditioned update. Our algorithm delivers performance gains over existing LoRA methods on various benchmarks, including MetaMathQA. This talk is based on https://arxiv.org/abs/2502.01235
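For readers unfamiliar with spectral initialization, here is a minimal NumPy sketch of the general idea on a toy linear regression problem. It is an illustration under stated assumptions, not the algorithm from the paper: the function name spectral_lora_init, the square-root splitting of singular values between the two factors, the scale parameter, and the toy data are all hypothetical choices made for this example. The sketch computes the full gradient once at the pretrained weights, takes its top-r singular directions, and uses them to initialize the low-rank adapter factors.

```python
import numpy as np


def spectral_lora_init(grad, rank, scale=1.0):
    """Hypothetical helper: build factors (B, A) from a one-step full gradient.

    B @ A equals scale times the best rank-`rank` approximation of -grad,
    so the adapter starts in the top singular subspace of the gradient.
    The sqrt split of singular values is an assumption of this sketch.
    """
    U, S, Vt = np.linalg.svd(-grad, full_matrices=False)
    root = np.sqrt(scale * S[:rank])
    B = U[:, :rank] * root           # shape (d_in, rank)
    A = root[:, None] * Vt[:rank]    # shape (rank, d_out)
    return B, A


# Toy setup (assumed): the target weights differ from the pretrained
# weights W0 by an exactly rank-r update.
rng = np.random.default_rng(0)
n, d_in, d_out, r = 200, 12, 8, 2
X = rng.normal(size=(n, d_in))
W0 = rng.normal(size=(d_in, d_out))
delta = rng.normal(size=(d_in, r)) @ rng.normal(size=(r, d_out))
Y = X @ (W0 + delta)

# Full gradient of 0.5/n * ||X W - Y||_F^2 at W = W0, computed only once.
G = X.T @ (X @ W0 - Y) / n

B, A = spectral_lora_init(G, r)
W_adapted = W0 + B @ A  # adapted weights at initialization
```

In this toy problem the negative gradient has rank exactly r, so the product B @ A reproduces it up to floating-point error. In a real fine-tuning setting the gradient is generally full rank and the initialization keeps only its leading r directions.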
(This talk is part of the weekly Machine Learning Lunch Meetings (MLLM), held every Tuesday from 12:15 to 1:15 p.m. Professors from Computer Sciences, Statistics, ECE, the iSchool, and other departments will discuss their latest research in machine learning, covering both theory and applications. This is a great opportunity to network with faculty and fellow researchers, learn about cutting-edge research at our university, and foster new collaborations. For the talk schedule, please visit https://sites.google.com/view/wiscmllm/home. To receive future weekly talk announcements, please subscribe to our UW Google Group at https://groups.google.com/u/1/a/g-groups.wisc.edu/g/mllm.)
We value inclusion and access for all participants and are pleased to provide reasonable accommodations for this event. Please email jerryzhu@cs.wisc.edu to make a disability-related accommodation request. Reasonable effort will be made to support your request.