
Machine Learning Lunch Meeting

Data and Compute-Efficient Foundation Model Adaptation

Event Details

Date
Thursday, May 2, 2024
Time
1 p.m.
Location
Description

Everyone is invited to the weekly machine learning lunch meetings, where our faculty members from Computer Science, Statistics, ECE, and other departments will discuss their latest groundbreaking research in machine learning. This is an opportunity to network with faculty and fellow researchers while learning about the cutting-edge research being conducted at our university. See https://sites.google.com/view/wiscmllm/home for more information.

Speaker: Fred Sala (CS)

Abstract: The promise of foundation models is that they can conveniently serve as a base for diverse applications. Paradoxically, it turns out that adapting these models for downstream tasks can be difficult and expensive, mirroring traditional machine learning. In this talk, I will describe my group's work on addressing this challenge via efficient adaptation. First, I will describe a general approach to improving data efficiency for pretraining or fine-tuning language models by tracking ordered skills. Next, a popular technique for obtaining annotated datasets is to use a more powerful model (e.g., GPT-4) as an automated annotator. We propose an alternative that can be orders of magnitude cheaper and is easier to audit. Finally, we consider two goals for foundation models: task-specific fine-tuning and alignment. When fine-tuning for a target downstream task, we propose new architectural innovations that lead to greater efficiency, including a generalization of the popular low-rank adapters (LoRA). When aligning a model, we propose a self-guided form of representation engineering that can be used to warm-start the often heavyweight alignment process.
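For readers unfamiliar with the low-rank adapters (LoRA) mentioned in the abstract, the core idea can be sketched as follows. This is a generic illustration of standard LoRA in NumPy, not the speaker's generalization: the frozen pretrained weight matrix W is left untouched, and only a low-rank update B @ A is trained. All dimensions and names here are illustrative.

```python
import numpy as np

# Minimal LoRA sketch: instead of updating the full d_out x d_in weight
# matrix W, learn a low-rank correction B @ A with rank r << min(d_in, d_out).
d_in, d_out, r, alpha = 64, 64, 4, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x):
    # Base output plus scaled low-rank correction.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted layer exactly matches the base layer,
# so fine-tuning starts from the pretrained model's behavior.
assert np.allclose(lora_forward(x), W @ x)
```

The efficiency gain is that only r * (d_in + d_out) parameters are trained instead of d_in * d_out, here 512 versus 4096, while the frozen base weights are shared across tasks.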

Cost
Free
