
Machine Learning Lunch Meeting

PAL: Sample-Efficient Personalized Reward Learning for Pluralistic Alignment

Event Details

Date
Friday, November 8, 2024
Time
12:30-1:30 p.m.
Location
Virtual (Zoom only)
Description

**Today's MLLM will be held virtually, on Zoom only.** Join with this link: zoom

Everyone is invited to the weekly Machine Learning Lunch Meetings, held Fridays, 12:30-1:30 p.m. Faculty members from Computer Sciences, Statistics, ECE, and other departments discuss their latest groundbreaking research in machine learning. This is an opportunity to network with faculty and fellow researchers and to learn about the cutting-edge research being conducted at our university. Please see the website for more information.

Speaker: Ramya Vinayak

Abstract: Large pre-trained models trained on internet-scale data are often not ready for safe deployment out of the box. They are heavily fine-tuned and aligned using large quantities of human preference data, usually elicited via pairwise comparisons. When aligning an AI/ML model to human preferences or values, it is important to ask: whose preferences and values are we aligning it to? Current alignment approaches are severely limited because the models inherently assume uniformity across users. In this talk, I will present PAL, a personalizable reward-modeling framework for pluralistic alignment that incorporates diverse preferences from the ground up. PAL has a modular design that leverages commonalities across users while catering to individual personalization, enabling efficient few-shot generalization. PAL is versatile enough to apply across domains, as we demonstrate empirically: it matches or outperforms state-of-the-art methods on both text-to-text and text-to-image tasks with 100x fewer parameters. We also provide theoretical results on per-user sample complexity, both for generalization to new preference predictions for users seen in the dataset and for few-shot generalization to new users (those with no preference feedback in the dataset).
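To make the abstract's description concrete (shared structure across users plus lightweight per-user personalization, learned from pairwise comparisons), here is a minimal sketch of what such a reward model could look like. This is not the authors' implementation: the shared encoder, the K prototype reward directions, the per-user mixture weights, the Bradley-Terry-style pairwise loss, and all names (`PALSketch`, `pairwise_loss`) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PALSketch(nn.Module):
    """Illustrative pluralistic reward model: a shared encoder captures
    commonalities across users, while lightweight per-user mixture weights
    over K prototype reward directions provide personalization."""

    def __init__(self, input_dim, latent_dim, num_prototypes, num_users):
        super().__init__()
        # Shared across all users: maps item features to a latent space.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, latent_dim), nn.ReLU(),
            nn.Linear(latent_dim, latent_dim),
        )
        # K prototype reward directions, also shared across users.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, latent_dim))
        # Per-user logits over prototypes: the only user-specific parameters,
        # so a new user needs just K numbers (few-shot friendly).
        self.user_logits = nn.Parameter(torch.zeros(num_users, num_prototypes))

    def reward(self, user_idx, x):
        z = self.encoder(x)                                # (B, latent_dim)
        w = F.softmax(self.user_logits[user_idx], dim=-1)  # (B, K) per-user mix
        proto_scores = z @ self.prototypes.T               # (B, K)
        return (w * proto_scores).sum(-1)                  # (B,) scalar rewards

def pairwise_loss(model, user_idx, x_preferred, x_rejected):
    # Bradley-Terry-style loss on pairwise comparisons: the item the
    # user preferred should receive the higher reward for that user.
    margin = model.reward(user_idx, x_preferred) - model.reward(user_idx, x_rejected)
    return -F.logsigmoid(margin).mean()
```

Under this factorization, adapting to a new user only requires estimating K mixture weights from a handful of comparisons while all heavy components stay shared, which is one way the few-shot generalization and small parameter footprint described in the abstract could arise.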

Based on work with Daiwei Chen, Yi Chen, Aniket Rege, and Zhi Wang. Reference: "PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences" (preprint, 2024).

Cost
Free
