
Talk by Benjamin London (Amazon)

Optimizing Recommendations Using Logged Bandit Feedback

Event Details

Date
Friday, December 14, 2018
Time
12:30 p.m.
Location
3280, Orchard Room, Discovery Building
Description

Abstract:
Amazon Music is an online music streaming service available on your desktop, phone or Alexa-enabled device. We strive to create a lean-back listening experience in which the right music is delivered to each customer at the right time. Achieving this ambitious goal will require interdisciplinary innovation across multiple areas of machine learning, information retrieval, and natural language understanding. In this talk, I will give an overview of some of the projects we work on and drill down into the recommendation problem. In particular, I will discuss the problem of learning recommendation models from data collected on our visual and voice platforms. Because we can only present customers with a finite number of items, the feedback we collect on recommendations is biased by the recommendation policy (and visual layout). Since we often re-train our models on this data, not accounting for bias can lead to self-fulfilling prophecies in which what was recommended frequently before gets recommended even more frequently. What’s more, the feedback we collect is incomplete, in that we only observe feedback for the content we've recommended; and even for that content, a customer typically only interacts with a few items. This setting is sometimes referred to as learning from logged bandit feedback. I will review some recent work (including my own) on how to train recommendation policies offline on logged bandit feedback while accounting for bias and other complicating factors.

Speaker Bio:
Ben London is a Sr. Scientist at Amazon Music Machine Learning. His research focuses on machine learning theory and algorithms, with an emphasis on structured data, probabilistic graphical models, generalization error bounds, and the algorithmic stability of learning and inference. At Amazon, he has worked on multitask learning, metric learning, music recommendation and offline policy optimization. Ben earned his PhD from the University of Maryland, where he was advised by Prof. Lise Getoor.

Cost
Free
