Skip to main content

Machine Learning Lunch Meeting

Recent Advances in Average-Reward Restless Bandits

Event Details

Date
Thursday, November 30, 2023
Time
12 p.m.
Location
Description

Everyone is invited to the weekly machine learning lunch meetings, where our faculty members from Computer Science, Statistics, ECE, and other departments will discuss their latest groundbreaking research in machine learning. This is an opportunity to network with faculty and fellow researchers and to learn about the cutting-edge research being conducted at our university.

Speaker: Qiaomin Xie

Abstract: We study the infinite-horizon restless bandit problem with the average reward criterion. A fundamental question is how to design computationally efficient policies that achieve a diminishing optimality gap as the number of arms, N, grows large. Existing results on asymptotical optimality all rely on the uniform global attractor property (UGAP), a complex and challenging-to-verify assumption. In this talk, we discuss how this assumption can be removed. We propose a general, simulation-based framework, that converts any single-armed policy into a policy for the original N-armed problem. Our framework can be instantiated to produce a policy with an $O(1/\sqrt{N})$ optimality gap. In discrete time, we assume a simpler synchronization assumption, covering non-UGAP problem instances. More notably, in continuous time, our result only requires the standard unichain condition. Our work gives the first asymptotic optimality result without UGAP.

Cost
Free

Tags