Machine Learning Lunch Meeting

How Intelligent Are Current Multimodal Video Models?

Event Details

Date
Friday, December 6, 2024
Time
12:30-1:30 p.m.
Location
Description

Everyone is invited to the weekly Machine Learning Lunch Meetings, held Fridays from 12:30 to 1:30 p.m. Faculty members from Computer Sciences, Statistics, ECE, and other departments will discuss their latest groundbreaking research in machine learning. This is an opportunity to network with faculty and fellow researchers and to learn about the cutting-edge research being conducted at our university. Please see our website for more information.

Speaker: Yong Jae Lee (CS)

Abstract: In this talk, I will present two recent contributions from my lab that challenge and advance the capabilities of multimodal video models. First, I will introduce a new benchmark called Vinoground, which evaluates the temporal counterfactual reasoning capabilities of existing models. Spoiler alert: they aren't great (to put it mildly). Second, I will present a novel approach inspired by the Matryoshka doll to improve the efficiency of multimodal models. It learns to compress the visual tokens in a nested fashion, significantly reducing the number of tokens that the subsequent language model needs to process. These works were led by Mu Cai and Harris Zhang.

Project pages: https://vinoground.github.io/ and https://matryoshka-mm.github.io/
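For readers curious what "nested" visual-token compression can look like, below is a minimal sketch, not the authors' implementation: it assumes a ViT-style square grid of visual tokens and average-pools it at progressively coarser scales, so a single encoding yields several token budgets, each nested inside the previous one. The grid size (24x24) and the specific scales are illustrative assumptions.

```python
# Illustrative sketch of Matryoshka-style nested visual token compression
# (assumed pooling scheme, not the talk's actual code): pool a square grid
# of visual tokens at several scales so the language model can be fed
# anywhere from the full set down to a single token.
import torch
import torch.nn.functional as F

def nested_visual_tokens(tokens: torch.Tensor, grid: int,
                         scales=(24, 12, 6, 3, 1)):
    """tokens: (batch, grid*grid, dim) visual tokens from a vision encoder.
    Returns a dict mapping token count -> pooled tokens, finest scale first."""
    b, n, d = tokens.shape
    assert n == grid * grid, "expected a square token grid"
    # Reshape to (batch, dim, grid, grid) so we can use 2D pooling.
    x = tokens.transpose(1, 2).reshape(b, d, grid, grid)
    out = {}
    for s in scales:  # each scale keeps an s*s grid of averaged tokens
        pooled = F.adaptive_avg_pool2d(x, s)              # (b, d, s, s)
        out[s * s] = pooled.flatten(2).transpose(1, 2)    # (b, s*s, d)
    return out

# Example: 576 tokens (a 24x24 grid) compressed to 576/144/36/9/1 tokens.
feats = torch.randn(2, 576, 1024)
for n_tok, t in nested_visual_tokens(feats, grid=24).items():
    print(n_tok, tuple(t.shape))
```

At inference time one can then pick the token budget that fits the compute available, since the coarser representations are by construction summaries of the finer ones.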

Cost
Free
