MadS&P seminar - Guest Speaker: Understanding Poisoning Attacks against Text-to-Image Generative Models

Event Details

Date
Tuesday, January 28, 2025
Time
4 p.m.
Location
3310 Computer Sciences
Description

Talk abstract:
Many believe that large text-to-image generative models are naturally robust to data poisoning due to the large amounts of data they are trained on. Yet recent studies show that these models are surprisingly vulnerable to a variety of poisoning attacks, including Nightshade, a clean-label, prompt-specific poisoning attack. Furthermore, concurrent poisoning attacks can induce "model implosion," where the model becomes unable to produce meaningful images.

In this talk, I will describe our recent effort to analytically understand the robustness of image generative models to poisoning attacks by modeling and analyzing the behavior of the cross-attention mechanism in these models. We model cross-attention training as an abstract problem of "supervised graph alignment" and formally quantify the impact of training data by the hardness of alignment. We validate our analytical framework through extensive experiments, confirming and explaining the unexpected effect of model implosion. We hope our analysis can help advance research on poisoning attacks against diffusion models and their defenses.

Bio:
Wenxin Ding is a fourth-year CS PhD student at the University of Chicago, advised by Professors Heather Zheng and Ben Zhao. She is a recipient of the University of Chicago Eckhardt Scholarship. Her research is in adversarial machine learning, where she focuses on bridging the gap between theoretical understanding and empirical practice. Her recent work studies the safety limitations of generative models and helps protect human creatives against intrusive model training. She has published at CCS, S&P, SaTML, and NeurIPS.

Cost
Free
