AI Safety and Theoretical Computer Science
Scott Aaronson (University of Texas at Austin)
Event Details
Date
Friday, April 4, 2025
Time
11 a.m.-12 p.m.
Location
1240 Computer Sciences
Description
Progress on AI safety and alignment has been almost entirely empirical. In this talk, I'll survey a few areas where theoretical computer science can contribute to AI safety:
- How can we robustly watermark the outputs of Large Language Models and other generative AI systems, to help identify academic cheating, deepfakes, and AI-enabled fraud?
- Can one insert undetectable cryptographic backdoors into neural nets, for good or ill?
- Should we expect neural nets to be generically interpretable?
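As background for the first question above, here is a minimal sketch of one well-known statistical watermarking idea: partition the vocabulary into a "green list" keyed on the previous token, bias generation toward green tokens, and detect the watermark by counting how often tokens fall on their context's green list. This illustrates the general flavor of such schemes, not the specific construction discussed in the talk; all names and parameters below are illustrative.

```python
import hashlib
import random


def green_list(prev_token: int, vocab_size: int, fraction: float = 0.5) -> set[int]:
    # Derive a reproducible pseudorandom partition of the vocabulary
    # from the previous token, so a detector that knows the hashing
    # scheme can recompute the same green list.
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    vocab = list(range(vocab_size))
    rng.shuffle(vocab)
    return set(vocab[: int(fraction * vocab_size)])


def green_fraction(tokens: list[int], vocab_size: int) -> float:
    # Detection statistic: the fraction of tokens that lie on the
    # green list determined by their predecessor. Unwatermarked text
    # should score near `fraction`; watermarked text scores higher.
    hits = sum(
        1 for prev, tok in zip(tokens, tokens[1:])
        if tok in green_list(prev, vocab_size)
    )
    return hits / max(1, len(tokens) - 1)
```

A generator that preferentially samples green tokens produces text whose `green_fraction` deviates detectably from 0.5, which a standard significance test can flag; the robustness question in the talk is how well such signals survive paraphrasing and editing.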
Cost
Free
Contact
Accessibility
We value inclusion and access for all participants and are pleased to provide reasonable accommodations for this event. Please call 608-262-4196 (voice only) or email dieter@cs.wisc.edu to make a disability-related accommodation request. Reasonable effort will be made to support your request.