Robust Distortion-free Watermarks for Language Models
This is a joint talk between UW-Madison and Google.
Monday, November 13, 2023
11 a.m.-12 p.m.
We propose a methodology for planting watermarks in text from an autoregressive language model that are robust to perturbations without changing the distribution over text up to a certain maximum generation budget. We generate watermarked text by mapping a sequence of random numbers -- which we compute using a randomized watermark key -- to a sample from the language model. To detect watermarked text, any party who knows the key can align the text to the random number sequence.
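The mechanism described above can be sketched in miniature. The snippet below is an illustrative toy, not the speakers' implementation: `toy_lm_probs` is a hypothetical stand-in for a real language model's next-token distribution, and the watermark key simply seeds a pseudorandom number generator that both parties share. Generation maps the key-derived uniform random numbers to a sample via the exponential/Gumbel-max trick (taking `argmax u_i^(1/p_i)` draws token `i` with probability exactly `p_i`, so the output distribution is unchanged); detection re-derives the same random number sequence from the key and checks whether the text's tokens line up with unusually large values.

```python
import numpy as np

VOCAB = 20   # toy vocabulary size (illustrative)
KEY = 42     # shared watermark key (illustrative)


def toy_lm_probs(prefix, vocab=VOCAB):
    """Hypothetical stand-in for an LM's next-token distribution."""
    rng = np.random.default_rng(len(prefix))  # deterministic toy "model"
    return rng.dirichlet(np.ones(vocab))


def generate(key, length):
    """Sample `length` tokens; the key-derived uniforms steer each choice."""
    rng = np.random.default_rng(key)
    tokens = []
    for _ in range(length):
        u = rng.random(VOCAB)            # random numbers from the watermark key
        p = toy_lm_probs(tokens)
        # argmax of u_i^(1/p_i) samples token i with probability exactly p_i,
        # so watermarked text has the same distribution as unwatermarked text
        tokens.append(int(np.argmax(u ** (1.0 / p))))
    return tokens


def detect(key, tokens):
    """Align text to the key's random number sequence; higher = watermarked."""
    rng = np.random.default_rng(key)
    score = 0.0
    for t in tokens:
        u = rng.random(VOCAB)
        score += -np.log(1.0 - u[t])     # large when the chosen u[t] is large
    # averages to about 1 for text unrelated to this key, noticeably more
    # for text generated with it
    return score / len(tokens)
```

With the correct key, `detect` scores watermarked text well above the baseline; with a different key the score stays near 1, since the text is independent of that key's random numbers. The actual scheme in the talk additionally handles substitutions, insertions, and deletions by aligning text to the sequence (rather than assuming positions match exactly), which this sketch omits.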