Inference Quality in LLMs: In-Context Learning and Hallucination
Professor Jiawei Zhang (Computer Sciences)
Event Details
In this talk, we will discuss how prompts affect the inference quality of large language models. In the first part, we present a theoretical analysis of in-context learning. In particular, we show that LLMs can generalize to related tasks as well as to combinations of several tasks encountered during pretraining. We further prove that providing sufficiently informative examples in the prompt enables the model to retrieve the corresponding pretrained task, or a task to which it can generalize.
In the second part, we discuss an important cause of hallucination in LLMs: the model’s tendency to attend to incorrect keywords. Even when the relevant knowledge has already been learned during pretraining, the model may still make such errors at inference time. We argue that this issue arises from the nature of the training data and training paradigm. Motivated by our theoretical insights, we construct a dataset consisting of simple questions for which the relevant knowledge is known to have been learned by the model. Even on this benchmark, state-of-the-art LLMs exhibit a nontrivial hallucination rate.
This talk is part of the weekly Machine Learning Lunch Meetings (MLLM), held every Tuesday from 12:15 to 1:15 p.m. Professors from Computer Sciences, Statistics, ECE, the iSchool, and other departments will discuss their latest research in machine learning, covering both theory and applications. This is a great opportunity to network with faculty and fellow researchers, learn about cutting-edge research at our university, and foster new collaborations. For the talk schedule, please visit https://sites.google.com/view/wiscmllm/home. To receive future weekly talk announcements, please subscribe to our UW Google Group at https://groups.google.com/u/1/a/g-groups.wisc.edu/g/mllm.
We value inclusion and access for all participants and are pleased to provide reasonable accommodations for this event. Please call 608-334-7269 or email jerryzhu@cs.wisc.edu to make a disability-related accommodation request. Reasonable effort will be made to support your request.