
Deploying RAG (cloud vs. GB10): WattBot 2025 case study

ML+X Forum

Event Details

Date
Tuesday, February 17, 2026
Time
12-1 p.m.
Location
Orchard View, Discovery (and Zoom)
Description

Please fill out the registration form if you plan to attend (Glass Nickel Pizza will be provided for in-person attendees). Hope you can join us!

When: Tuesday, Feb 17, 12-1pm CT
Where: Orchard View, Discovery (and Zoom)

Summary: Many groups across campus are exploring retrieval-augmented generation (RAG) to build document-grounded, trustworthy AI tools, but it is often unclear how design choices around models, infrastructure, and deployment play out in practice. In this session, we present lessons learned from replicating the winning RAG system from the WattBot 2025 challenge. The challenge focused on producing citation-backed energy and sustainability estimates for AI workloads from a fixed corpus of 30+ academic papers, or explicitly abstaining when evidence was missing. The winning approach relied on hierarchical document embeddings and ensemble voting to maintain accuracy and citation discipline. After a short (~10 min) overview of the winning approach, Nils Matteson and Blaise Enuh will discuss how this system was implemented in practice, including (1) a cloud deployment using AWS Bedrock and (2) local, open-source efforts (e.g., Hugging Face models on GB10-class hardware). We will share lessons learned around performance, cost, latency, and operational tradeoffs. If time permits, we will also demo a Streamlit-based interface designed for longer-term hosting, aimed at making RAG tools accessible to stakeholders. The session will end with open discussion, and we encourage others to share their own RAG experiences and questions.
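
For attendees new to the idea, here is a rough sketch of what "ensemble voting" with abstention can look like in a RAG pipeline. It is an illustration only, not the winning team's implementation: the function name `ensemble_vote`, the `ABSTAIN` sentinel, the `min_agreement` threshold, and the sample answers are all assumptions made for this example. Each ensemble member returns an answer plus the documents it cites; an answer is accepted only if more than a set fraction of members agree on it and back it with a citation.

```python
from collections import Counter

# Hypothetical sentinel returned when the ensemble declines to answer.
ABSTAIN = "unable to answer"


def ensemble_vote(candidates, min_agreement=0.5):
    """Pick a majority answer from (answer, citations) pairs, or abstain.

    An answer of None, or an answer with no citations, counts as an
    abstention by that ensemble member (an assumption of this sketch).
    """
    # Only cited answers are allowed to vote.
    votes = Counter(ans for ans, cites in candidates if ans is not None and cites)
    if not votes:
        return ABSTAIN, []

    answer, count = votes.most_common(1)[0]
    # Require agreement from more than `min_agreement` of all members.
    if count / len(candidates) <= min_agreement:
        return ABSTAIN, []

    # Merge the citations of every member that gave the winning answer.
    cited = sorted({c for ans, cites in candidates if ans == answer for c in cites})
    return answer, cited


# Made-up example data: two of three members agree and cite sources.
agree = [("12 kWh", ["paper_a"]), ("12 kWh", ["paper_b"]), ("15 kWh", ["paper_c"])]
# Made-up example data: no cited answer reaches a majority.
split = [("12 kWh", []), (None, []), ("15 kWh", ["paper_c"])]

print(ensemble_vote(agree))
print(ensemble_vote(split))
```

Counting uncited answers as abstentions is one simple way to enforce citation discipline; a real system might instead verify each citation against the retrieved text.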

Finding Orchard View: The Orchard View room is room 3280, on the third floor of the Discovery Building. To get to the third floor, take the elevator located next to the Aldo’s Cafe kitchen (see photo).

Join ML+X / Share Your Work! This talk is part of a monthly forum hosted by the ML+X community at UW-Madison. Join our Google group to be notified of future events. Better yet, sign up to discuss your ML/AI work at the monthly forum! The majority of our attendees are applied practitioners from diverse fields, not AI/ML purists looking to critique. Contact endemann@wisc.edu if you have any questions.

Cost
Free
