MadSystems Seminar -- Roger Waleffe (NVIDIA)

Date

Tuesday, April 14, 2026

Time

4-5 p.m.

Location

Description

Title: NVIDIA Nemotron 3: Model Architecture Design and Pre-Training at Scale

Abstract:
In this talk, I will describe the design and pre-training of Nemotron 3, NVIDIA’s latest flagship large language models. I will first discuss the Nemotron model architecture (hybrid Mamba-Attention, LatentMoE, Multi-Token Prediction) as well as broader trends in modern LLMs. In particular, I will highlight that architectures are becoming sparser and more heterogeneous. I will then focus on pre-training these architectures at scale and discuss the Megatron-LM software stack and its core parallelism techniques for large-scale training (tensor, pipeline, and expert parallelism). Finally, I will highlight how these architecture trends are stressing the assumptions underlying today’s training infrastructure, opening interesting avenues for future model-system co-design.

Bio:
Roger Waleffe is a senior applied deep learning research scientist at NVIDIA. He works on pre-training of NVIDIA’s Nemotron models, with a specific focus on studying and developing efficient large language model architectures for training and inference. He holds a Ph.D. in Computer Science from the University of Wisconsin-Madison, where he worked with Theo Rekatsinas, Shivaram Venkataraman, and Steve Wright. His Ph.D. research focused on the intersection of systems and algorithmic challenges for resource-efficient training of large-scale ML models.

Website

https://madsystems.cs.wisc.edu/seminar.html

Cost

Free

Contact

608-867-6867, tomy.1516@gmail.com

Accessibility

We value inclusion and access for all participants and are pleased to provide reasonable accommodations for this event. Please call 608-867-6867 or email tomy.1516@gmail.com to make a disability-related accommodation request. Reasonable effort will be made to support your request.
