Demystifying the Inner Workings of Language Models
Sarah Wiegreffe
Time:
02.26.2025 11:00 to 12:00
Location:
Large language models (LLMs) power a rapidly growing and increasingly impactful suite of AI technologies. However, due to their scale and complexity, we lack a fundamental scientific understanding of much of LLMs’ behavior, even when they are open source. The “black-box” nature of LLMs not only complicates model debugging and evaluation, but also limits trust and usability. In this talk, I will describe how my research on interpretability (i.e., understanding models’ inner workings) has answered key scientific questions about how models operate. I will then demonstrate how deeper insights into LLMs’ behavior enable both 1) targeted performance improvements and 2) the production of transparent, trustworthy explanations for human users.