PhD Proposal: Improving Deep Networks by Learning from their Failures and Successes
IRB-3137
https://umd.zoom.us/j/3974377195
Over the past decade, Computer Vision has grown from an area of primarily academic research into a true part of people's everyday lives. But with its widespread use have come both successes and failures for deep models, ranging from highly powerful multi-purpose visual backbones to adversarial and backdoor attacks that jeopardize the use of deep models in critical scenarios. Looking ahead to the next decade, the use of Computer Vision systems in real-world applications will very likely continue to grow, so it is more important than ever to learn from both the failures and successes of deep learning models.
In this thesis, I aim to improve our understanding of the inner workings of deep learning models by examining them in several contexts, including backdoor attacks, self-supervised learning, and applications to a variety of tasks. My research begins in the space of adversarial and backdoor attacks, and then expands into more general network understanding and enhancement. In my further research, I place a particular emphasis on the recently popularized Vision Transformer (ViT) backbone, which is not as well understood as Convolutional Neural Networks (CNNs). This includes an in-depth examination of ViTs trained with different methods of supervision, and a new lightweight method to enhance any ViT's features for dense downstream tasks.