Visual Intelligence Online Seminar #91: Sparse Tokens and Patch Representations

Banner image: Torger Grytå / Petter Bjørklund

Presented by Martine Hjelkrem-Tan, PhD Research Fellow in the Digital Signal Processing and Image Analysis group at the University of Oslo

Abstract

In this talk, we discuss how sparsity and sampling can be used to improve modern vision models. First, we show that these principles can help reveal which regions of an image a model attends to when performing a given task. Second, we show that self-supervised foundation models exhibit unwanted positional noise in their patch tokens, and we propose a simple cleaning method. Finally, we discuss how these findings can help guide future frameworks for training vision foundation models.

When: 12 March 2026, 13:00–14:00
Where: Online
Location / Campus: Digital
Target group: Employees, Students, Guests
Contact: Petter Bjørklund
Phone: 90218792
E-mail: petter.bjorklund@uit.no
