PhD Defense: Vector Kernel Representation Theory, Applications to Point Clouds and Event Cameras

Talk
Dehao Yuan
Time: 03.13.2025 11:00 to 12:30
Location: IRB IRB-3137

Point set representation is a fundamental problem in fields such as point cloud processing and event-based vision. A point set is an unordered collection of points in Euclidean space, typically generated by sensing technologies like 3D scanners or event cameras. The goal of point set representation is to transform these unordered points into fixed-length vectors suitable for machine learning tasks. A key challenge in this process lies in handling point sets of varying sizes and uneven distributions. This requires representation methods that scale effectively to large point sets while remaining resilient to noise and variations in point density.

Existing point set representation methods generally fall into two categories: point-based and voxel-based approaches. Point-based methods, like PointNet and its variants, use multi-layer perceptrons to convert point clouds into fixed-length vectors. However, they often struggle with noise and variations in point density, especially when these variations were not seen during training. These methods can also be computationally expensive and do not scale well to large point sets, such as event streams from event cameras. Voxel-based methods, on the other hand, voxelize point sets into regular grid structures, allowing for convolutional operations. While methods like E-RAFT show promise, voxel-based approaches often suffer from performance drops when point set densities vary, and they generalize poorly across domains.

This thesis introduces Vector Kernel Representation (VKR), a novel framework for point set representation that addresses the limitations of existing methods. We first establish the Vector Kernel Representation Theorem, which proves that Gaussian kernel mixtures can be represented as fixed-length complex vectors through a predefined operation. Building on the theorem, VKR is applied to various tasks, demonstrating superior accuracy and computational efficiency. In point cloud encoding, VKR significantly improves performance and reduces computational costs. When extended to event-based normal flow estimation, VKR significantly improves accuracy and generalizability compared to voxel-based methods. The thesis concludes by highlighting potential future directions, including time series modeling, integrating VKR with temporal architectures for event-based tasks, and advancing moving object segmentation through robust normal flow estimation.
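
To make the idea of a fixed-length complex vector concrete, the sketch below encodes a point set by averaging complex exponentials over randomly drawn frequencies, in the style of standard random Fourier features. This is an illustrative assumption and not necessarily the operation defined in the thesis: the function name encode_point_set, the 64 frequencies, and the synthetic point clouds are all hypothetical. It only shows that such an encoding has the same length for any number of points and does not depend on point order.

```python
import numpy as np


def encode_point_set(points, freqs):
    """Map an (n, d) point set to a length-m complex vector.

    Entry k is the average of exp(i * <freqs[k], x>) over all points x,
    so the result depends neither on the point order nor on n.
    (Hypothetical helper for illustration, not the thesis's operation.)
    """
    phases = points @ freqs.T                  # shape (n, m)
    return np.exp(1j * phases).mean(axis=0)    # shape (m,)


rng = np.random.default_rng(0)

# Assumption: 64 frequencies drawn from a standard Gaussian in 3D.
freqs = rng.normal(size=(64, 3))

# Synthetic point sets of different sizes, plus a shuffled copy of the first.
cloud_a = rng.uniform(size=(500, 3))
cloud_b = cloud_a[rng.permutation(500)]
cloud_c = rng.uniform(size=(2000, 3))

vec_a = encode_point_set(cloud_a, freqs)
vec_b = encode_point_set(cloud_b, freqs)
vec_c = encode_point_set(cloud_c, freqs)
```

With Gaussian-distributed frequencies, the inner product of two such vectors approximates the average Gaussian kernel between the two point sets, which is one reason this kind of encoding pairs naturally with Gaussian kernel mixtures.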