Abstract
We investigate the feasibility of classifying UFO sightings in unsupervised and semi-supervised settings. Using data scraped from the National UFO Reporting Center website, we apply two methods: K-means clustering and self-training with SVM wrapper functions. Our results are twofold. On the negative side, K-means does not effectively cluster the sightings according to our projected labels. On the positive side, self-training is an effective method for classifying sightings: with a linear-SVM wrapper function, we achieve an accuracy of 43%.
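For readers unfamiliar with self-training, the following is a minimal sketch of the general idea, not the code from the report. It assumes a binary problem with labels in {-1, +1}, a small linear SVM trained by full-batch subgradient descent on the hinge loss, and a fixed confidence threshold on the decision score. The function names (`train_linear_svm`, `decision`, `self_train`), the hyperparameters, and the toy data are all hypothetical; the features, number of classes, and stopping rule used in the report are not stated on this page.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit (w, b) by full-batch subgradient descent on the
    L2-regularized hinge loss. Labels y must be -1 or +1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point violates the margin: hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision(model, x):
    """Signed score of x; its sign is the predicted label."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def self_train(X_lab, y_lab, X_unlab, threshold=1.0, max_rounds=10):
    """Self-training wrapper: repeatedly pseudo-label the unlabeled points
    the current model is confident about, then retrain on the enlarged set.
    The threshold value is an illustrative assumption."""
    X_lab, y_lab = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    pseudo = []  # (point, pseudo-label) pairs added along the way
    model = train_linear_svm(X_lab, y_lab)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            score = decision(model, x)
            if abs(score) >= threshold:
                confident.append((x, 1 if score > 0 else -1))
            else:
                rest.append(x)
        if not confident:  # nothing new to learn from: stop
            break
        for x, label in confident:
            X_lab.append(x)
            y_lab.append(label)
        pseudo.extend(confident)
        pool = rest
        model = train_linear_svm(X_lab, y_lab)
    return model, pseudo


# Toy data (hypothetical): two labeled points, three unlabeled ones.
X_labeled = [(-2.0, -2.0), (2.0, 2.0)]
y_labeled = [-1, 1]
X_unlabeled = [(1.5, 2.5), (-2.5, -1.5), (0.1, -0.1)]

model, pseudo = self_train(X_labeled, y_labeled, X_unlabeled)
```

In this sketch, only points whose score clears the threshold receive a pseudo-label; a point near the decision boundary stays in the unlabeled pool. A multi-class variant would need a one-vs-rest or similar scheme around the base classifier.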
Report
Data + Code
Data + Code [71M]
Movies
UFO sightings between 01/01/2000 and 08/31/2012, colored by shape [39M]
UFO sightings between 01/01/2000 and 08/31/2012, colored by classification (self-training with linear SVM) [44M]