Abstract
We investigate the feasibility of classifying UFO
sightings in unsupervised and semi-supervised settings. Using data scraped from
the National UFO Reporting Center website, we apply both K-means and
self-training with SVM wrapper functions. Our results are two-fold. On the
negative side, we show that K-means does not effectively cluster the sightings
according to our projected labels. On the positive side, we find that
self-training is an effective method for classifying sightings; we achieve an
accuracy of 43% when using a linear-SVM wrapper function.
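
To illustrate the self-training idea mentioned above, here is a minimal sketch rather than the code used in the project. It assumes a binary toy problem with labels +1/-1, a small hand-written linear SVM trained by sub-gradient descent, and a confidence threshold on the SVM score. The function names (`train_linear_svm`, `self_train`, `predict`), the data points, and every hyperparameter are hypothetical choices made for this example only.

```python
# Toy self-training loop around a linear SVM (pure Python, binary labels +1/-1).
# All names, data points, and hyperparameters are illustrative assumptions.

def decision(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=500, eta=0.01, lam=0.01):
    """Fit a linear SVM by sub-gradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # A point violates the margin if it is misclassified or too close.
            violated = yi * decision(w, b, xi) < 1
            w = [(1 - eta * lam) * wj for wj in w]  # L2 shrinkage step
            if violated:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def predict(model, x):
    """Return +1 or -1 for a single point."""
    w, b = model
    return 1 if decision(w, b, x) > 0 else -1


def self_train(X_lab, y_lab, X_unl, threshold=1.0, max_rounds=10):
    """Repeatedly train, pseudo-label confident unlabeled points, and retrain."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    model = None
    for _ in range(max_rounds):
        model = train_linear_svm(X_lab, y_lab)
        if not pool:
            break
        scores = [decision(model[0], model[1], x) for x in pool]
        # Keep only points whose score lies on or outside the margin.
        confident = {i for i, s in enumerate(scores) if abs(s) >= threshold}
        if not confident:
            break
        for i in sorted(confident):
            X_lab.append(pool[i])
            y_lab.append(1 if scores[i] > 0 else -1)
        pool = [x for i, x in enumerate(pool) if i not in confident]
    return model, X_lab, y_lab


# Two labeled points and four unlabeled points in two well-separated groups.
labeled_X = [(-2.0, -2.0), (2.0, 2.0)]
labeled_y = [-1, 1]
unlabeled_X = [(-3.0, -2.5), (-2.5, -3.0), (3.0, 2.5), (2.5, 3.0)]

model, X_all, y_all = self_train(labeled_X, labeled_y, unlabeled_X)
```

Here the wrapped classifier is retrained each round on the labeled set plus the pseudo-labeled points it was confident about. A multi-class setting such as sighting categories would need, for example, a one-vs-rest extension, which this sketch omits.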
Report
Report
Data + Code
Data + Code [71M]
Movies
UFO sightings between 01/01/2000 and 08/31/2012,
colored by shape [39M]
UFO sightings between 01/01/2000 and 08/31/2012,
colored by classification (self-learner with linear-SVM) [44M]