Your final project is to create a heatmap of UFO sightings. You will use data from the National UFO Reporting Center and the Google Maps API to do this.
The National UFO Reporting Center's database has an index of UFO sightings. For this project, you can use either all sightings or all North American sightings (i.e., excluding "UNSPECIFIED/INTERNATIONAL" sightings on the state list). Your first step will be to create a perl script that extracts the location for each sighting in the database.
IMPORTANT NOTE: be a responsible data user! DO NOT UNDER ANY CIRCUMSTANCES WRITE CODE THAT REPEATEDLY DOWNLOADS DATA FROM THEIR SITE. You should NOT have the "get URL" code in your test code. It is an abuse of their servers and is totally inappropriate. I will likely hear about it if you do this, so don't. There will be consequences. Instead, save a version of one of the web pages you need to parse and use that local version for all your testing. Only convert your code to access the online data once it is working perfectly on the local page.
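One way to follow this rule is to keep "read the page" separate from "parse the page", so that only the reading step ever needs to change. Here is a minimal sketch, assuming you saved one page in the current directory; the file name saved_page.html and the subroutine name read_local_page are made up for illustration.

```perl
use strict;
use warnings;

# Read an entire saved web page from the current directory into one string.
# Nothing here touches the network.
sub read_local_page {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
    local $/;               # slurp mode: read the whole file at once
    my $html = <$fh>;
    close($fh);
    return $html;
}

# Hypothetical usage; substitute the name of the page you saved:
# my $html = read_local_page('saved_page.html');
# ... run your parsing code on $html ...
```

Because your parsing code only ever sees a string, switching to the online data later means replacing this one subroutine call and nothing else.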
Many cities will appear multiple times. You do not want to list the cities over and over. Instead, store them in a hash and keep track of how many times you see them (increment a count for each time you encounter the city/state).
At the end of this step, you should have a list of city,state (or city,province or city,country if you use international locations) where there were sightings and a count of each time that city appeared. Use a tab to separate the city and the count. Your file should look like this:
Chicago,IL	5
Washington,DC	7
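The counting and output steps described above can be sketched as follows. This is only an illustration: the sample list stands in for whatever your parser extracts, and the subroutine names and the output file name sample_cities.txt are made up (your real output file must use the name given in the deliverable). Sorting the keys is my choice here so the output is predictable; the assignment does not require any particular order.

```perl
use strict;
use warnings;

# Count how many times each "City,ST" string occurs in a list of locations.
sub count_locations {
    my (@locations) = @_;
    my %count;
    $count{$_}++ for @locations;    # a missing key starts at 0 automatically
    return %count;
}

# Write one "City,ST<TAB>count" line per location, sorted by name.
sub write_counts {
    my ($filename, %count) = @_;
    open(my $out, '>', $filename) or die "Cannot write $filename: $!";
    for my $place (sort keys %count) {
        print $out "$place\t$count{$place}\n";
    }
    close($out);
}

# Made-up sample data, standing in for the locations your parser finds.
my @sample = ('Chicago,IL', 'Washington,DC', 'Chicago,IL');
my %seen = count_locations(@sample);
write_counts('sample_cities.txt', %seen);
```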
Deliverable: A perl file called lastname_firstname_final_1.pl that I can run by typing "perl FILENAME". It should output a file called lastname_firstname_cities.txt that contains one city/state (or city,province or city,country if you use international locations) with the corresponding count on each line.
Here's the sample code from class
To do this (and the next step), you will need an API Key. To get that, go to https://developers.google.com/maps/documentation/javascript/heatmaplayer. At the top of the page, click "Get A Key". Click "Create New Project" in the pull-down menu and give your project a name. Then click "Create and Enable API". This will generate a key for you that you can use in the places Google indicates you need to include YOUR_API_KEY.
At the end of this step, you should produce a list of latitudes and longitudes and a corresponding count of UFO sightings.
Deliverable: A perl file called lastname_firstname_final_2.pl that I can run by typing "perl FILENAME". It should open your file lastname_firstname_cities.txt from the current directory (DO NOT put a full path to the file in your code. Just use the file name so it will work on my system). It should output a file lastname_firstname_latlon.txt that has one latitude/longitude pair on each line with its corresponding count that matches with the city/state on each line of your lastname_firstname_cities.txt file.
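To give you a sense of the pieces involved in this step, here is a hedged sketch. It assumes the lookup is done with the Google Geocoding web service, which returns JSON with the coordinates under results, geometry, location. The subroutine names are made up, the download itself is left out (LWP::Simple's get is one option), and the "lat,lng" output format is my assumption, not a requirement. The sample response is hand-written and trimmed for illustration, not real output.

```perl
use strict;
use warnings;
use JSON::PP;    # ships with perl 5.14 and later

# Build a Geocoding request URL for one "City,ST" string.
sub geocode_url {
    my ($place, $key) = @_;
    # Percent-encode every byte that is not an unreserved URL character.
    (my $escaped = $place) =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return "https://maps.googleapis.com/maps/api/geocode/json?address=$escaped&key=$key";
}

# Pull "lat,lng" out of a Geocoding JSON response.
# Returns undef when the service reports no usable result.
sub extract_latlng {
    my ($json_text) = @_;
    my $data = decode_json($json_text);
    return unless ($data->{status} // '') eq 'OK';
    my @results = @{ $data->{results} || [] };
    return unless @results;
    my $loc = $results[0]{geometry}{location};
    return "$loc->{lat},$loc->{lng}";
}

# Hand-written sample response (illustrative values only).
my $sample_json =
    '{"status":"OK","results":[{"geometry":{"location":{"lat":41.5,"lng":-87.25}}}]}';
print geocode_url('Chicago,IL', 'YOUR_API_KEY'), "\n";
print extract_latlng($sample_json), "\n";
```

As with the first step, test extract_latlng on a saved response before you make any live requests, and expect some cities to return no result.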
At the end of this, you should have an HTML document that I can open and see a heatmap of UFO sites.
Deliverable: A perl file called lastname_firstname_final_3.pl that I can run by typing "perl FILENAME". It should open your file lastname_firstname_latlon.txt from the current directory (DO NOT put a full path to the file in your code. Just use the file name so it will work on my system). It should output a file called lastname_firstname.html. When I open the html file, it should show me a heatmap of the UFO sightings.
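The core of this step is turning each line of your latitude/longitude file into a weighted point for the Google Maps heatmap layer. The sketch below assumes each line looks like "lat,lng", a tab, then the count (the assignment does not fix this format, so adjust to match yours). The subroutine names, the map center, and the zoom level are arbitrary choices for illustration.

```perl
use strict;
use warnings;

# Turn "lat,lng<TAB>count" lines into JavaScript weighted-location entries.
sub heatmap_points {
    my (@lines) = @_;
    my @points;
    for my $line (@lines) {
        chomp $line;
        my ($latlng, $count) = split /\t/, $line;
        next unless defined $count;          # skip malformed lines
        my ($lat, $lng) = split /,/, $latlng;
        push @points,
            "{location: new google.maps.LatLng($lat, $lng), weight: $count}";
    }
    return join(",\n          ", @points);
}

# Wrap the points in a minimal HTML page that loads the visualization library.
sub heatmap_page {
    my ($points, $key) = @_;
    return <<"HTML";
<!DOCTYPE html>
<html>
<head>
  <style>html, body, #map { height: 100%; margin: 0; }</style>
</head>
<body>
  <div id="map"></div>
  <script>
    function initMap() {
      var map = new google.maps.Map(document.getElementById('map'), {
        zoom: 4,
        center: {lat: 39.5, lng: -98.35}
      });
      new google.maps.visualization.HeatmapLayer({
        data: [
          $points
        ],
        map: map
      });
    }
  </script>
  <script async defer
    src="https://maps.googleapis.com/maps/api/js?key=$key&libraries=visualization&callback=initMap"></script>
</body>
</html>
HTML
}

# Hypothetical usage with one made-up line:
# print heatmap_page(heatmap_points("41.5,-87.25\t5\n"), 'YOUR_API_KEY');
```

Using the weight field means each location appears once in the page no matter how many sightings it had, which keeps the HTML file small.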
If you have another UFO related topic you'd like to explore, let me know.
Note: you do not have to believe aliens are visiting us for this project. This is about exploring a semi-structured dataset, and the presentations augment the numeric data that we are processing in perl with a deeper content-based exploration. While the topic here is admittedly a bit silly, this process is representative of how you should do most analysis. We could, for example, be extracting information from the Enron email dataset and plotting that in a visualization system, followed by reading emails and processing the content (this is, in fact, the subject of many academic papers and a big assignment I give in my network analysis class). In other words: don't be fooled by this whimsical topic. This final is giving you practice with all stages of web data processing with code, plus the deeper contextual analysis you need to be a thoughtful analyst.
You will be graded on how interesting your presentation is, so practice and make it fun. Feel free to incorporate existing video clips, photos, and reports (but don't just show us a 5-minute YouTube video someone else made!).