Group 10
Amer Samad | Alejandro Velazquez | Konard Biegaj | Omar Al-Khatib | Jake Arcivar
We used data from the Bureau of Transportation Statistics
for our project. The site provides aviation data for all airports in
the US. We had to download a separate file for each month, with the following
fields:
FlightDate |
Carrier |
FlightNum |
OriginAirportID |
OriginCityName |
DestAirportID |
DestCityName |
DepTime (local time) |
ArrTime (local time) |
Cancelled |
Cancellation Code |
AirTime |
Distance |
5 causes of delay
After downloading the files, we converted all of them from CSV to TSV format,
since the raw data came as CSV and TSV files are easier to parse. We used
Microsoft Excel for the conversion.
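The conversion itself was done in Excel, but the same step could also be scripted. The following is a minimal sketch in Python using only the standard csv module, not the method we actually used. The function name csv_to_tsv and the sample row are made up for illustration. It assumes that some fields, such as city names, may contain commas inside quotes, which is one reason tab-separated output can be simpler to split.

```python
import csv
import io


def csv_to_tsv(csv_text: str) -> str:
    """Convert CSV text to TSV text (hypothetical helper, not the Excel step)."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    # Write tab-separated rows; fields are only quoted if they contain
    # a tab, a quote character, or a line break.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row: the quoted city name contains a comma.
sample = 'FlightDate,Carrier,OriginCityName\n2015-01-01,AA,"Chicago, IL"\n'
tsv_text = csv_to_tsv(sample)
print(tsv_text)
```

For a monthly file on disk, the same reader and writer objects could be wrapped around files opened with newline="" instead of the in-memory strings used here.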
After parsing the files, we had to do some cleanup on the data,
such as converting factors to dates and replacing airport IDs with airport names.
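The two cleanup steps named above can be sketched as follows. This is an illustration under stated assumptions, not our actual cleanup code: it assumes FlightDate is stored as text in YYYY-MM-DD form, and the lookup table AIRPORT_NAMES, its placeholder IDs and names, and the helper clean_row are all hypothetical.

```python
from datetime import date, datetime

# Hypothetical lookup table with placeholder IDs and names; a real one
# would be built from an airport ID lookup file.
AIRPORT_NAMES = {
    "00001": "Example Airport A",
    "00002": "Example Airport B",
}


def clean_row(row: dict, airport_names: dict) -> dict:
    """Return a copy of row with a parsed date and airport names filled in."""
    cleaned = dict(row)
    # Assumption: dates are text in YYYY-MM-DD form.
    cleaned["FlightDate"] = datetime.strptime(row["FlightDate"], "%Y-%m-%d").date()
    # Replace each airport ID with its name; keep the ID if it is unknown.
    for id_field, name_field in (
        ("OriginAirportID", "OriginAirport"),
        ("DestAirportID", "DestAirport"),
    ):
        airport_id = cleaned.pop(id_field)
        cleaned[name_field] = airport_names.get(airport_id, airport_id)
    return cleaned


# Made-up sample record; "99999" is deliberately missing from the lookup.
raw = {"FlightDate": "2015-01-01", "OriginAirportID": "00001", "DestAirportID": "99999"}
cleaned = clean_row(raw, AIRPORT_NAMES)
print(cleaned)
```

Falling back to the raw ID when no name is found keeps unmatched rows visible instead of silently dropping them.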