Inspiration and Software
I stumbled across Jake VanderPlas's blog post Analyzing Pronto CycleShare Data with Python and Pandas and thought it would be interesting to do something similar with the data from Washington DC's Capital Bikeshare. I'll also use Python, NumPy, and pandas, but I've opted for plot.ly to generate the graphs. I think plotly has some nice defaults, and I like the interactive plots it generates, as opposed to the static images that matplotlib and seaborn make. Another option could be bokeh by the PyData folks or, if you really want to get your hands dirty, D3.js.
Capital Bikeshare has tons of data going back to 2010. I'm starting here with the first data available, the last quarter of 2010. In the future I plan to work with the data on a yearly basis and do some year-over-year comparisons. Unfortunately, the formatting of the data that Capital Bikeshare provides is not consistent from file to file, so it takes a little more work than just loading a different CSV file.
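To give a rough idea of what that extra work can look like, here is a minimal sketch that normalizes column headers before combining files. The header variants and sample rows below are made up for illustration and are not taken from the actual Capital Bikeshare files, and `normalize_columns` is a hypothetical helper, not something used later in this post.

```python
import io
import re

import pandas as pd


def normalize_columns(df):
    """Return a copy of df with lowercase, snake_case column names."""
    renamed = {
        col: re.sub(r"[^0-9a-z]+", "_", col.strip().lower()).strip("_")
        for col in df.columns
    }
    return df.rename(columns=renamed)


# Two made-up extracts whose headers differ only in capitalization and spacing.
# (Hypothetical data, standing in for two differently formatted CSV files.)
quarter_a = io.StringIO(
    "Start date,Start Station,Bike#\n"
    "2010-10-01 08:00,Station A,W00001\n"
)
quarter_b = io.StringIO(
    "Start Date,Start station,Bike #\n"
    "2011-01-01 09:30,Station B,W00002\n"
)

# Normalize each file's headers first so the frames line up when concatenated.
frames = [normalize_columns(pd.read_csv(f)) for f in (quarter_a, quarter_b)]
trips = pd.concat(frames, ignore_index=True)

print(trips.columns.tolist())
```

Normalizing the headers up front means the rest of the analysis can refer to a single set of column names, whichever file a row came from. Real files may differ in more than their headers (date formats, extra columns), which this sketch does not handle.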
import pandas as pd
import re
from datetime import timedelta
import plotly
import xml.etree.ElementTree as et
from math import sin, cos, sqrt, atan2, radians
from geopy.distance import distance
from plotly.graph_objs import Data, Scatter, Layout, Figure, Histogram
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
from IPython.core.display import display, HTML
import gmaps
from IPython.display import Image
import plotly.plotly as py
plotly.offline.init_notebook_mode()