Inspiration and Software

I stumbled across Jake VanderPlas's blog post "Analyzing Pronto CycleShare Data with Python and Pandas" and thought it could be interesting to do something similar for the data from Washington DC's Capital Bikeshare. I'll also use Python, numpy, and pandas, but I've opted for plotly to generate the graphs. I think plotly has some nice defaults, and I like the interactive plots that it generates as opposed to the static images that matplotlib and seaborn make. Another option could be bokeh by the PyData folks or, if you really want to get your hands dirty, D3.js.

Capital Bikeshare has tons of data going back to 2010. I'm starting here with the first data available, the last quarter of 2010. In the future I'm going to look at the data a year at a time and also do some year-over-year comparisons. Unfortunately, the formatting of the data that Capital Bikeshare provides is not consistent from file to file, so it is a little more work than just loading a different CSV file.
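To give a rough idea of what that extra work can look like, here is a minimal sketch of mapping differently named headers onto one shared set of column names before combining files. The header variants, the `COLUMN_ALIASES` table, the `normalize_columns` helper, and the sample rows are all made up for illustration; the real Capital Bikeshare files may differ in other ways.

```python
import io

import pandas as pd

# Hypothetical header variants mapped to one shared name.
# These are assumptions for illustration, not the actual file headers.
COLUMN_ALIASES = {
    "start date": "start_date",
    "start time": "start_date",
    "end date": "end_date",
    "end time": "end_date",
}


def normalize_columns(df):
    """Return a copy of df with trimmed, lowercased, aliased column names."""
    renamed = {}
    for col in df.columns:
        key = str(col).strip().lower()
        # Fall back to a snake_case version of the header if no alias is known.
        renamed[col] = COLUMN_ALIASES.get(key, key.replace(" ", "_"))
    return df.rename(columns=renamed)


# Two tiny in-memory "files" with made-up rows and inconsistent headers.
file_a = io.StringIO("Duration,Start date,End date\n600,2010-10-01 08:00,2010-10-01 08:10\n")
file_b = io.StringIO("Duration, Start time ,End time\n300,2011-01-01 09:00,2011-01-01 09:05\n")

frames = [normalize_columns(pd.read_csv(f)) for f in (file_a, file_b)]
combined = pd.concat(frames, ignore_index=True)
print(combined.columns.tolist())
```

Once every file shares the same column names, `pd.concat` stacks them cleanly instead of producing a frame full of mismatched, mostly empty columns.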

In [1]:
import pandas as pd
import re
from datetime import timedelta
import plotly
import xml.etree.ElementTree as et
from math import sin, cos, sqrt, atan2, radians
from geopy.distance import distance
from plotly.graph_objs import Data, Scatter, Layout, Figure, Histogram
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
from IPython.core.display import display, HTML
import gmaps
from IPython.display import Image