Hubway Station Metrics
In 2014, Boston held a data visualization challenge that asked participants to look at ridership statistics in new and exciting ways. The city wanted to know what insight could be gained by crowdsourcing data scientists to examine the activity of its fleet of ride-share bikes on the Hubway network.
Hubway published the data from its first three years of operation. The release covered every single instance of a user taking a bike from Station A to Station B, and it also included user information on registered riders.
I approached this project with the aim of building a station statistics tool: an interface that lets station operators and riders examine the usage of a given station. The app provides insight into which types of users frequent the station, the amount of traffic throughout the day, and where users are coming from or going to.
Because cycling is a seasonal activity, I limited the data to rides taken in 2012 so that the analysis covers one full year of ridership. I also removed all trips shorter than a few minutes, as these likely indicate false starts.
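The filtering step can be sketched in a few lines. This is a minimal Python illustration, not the code behind the app (which is built in Shiny). The field names `start` and `duration_s` are hypothetical, and the 180-second cut-off is only a placeholder for "a few minutes", since the exact threshold is not stated here.

```python
from datetime import datetime

# Hypothetical trip records; field names are assumptions, not Hubway's schema.
TRIPS = [
    {"start": datetime(2012, 6, 1, 8, 5), "duration_s": 600},
    {"start": datetime(2012, 6, 1, 8, 7), "duration_s": 45},     # likely a false start
    {"start": datetime(2011, 9, 3, 17, 30), "duration_s": 900},  # outside 2012
]


def filter_trips(trips, year=2012, min_duration_s=180):
    """Keep trips that start in `year` and last at least `min_duration_s` seconds.

    The default cut-off is an illustrative assumption, not the author's value.
    """
    return [
        t for t in trips
        if t["start"].year == year and t["duration_s"] >= min_duration_s
    ]


kept = filter_trips(TRIPS)
print(len(kept), "of", len(TRIPS), "trips kept")
```

Keeping the year and the threshold as parameters makes it easy to check how sensitive the later charts are to either choice.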
I built the tool as an interactive Shiny app. It lets you select any station on the network and immediately see graphs showing the net traffic through the day, the portion of riders that are commuter or casual, and the gender ratio of riders.
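As a rough illustration of the per-station summaries, the Python sketch below computes two of them. It assumes that "net traffic" means arrivals minus departures in each hour, which is one plausible reading rather than a definition taken from the app. The record layout and the `rider` labels are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records; field names and rider labels are assumptions.
TRIPS = [
    {"start_station": "A", "end_station": "B", "rider": "commuter",
     "start": datetime(2012, 6, 1, 8, 5), "end": datetime(2012, 6, 1, 8, 20)},
    {"start_station": "B", "end_station": "A", "rider": "commuter",
     "start": datetime(2012, 6, 1, 8, 10), "end": datetime(2012, 6, 1, 8, 30)},
    {"start_station": "C", "end_station": "A", "rider": "casual",
     "start": datetime(2012, 6, 1, 8, 40), "end": datetime(2012, 6, 1, 8, 55)},
    {"start_station": "A", "end_station": "C", "rider": "commuter",
     "start": datetime(2012, 6, 1, 18, 0), "end": datetime(2012, 6, 1, 18, 15)},
]


def net_traffic_by_hour(trips, station):
    """Arrivals minus departures at `station`, keyed by hour of day."""
    net = Counter()
    for t in trips:
        if t["end_station"] == station:
            net[t["end"].hour] += 1    # a bike arrives
        if t["start_station"] == station:
            net[t["start"].hour] -= 1  # a bike leaves
    return dict(net)


def rider_share(trips, station):
    """Fraction of trips touching `station`, broken down by rider type."""
    counts = Counter(
        t["rider"] for t in trips
        if station in (t["start_station"], t["end_station"])
    )
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()} if total else {}


print(net_traffic_by_hour(TRIPS, "A"))
print(rider_share(TRIPS, "A"))
```

Under this reading, a positive value means the station gains bikes in that hour and a negative value means it loses them.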
Users can also toggle the time period to examine: hours of the day, days of the week, or months of the year. This makes differences in peak activity visible. For example, it confirms that the most active months are in the summer and that the most active hours are 8 am and 6 pm. You can also toggle whether to include weekends. Because commuter traffic drops sharply on weekends, this toggle can reveal rider demographics and activity patterns that would otherwise be lost in the noise.
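The period and weekend toggles amount to grouping trip start times by a different key. Here is a minimal Python sketch of that idea. The function name `activity`, its parameters, and the sample timestamps are all hypothetical and are not taken from the Shiny app.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip start times (1 June 2012 was a Friday).
STARTS = [
    datetime(2012, 6, 1, 8, 5),    # Friday
    datetime(2012, 6, 1, 18, 10),  # Friday
    datetime(2012, 6, 2, 13, 0),   # Saturday
    datetime(2012, 6, 4, 8, 30),   # Monday
    datetime(2012, 1, 9, 8, 15),   # Monday
]

# One grouping key per selectable period.
PERIOD_KEYS = {
    "hour": lambda d: d.hour,
    "weekday": lambda d: d.weekday(),  # Monday is 0, Sunday is 6
    "month": lambda d: d.month,
}


def activity(starts, period="hour", include_weekends=True):
    """Count trips per period, optionally dropping Saturday and Sunday."""
    key = PERIOD_KEYS[period]
    return Counter(
        key(d) for d in starts
        if include_weekends or d.weekday() < 5
    )


print(activity(STARTS, "hour"))
print(activity(STARTS, "hour", include_weekends=False))
print(activity(STARTS, "month"))
```

Treating the period as a lookup key keeps the three views consistent, because they all pass through the same weekend filter.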
The map also updates to show the top 10 stations where trips to this station start and the top 10 stations where trips from this station end. Paths arriving at the station are drawn in blue, and paths leaving it are drawn in red. This can be used to identify sister stations.
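The two top-10 lists can be derived by counting origin and destination stations separately. The Python sketch below shows one way to do that, assuming trips are reduced to hypothetical (start station, end station) pairs and that round trips back to the same station are left out of the ranking. Neither assumption is confirmed by the app.

```python
from collections import Counter

# Hypothetical (start_station, end_station) pairs.
TRIPS = [
    ("A", "B"), ("A", "B"), ("A", "C"),
    ("B", "A"),
    ("C", "A"), ("C", "A"), ("C", "A"),
    ("A", "A"),  # round trip, ignored below
]


def top_connections(trips, station, n=10):
    """Return the (origins, destinations) most often linked to `station`.

    Each list holds (station, trip_count) pairs, busiest first.
    """
    origins = Counter(
        s for s, e in trips if e == station and s != station
    )
    destinations = Counter(
        e for s, e in trips if s == station and e != station
    )
    return origins.most_common(n), destinations.most_common(n)


arriving, leaving = top_connections(TRIPS, "A")
print("arriving from:", arriving)  # would be drawn in blue
print("leaving to:", leaving)      # would be drawn in red
```

A station that ranks high in both lists is a natural candidate for a sister station.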
Ideally, these tools would be used to identify stations that have historically been underused or underserviced, along with rider demographics and peak activity times. Station operators can use the app to determine when and where to rebalance the network, and how to advertise to riders based on need. For example, stations with heavy casual usage are likely major tourist locations, which can be tapped for advertising local shows or events that commuters are less interested in.
If you are interested in this project or others like it, please visit my GitHub.