A Visualization with Citibike Data

Posted on Nov 4, 2019


Bike-shares have become popular in major cities across the US, and Citibike in NYC is among the most popular. It's a convenient and fun alternative to taking the bus or subway.

For this Shiny project, I wanted to use Citibike's public data to bring this information to life through visualization. I was curious about user activity at bike stations during a chosen hour, shown as a heat map. I also wondered what journey a given bike takes over the course of a chosen day.


Application and Results:

The app has three tabs: density map, trip map, and data. The density map tab contains two displays, a map of NYC on the left and a histogram on the right, plus two selection boxes for choosing the day and hour of interest. The map shows the number of users who checked out bikes from each station within the selected hour. Each station is drawn as a purple cloud; the darker and larger the cloud, the more users were active at that station. The histogram shows the number of users in each hour of the selected day.
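The two displays boil down to two counts: checkouts per station within one hour, and checkouts per hour across one day. The app itself is written in Shiny, so the Python below is only a minimal sketch of that aggregation, not the app's code. The field names `starttime` and `start station name` are assumed to match the 2019 Citibike trip files, and the sample rows are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Made-up sample rows for illustration only. The keys are assumed to
# match the column headers of the 2019 Citibike trip CSV files.
SAMPLE_TRIPS = [
    {"starttime": "2019-09-10 17:05:12.0000", "start station name": "Station A"},
    {"starttime": "2019-09-10 17:40:00.0000", "start station name": "Station A"},
    {"starttime": "2019-09-10 17:59:59.1230", "start station name": "Station B"},
    {"starttime": "2019-09-10 08:15:30.0000", "start station name": "Station B"},
    {"starttime": "2019-09-11 17:00:00.0000", "start station name": "Station A"},
]


def parse_start(trip):
    """Parse the trip start time, ignoring any fractional seconds."""
    return datetime.strptime(trip["starttime"][:19], "%Y-%m-%d %H:%M:%S")


def checkouts_by_station(trips, day, hour):
    """Count checkouts per start station for one day ('YYYY-MM-DD') and hour (0-23)."""
    counts = Counter()
    for trip in trips:
        start = parse_start(trip)
        if start.date().isoformat() == day and start.hour == hour:
            counts[trip["start station name"]] += 1
    return counts


def checkouts_by_hour(trips, day):
    """Return a 24-element list of checkout counts, one per hour of the day."""
    counts = Counter()
    for trip in trips:
        start = parse_start(trip)
        if start.date().isoformat() == day:
            counts[start.hour] += 1
    return [counts.get(hour, 0) for hour in range(24)]
```

The per-station counts would drive the size and darkness of each cloud on the map, and the 24 hourly counts would be the bar heights of the histogram.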


Figure 1: Density map

The histogram shows two peaks, one in the morning and one in the evening. This is not surprising given that the chosen date is a typical weekday in the mild month of September. Notice also that at the chosen hour, 5 pm, the density map is saturated in Manhattan but sparse in Brooklyn and Queens. This suggests that most users at that hour are leaving work in Manhattan.


The next tab is the trip map. It contains a single map display and two selectors for choosing a bike ID and a date. With these inputs selected, the map displays that bike's entire journey for that day across all of its users. The red marker is the starting point and the blue marker is the final end point. A path is drawn darker when multiple trips cover it and lighter when only a single trip does. Note that each path shown is the suggested bicycling route from Google Directions: it is the most likely route, not necessarily the one each user actually took.
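The journey shown on this tab can be thought of as a filter, a sort, and a count. The Python below is a rough sketch of that logic under stated assumptions rather than the app's actual Shiny code: the field names (`bikeid`, `starttime`, `start station name`, `end station name`) are assumed to follow the 2019 Citibike trip files, the sample rows are invented, and counting repeated trips per directed station pair is just one plausible way to decide how dark a path is drawn.

```python
from collections import Counter

# Invented sample rows for illustration only; keys are assumed to match
# the column headers of the 2019 Citibike trip CSV files.
SAMPLE_TRIPS = [
    {"bikeid": "12345", "starttime": "2019-09-10 17:10:00",
     "start station name": "Station A", "end station name": "Station B"},
    {"bikeid": "12345", "starttime": "2019-09-10 08:00:00",
     "start station name": "Station A", "end station name": "Station B"},
    {"bikeid": "99999", "starttime": "2019-09-10 09:00:00",
     "start station name": "Station C", "end station name": "Station A"},
    {"bikeid": "12345", "starttime": "2019-09-10 12:30:00",
     "start station name": "Station B", "end station name": "Station A"},
    {"bikeid": "12345", "starttime": "2019-09-11 07:45:00",
     "start station name": "Station A", "end station name": "Station C"},
]


def bike_journey(trips, bike_id, day):
    """Return one bike's trips on a given day ('YYYY-MM-DD'), in time order."""
    legs = [
        trip for trip in trips
        if trip["bikeid"] == bike_id and trip["starttime"].startswith(day)
    ]
    # Timestamps in 'YYYY-MM-DD HH:MM:SS' form sort correctly as strings.
    legs.sort(key=lambda trip: trip["starttime"])
    return legs


def segment_weights(legs):
    """Count how many trips cover each (start, end) station pair."""
    return Counter(
        (trip["start station name"], trip["end station name"]) for trip in legs
    )


journey = bike_journey(SAMPLE_TRIPS, "12345", "2019-09-10")
first_start = journey[0]["start station name"]   # would get the red marker
final_end = journey[-1]["end station name"]      # would get the blue marker
weights = segment_weights(journey)               # higher count, darker path
```

Each station pair would then be sent to a routing service to obtain the suggested cycling path, with the count controlling how dark that path is drawn.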

Figure 2: Trip map


Future work:

This app is convenient and easy to use, but there is plenty of room for expansion. In a future version, the user will be able to see graphically the range a specific bike traveled over a selected period of time, not just a single day. This could be displayed as a histogram or on the trip map, and would show whether bikes generally stay in one area or circulate through different boroughs.
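One way to put a number on "range" is the largest straight-line distance between any two stations a bike visited in the selected period. That definition is an assumption made here for illustration, not something the app implements. The sketch below uses the haversine formula for great-circle distance and takes station coordinates as (latitude, longitude) pairs in degrees.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def bike_range_km(stations):
    """Largest pairwise distance among the stations a bike visited.

    'Range' is defined this way only for the purpose of this sketch.
    Returns 0.0 when fewer than two distinct stations were visited.
    """
    unique = set(stations)
    return max(
        (haversine_km(a, b) for a, b in combinations(unique, 2)),
        default=0.0,
    )
```

A bike whose range stays small over several weeks is mostly being ridden within one neighborhood, while a large range points to trips that cross between boroughs.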

Another feature to add is a play button that animates the user density map through the hours of the day.

Additionally, this app could be scaled up to bike-shares in other cities. This is feasible because most other bike-shares publish data sets with similar information and formatting.


Thanks for reading!
