Bike New York City Bike Share System by Sylvie
The skills we demoed here can be learned through taking Data Science with Machine Learning bootcamp with NYC Data Science Academy.
Please check my work on the New York City bike share system here (it is decomposed in 3 posts: http://rpubs.com/Sylvie/
And my Github is here: https://github.com/slardeux
Work by Sylvie Lardeux, Student of Feb - April Bootcamp, 2015
The best way to illustrate whether or not at least a bike would be in a given bike station in the New York city bike share system is through a shiny app. I took the GBM model created previously and imported it in this shiny app to predict whether or not a bike will be at a station at hte date and time inputed by the user.
I used the leaflet-shiny package created by Joe Cheng to create the map.
Shiny app ui.R
library(shiny)
library(leaflet)
library(ShinyDash)
library(markdown)
shinyUI(navbarPage(title = "City Bike NYC",
collapsible = TRUE,
windowTitle <- 'CityBike',
tabPanel("About",
fluidRow(
column(6,offset = 3,
includeMarkdown("doc/about.md"))
)),
tabPanel("Map",
column(3,
wellPanel(
dateInput('date', label = 'Select a date'),
selectInput('hour', label = 'Select an hour',
choices = c('1','2',"3",'4','5',"6",'7','8','9','10','11','12','13','14','15','16','17','18','19',"20",'21','22','23', '0'),
selected = '12', width = '100%'),
sliderInput('time', label = 'Select a time range',
min = 0, max = 60, value = c(0,10), step = 10)
)
),
column(9,
leafletMap(
"map", "100%", 800,
initialTileLayer = "//{s}.tiles.mapbox.com/v3/slardeux.lda667h9/{z}/{x}/{y}.png",
initialTileLayerAttribution = HTML('Maps by <a href="http://www.mapbox.com/">Mapbox</a>'),
options=list(center = c(40.736, -73.99), zoom = 14)
)
)
) #end tabPanel
)
)
When the user select a date and time, the app get them and use the get_data.f function to find the prediction. This function calls the get_bin.f function and the get_all.f function (in the case when more than 1 10 minute bin was selected). These function return a data set composed of all the bike station ID, the date and the time; this data set will be use for prediction.
The get_data.f use the monthFit model obtained by running the gbm previously to predict the new data set. Finally, I created a map and added circle to each station location with a color green if the model predict that there is at least a bike and red if not.
Shiny app ui.R
library(shiny)
library(maps)
library(lubridate)
library(gbm)
station <- read.csv('data/station.csv')
station <- apply(station, 2, as.numeric)
station <- as.data.frame(station)
load('data/monthFit')
###########################################################################################################
## Create the data to predict
#########################################################################################################
get_bin.f <- function(df, date, hour, rng){
newd <- data.frame(month = month(date), hour = as.numeric(hour), dayofweek = format(date, '%a'), min_block = rng)
newd$weekend <- ifelse(newd$dayofweek %in% c('Sat', 'Sun'), 1, 0)
newd$rush <- ifelse(newd$hour %in% c(7,8,9, 17, 18, 19) & newd$weekend == 0, 1, 0)
newd$night <- ifelse(newd$hour %in% c(21:23,0:6), 1, 0)
newd <- newd[,c(1:3,6:7,5,4)]
newdata <- data.frame(endid = df$id, newd)
return(newdata)
}
get_all.f <- function(df, date, hour, rng, tm, len){
s <- seq(tm[1], tm[2], 10)
l <- list()
for(i in 1:len){
r <- paste0('X', s[i], '.', s[i+1])
l[[i]] <- get_bin.f(df, date, hour, r)
}
df <- data.frame(do.call(rbind, l))
return(df)
}
##################################################################################
# Main function to get the data to predict and the prediciton to plot
#####################################################################################
get_data.f <- function(df, date, hour, tm){
nbin <- (tm[2] - tm[1])/10
rng <- paste0('X',tm[1], '.', tm[2])
if(nbin == 1){
newdf <- get_bin.f(df, date, hour, rng)
}else{
newdf <- get_all.f(df, date, hour, rng,tm, nbin)
}
mn <- as.numeric(month(date))
fit <- monthFit[[mn]]
p <- predict(fit, newdata = newdf, n.trees = 500, type = "response")
pred <- ifelse(p > .5, 1, 0)
res <- data.frame(station, p = pred)
return(res)
}
shinyServer(function(input, output, session) {
####################################################################################
## create map
#####################################################################################
map <- createLeafletMap(session, 'map')
add_circle.f <- function(df, col){
map$addCircle(
df$lat,
df$lon,
50,
row.names(df),
list(
weight=1.2,
fill=TRUE,
color= col,
fillOpacity = 0.5
)
)
}
observe({
map$clearShapes()
stat <- get_data.f(station, input$date, input$hour, input$time)
if (nrow(stat) == 0)
return()
#add circle on the map
stat1 <- stat[which(stat$p == 1),]
add_circle.f(stat1, '#00FCA0')
stat0 <- stat[which(stat$p == 0),]
add_circle.f(stat0, '#FC0000')
})
})
The app has also an about page (a markdown document) to explain how to use it and where it came from. The user needs to first chose a date/time before any point are drawn on the map.