Filming Locations around New York City - Visualization using Shiny Dashboard

Posted on Feb 2, 2017

Introduction

Many movies have been filmed in New York City, but it is difficult to get a sense of where in the city their scenes were shot. I used a dataset based on the book Scenes from the City by James Sanders, available on NYC Open Data, to create a navigable map that displays this information. The dataset includes coordinates, allowing me to pinpoint exact locations. My goal was to visualize where in the city movies were filmed and to provide additional information from IMDB, such as ratings, poster images, and direct IMDB links. For this project I used R to clean and aggregate the data and Shiny to visualize it.

 

Datasets

Film Locations

Film scene locations for movies filmed in New York were obtained from the NYC Open Data website: https://data.cityofnewyork.us/Business/Filming-Locations-Scenes-from-the-City-/qb3k-n8mm

This dataset is based on the book Scenes from the City by James Sanders: https://www.amazon.com/Scenes-City-Filmmaking-New-York/dp/0847828905

This dataset does not include every movie filmed in New York (a complete list would be difficult, yet interesting, to collect!) and does not include movies filmed after 2006. I hope to address this limitation in the future, but it is important to keep in mind when using the current application.

 

Kaggle IMDB-5000

This dataset was created by Chuan Sun, who scraped it from the IMDB website. It provided ratings for some of the movies in the film locations dataset: https://www.kaggle.com/deepmatrix/imdb-5000-movie-dataset

 

ggplot2movies

I also installed the ggplot2movies package, which gave me access to its movies dataset. This provided additional IMDB ratings that were not present in the Kaggle dataset.

 

Data Aggregation and Manipulation

Data Cleaning

In order to join the three datasets, the Kaggle IMDB dataset required minor cleaning. First, I converted the titles from factors to strings. The titles still had trailing white space, which would have prevented them from matching when joining the datasets, so I removed it. In the film locations dataset, the column "Year" also had to be converted to an integer.
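The cleaning steps described above can be sketched in base R. This is only an illustration on made-up data: the data frames, the column names (`movie_title`, `imdb_score`, `Film`) and the values are assumptions, not the actual contents of the datasets.

```r
# Toy stand-in for the Kaggle IMDB data (hypothetical columns and values).
imdb <- data.frame(
  movie_title = factor(c("Film A ", "Film B ", "Film C ")),
  imdb_score  = c(7.9, 6.4, 8.2),
  stringsAsFactors = FALSE
)

# Toy stand-in for the film locations data, with Year stored as a factor.
locations <- data.frame(
  Film = c("Film A", "Film B"),
  Year = factor(c("1977", "1976")),
  stringsAsFactors = FALSE
)

# Factor -> character, then strip leading and trailing white space.
imdb$movie_title <- trimws(as.character(imdb$movie_title))

# Go through character first: as.integer() on a factor returns level codes,
# not the years printed in the column.
locations$Year <- as.integer(as.character(locations$Year))
```

If the trailing character turns out to be a non-breaking space rather than a plain space, the default `trimws()` pattern will not remove it, and its `whitespace` argument would need adjusting.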

Joining Datasets

After cleaning, I performed two left joins with the film locations dataset as the left table, so that no films in the locations dataset would be dropped.
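A minimal sketch of the two left joins, using base R `merge()` on made-up data. The join key (`Film`), the other column names, and the rule of preferring the Kaggle score over the second source are assumptions made for illustration; the post does not say which key or tie-breaking rule was used.

```r
# Hypothetical film locations table (the left table in both joins).
locations <- data.frame(
  Film = c("Film A", "Film B", "Film C"),
  Year = c(1977L, 1984L, 1999L),
  stringsAsFactors = FALSE
)

# Hypothetical ratings from the two IMDB-derived sources.
kaggle <- data.frame(
  Film = "Film A", imdb_score = 7.9,
  stringsAsFactors = FALSE
)
gg_movies <- data.frame(
  Film = c("Film A", "Film B"), rating = c(7.7, 6.4),
  stringsAsFactors = FALSE
)

# all.x = TRUE turns merge() into a left join: every row of the left table
# is kept, and unmatched rows get NA in the new columns.
joined <- merge(locations, kaggle, by = "Film", all.x = TRUE)
joined <- merge(joined, gg_movies, by = "Film", all.x = TRUE)

# Assumed rule: use the Kaggle score when present, otherwise the other rating.
joined$score <- ifelse(is.na(joined$imdb_score), joined$rating, joined$imdb_score)
```

Here "Film C" has no rating in either source, so it stays in `joined` with a missing score. Note that a left join can add rows if the right table contains the same title more than once, so checking `nrow()` before and after the join is a cheap safeguard.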

 

Shiny Dashboard

Interactive Map with active filtering

The core of the application uses leaflet to allow the user to zoom in and out of a map of New York with markers indicating the locations of scenes from movies. On the side panel, sliders for both the year of release and IMDB score allow the user to filter markers with a great degree of specificity.
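The map and its two sliders could be wired together roughly as follows. This is a sketch rather than the application's actual code: it uses a plain Shiny sidebar layout, and the `filter_films()` and `build_app()` helpers, the column names, and the toy coordinates are all hypothetical.

```r
# Keep rows whose year and score fall inside both slider ranges.
# Rows with a missing year or score are dropped (a choice made for this sketch).
filter_films <- function(films, year_range, score_range) {
  keep <- !is.na(films$year) & !is.na(films$score) &
    films$year >= year_range[1] & films$year <= year_range[2] &
    films$score >= score_range[1] & films$score <= score_range[2]
  films[keep, , drop = FALSE]
}

# Build the app object; the shiny and leaflet packages are needed when called.
build_app <- function(films) {
  ui <- shiny::fluidPage(
    shiny::sidebarLayout(
      shiny::sidebarPanel(
        # A length-2 `value` produces a range slider.
        shiny::sliderInput("year", "Year of release",
                           min = min(films$year, na.rm = TRUE),
                           max = max(films$year, na.rm = TRUE),
                           value = range(films$year, na.rm = TRUE),
                           step = 1, sep = ""),
        shiny::sliderInput("score", "IMDB score",
                           min = 0, max = 10, value = c(0, 10), step = 0.1)
      ),
      shiny::mainPanel(leaflet::leafletOutput("map"))
    )
  )

  server <- function(input, output, session) {
    output$map <- leaflet::renderLeaflet({
      shown <- filter_films(films, input$year, input$score)
      map <- leaflet::addTiles(leaflet::leaflet(shown))
      leaflet::addMarkers(map, lng = ~lng, lat = ~lat, popup = ~title)
    })
  }

  shiny::shinyApp(ui, server)
}

# Made-up films with approximate New York coordinates.
films <- data.frame(
  title = c("Film A", "Film B", "Film C", "Film D"),
  year  = c(1977L, 1984L, 1999L, 2005L),
  score = c(7.9, 6.4, 8.2, NA),
  lng   = c(-73.98, -73.95, -74.00, -73.99),
  lat   = c(40.75, 40.68, 40.72, 40.76),
  stringsAsFactors = FALSE
)
```

Calling `build_app(films)` in an interactive session launches the app. Re-rendering the whole map on every slider change resets the zoom level; updating the markers through `leafletProxy()` is a common way to avoid that.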

Not filtered

 

Filtering by IMDB Score:

Filtered by IMDB Score

 

Filtering by Year:

Filtered by Year

 

Clicking on a marker produces a pop-up with information about the movie and a direct link to its IMDB page.
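Leaflet pop-ups accept HTML, so the pop-up text can be assembled as a string per film. The helper below is a hypothetical sketch: the function name, its arguments, and the placeholder URL are assumptions, and it does not escape special characters in titles.

```r
# Build one HTML pop-up string per film (vectorised over its arguments).
make_popup <- function(title, year, score, imdb_url) {
  # Show a readable label instead of "NA" when a film has no rating.
  score_txt <- ifelse(is.na(score), "not available", sprintf("%.1f", score))
  paste0(
    "<b>", title, "</b> (", year, ")<br/>",
    "IMDB score: ", score_txt, "<br/>",
    "<a href='", imdb_url, "' target='_blank'>View on IMDB</a>"
  )
}

# Placeholder URL, not a real film page.
popup <- make_popup("Film A", 1977L, 7.9, "https://www.imdb.com/title/tt0000000/")
```

The resulting character vector can be stored as a column and passed to the `popup` argument of `addMarkers()`.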

Clicking on a popup provides information related to the movie

 

Database with active filtering

The Data panel allows users to view the database, which can also be filtered by year and IMDB score. From this panel, users can sort by column or perform a search.

Interactive Data Table

 

The By Group tab allows the user to group the data by director, borough, or neighborhood.
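One way such a grouped summary could be computed is shown below in base R. The `summarise_by()` helper, the column names, and the choice of summary statistics (scene count and mean score) are assumptions for illustration; the post does not specify what the tab reports.

```r
# Made-up scene-level data: one row per filming location.
films <- data.frame(
  director = c("Director X", "Director X", "Director Y"),
  borough  = c("Manhattan", "Brooklyn", "Manhattan"),
  score    = c(8.0, 7.0, 6.5),
  stringsAsFactors = FALSE
)

# Count scenes and average the score within each level of `group_col`.
summarise_by <- function(films, group_col) {
  groups <- as.character(films[[group_col]])
  out <- data.frame(group = sort(unique(groups)), stringsAsFactors = FALSE)
  # Index the named results by group so rows line up with `out$group`.
  out$n_scenes   <- as.integer(table(groups)[out$group])
  out$mean_score <- as.numeric(
    tapply(films$score, groups, mean, na.rm = TRUE)[out$group]
  )
  out
}

by_director <- summarise_by(films, "director")
by_borough  <- summarise_by(films, "borough")
```

Passing the grouping column as a string makes it straightforward to drive the same function from a drop-down input in the app.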

Group by Director

 

Graphs with active filtering

The user can view histograms (shown below), box plots, or scatter plots of continuous variables such as IMDB score, budget, gross income, and movie length.

Histogram of IMDB Scores


Conclusions

Although the dataset is incomplete, covering only films from the book Scenes from the City, the application provides a useful overview of locations around New York where movies have been filmed. Users can filter by year and IMDB score on an interactive map, group the data by director, borough, or neighborhood, and view graphical displays of the variables. Unsurprisingly, the vast majority of the movies in the dataset were filmed in Manhattan, and most have an IMDB rating of around 7. Interestingly, there was a small dip in the number of movies filmed in the city between 1975 and 1990, possibly related to increases in criminal activity during this period.

 

Murder rate (crime rate image from Reddit)

About Author

Daniel Epstein

Daniel Epstein is a neuroscience PhD candidate at the University of Utah, expecting to graduate in summer 2017. While performing analyses on behavioral and neuroimaging data, he became interested in utilizing data science to understand human behavior and...