Many movies have been filmed in New York City, but it is difficult to get a sense of where in the city their scenes were shot. I used a dataset based on the book Scenes from the City by James Sanders, available on NYC Open Data, to create a navigable map that displays this information. The dataset includes coordinates, which allowed me to pinpoint exact locations. My goal was to visualize where in the city movies were filmed and to provide additional information from IMDB, such as ratings, poster images, and direct IMDB links. For this project I used R to clean and aggregate the data and Shiny to visualize it.



Film Locations

Film scene locations for movies filmed in New York were obtained from the NYC Open Data website:

This dataset is based on the book Scenes from the City by James Sanders:

This dataset does not include every movie filmed in New York (a complete list would be difficult, though interesting, to collect) and does not include movies filmed after 2006. I hope to address this limitation in the future, but it is important to keep in mind when using the current application.


Kaggle IMDB-5000

This data set was created by Chuan Sun, who scraped data from the IMDB website. The data set provided me with ratings for some of the movies in my dataset:



I also installed the ggplot2movies package, which gave me access to the movies dataset and provided additional IMDB ratings that were not present in the Kaggle data set.


Data Aggregation and Manipulation

Data Cleaning

In order to join the three datasets, the Kaggle IMDB dataset required minor cleaning. First, I converted the titles from factors to strings. The titles still had trailing white space, which would have caused mismatches when joining the data sets, so I removed it. The film locations dataset also required the "Year" column to be converted to an integer.
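The cleaning steps described above could look roughly like the following in base R. This is a minimal sketch, not the original code: the data frames, the column names (`movie_title`, `imdb_score`, `Film`) and all values are made-up placeholders.

```r
# Hypothetical stand-ins for the raw tables; names and values are placeholders.
imdb <- data.frame(
  movie_title = factor(c("Film A ", "Film B  ")),  # factors with trailing spaces
  imdb_score  = c(7.9, 6.5)
)
locations <- data.frame(
  Film = c("Film A", "Film B"),
  Year = c("1977", "1984"),                        # year stored as text
  stringsAsFactors = FALSE
)

# Factor -> character, then strip leading and trailing white space
# so that titles match exactly when the tables are joined.
imdb$movie_title <- trimws(as.character(imdb$movie_title))

# Convert the year column from text to integer.
locations$Year <- as.integer(locations$Year)
```

Converting with `as.character()` before any other step matters here, because calling `as.integer()` or string functions directly on a factor can operate on the underlying level codes rather than the labels.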

Joining Databases

After cleaning, I performed two left joins with the Movie Locations dataset as the left table, so that no films in the locations data set would be dropped.
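As an illustration of the two left joins, here is a small sketch using base R's `merge()` with `all.x = TRUE`. The tables, column names, and join keys are assumptions made for the example, and the final step of falling back to the second rating source when the first is missing is one plausible way to combine them, not necessarily what the application does.

```r
# Hypothetical, already-cleaned tables; all names and values are placeholders.
locations <- data.frame(
  Film = c("Film A", "Film B", "Film C"),
  Year = c(1977L, 1984L, 2001L),
  stringsAsFactors = FALSE
)
kaggle <- data.frame(
  Film = "Film A", imdb_score = 7.9,
  stringsAsFactors = FALSE
)
gg_movies <- data.frame(
  Film = c("Film A", "Film B"),
  Year = c(1977L, 1984L),
  rating = c(7.8, 6.5),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of the left table (a left join);
# unmatched films get NA in the columns from the right table.
joined <- merge(locations, kaggle, by = "Film", all.x = TRUE)
joined <- merge(joined, gg_movies, by = c("Film", "Year"), all.x = TRUE)

# Assumed combination rule: use the Kaggle score, else the ggplot2movies rating.
joined$score <- ifelse(is.na(joined$imdb_score), joined$rating, joined$imdb_score)
```

Note that `merge()` sorts the result by the join keys, and a title that appears more than once in the right-hand table would duplicate the matching location rows, so it is worth checking the row count after each join.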


Shiny Dashboard

Interactive Map with active filtering

The core of the application uses leaflet to allow the user to zoom in and out of a map of New York with markers indicating the locations of scenes from movies. On the side panel, sliders for both the year of release and IMDB score allow the user to filter markers with a great degree of specificity.
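A stripped-down version of this slider-plus-map setup might look like the sketch below. It assumes the shiny and leaflet packages are installed. The data frame, its column names, the coordinates, the slider ranges, and the helper `filter_films()` are all hypothetical. Dropping films that have no score is also an assumption about how missing ratings are handled.

```r
library(shiny)
library(leaflet)

# Placeholder data: made-up films with coordinates inside New York City.
films <- data.frame(
  Film  = c("Film A", "Film B", "Film C"),
  Year  = c(1977L, 1984L, 2001L),
  score = c(7.9, 6.5, NA),
  lat   = c(40.7580, 40.6782, 40.7282),
  lng   = c(-73.9855, -73.9442, -73.7949),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep films inside both slider ranges.
# Films without a score are dropped (an assumption for this sketch).
filter_films <- function(df, year_range, score_range) {
  keep <- !is.na(df$score) &
    df$Year  >= year_range[1]  & df$Year  <= year_range[2] &
    df$score >= score_range[1] & df$score <= score_range[2]
  df[keep, , drop = FALSE]
}

ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      # Passing a length-2 value turns each slider into a range slider.
      sliderInput("year", "Year of release", min = 1900, max = 2006,
                  value = c(1900, 2006), sep = ""),
      sliderInput("score", "IMDB score", min = 0, max = 10,
                  value = c(0, 10), step = 0.1)
    ),
    mainPanel(leafletOutput("map"))
  )
)

server <- function(input, output, session) {
  # Re-evaluated whenever either slider changes.
  filtered <- reactive(filter_films(films, input$year, input$score))

  output$map <- renderLeaflet({
    leaflet(filtered()) %>%
      addTiles() %>%
      addMarkers(lng = ~lng, lat = ~lat, popup = ~Film)
  })
}

app <- shinyApp(ui, server)
# Launch interactively with: runApp(app)
```

This version redraws the whole map on every slider change, which also resets the zoom level. A fuller application would more likely update only the markers through `leafletProxy()`.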

Not filtered


Filtering by IMDB Score:

Filtered by IMDB Score


Filtering by Year:

Filtered by Year


Clicking on a marker produces a pop-up with information about the movie and a direct link to the movie's IMDB page.
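Leaflet pop-ups accept HTML strings, so the pop-up text can be assembled per film before it is passed to the map. The sketch below shows one way to do that in base R. The function `make_popup()`, its arguments, and the example URL are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: build one HTML pop-up string per film.
make_popup <- function(title, year, score, imdb_link) {
  # Show "n/a" when a film has no rating.
  score_txt <- ifelse(is.na(score), "n/a", sprintf("%.1f", score))
  sprintf(
    "<b>%s</b> (%d)<br/>IMDB score: %s<br/><a href='%s' target='_blank'>View on IMDB</a>",
    title, as.integer(year), score_txt, imdb_link
  )
}

# Placeholder title and link, for illustration only.
popup <- make_popup("Film A", 1977L, 7.9, "https://www.imdb.com/title/tt0000000/")
```

Because the function is vectorized, a whole column of pop-ups can be created at once and supplied as the `popup` argument of `addMarkers()`. Titles containing characters such as `&` or `<` should be escaped first, for example with `htmltools::htmlEscape()`.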

Clicking on a marker opens a pop-up with information related to the movie


Database with active filtering

The Data panel allows users to view the database, which can also be filtered by year and IMDB score. From this panel, users can sort by column or perform a search.

Interactive Data Table


The By Group tab allows the user to group by director, borough, or neighborhood.
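The kind of summary behind such a grouped view can be sketched in base R as follows. The helper `summarise_by()`, the column names, and the data are placeholders. Which statistics the application actually reports per group is not stated, so scene count and mean score are assumptions.

```r
# Placeholder data with made-up directors and boroughs.
films <- data.frame(
  Director   = c("Director X", "Director X", "Director Y"),
  Borough    = c("Manhattan", "Brooklyn", "Manhattan"),
  imdb_score = c(7.9, 6.5, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: scene count and mean score for each level of group_col.
summarise_by <- function(df, group_col) {
  groups <- df[[group_col]]
  counts <- table(groups)
  # na.rm = TRUE ignores missing scores; a group with no scores yields NaN.
  means  <- tapply(df$imdb_score, groups, mean, na.rm = TRUE)
  data.frame(
    group      = names(counts),
    scenes     = as.vector(counts),
    mean_score = as.vector(means[names(counts)]),
    stringsAsFactors = FALSE
  )
}

by_director <- summarise_by(films, "Director")
by_borough  <- summarise_by(films, "Borough")
```

Passing the grouping column as a string means a single select input in the Shiny UI could switch between director, borough, and neighborhood without separate code paths.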

Group by Director


Graphs with active filtering

The user can view histograms (shown below), box plots, or scatter plots of continuous variables such as IMDB score, budget, gross income, and movie length.

Histogram of IMDB Scores



Although the data set is incomplete, covering only films from the book Scenes from the City, the application provides an excellent overview of locations around New York where movies have been filmed. It lets users filter by year and IMDB score on an interactive map, group the data by director, borough, or neighborhood, and plot the variables. Unsurprisingly, the vast majority of the movies were filmed in Manhattan, and most have an IMDB rating of around 7. Interestingly, there was a small dip in the number of movies filmed in the city between 1975 and 1990, possibly related to increased crime during this period.



(crime rate image from Reddit)

