NYC Film Permit Data EDA

Posted on Feb 8, 2021



For my R Shiny Exploratory Data Analysis project, I chose to analyze NYC Film Permit data from 2015 to 2021 to investigate trends in film category types by location and year, and how the pandemic affected film production in 2020. The objective of the analysis was to provide actionable insight to entertainment companies by identifying which Categories and Subcategories were still acquiring film permits during the pandemic and assessing whether these categories would be a good avenue to pivot into.


With the rise of quarantine restrictions, many of us have turned to streaming services to watch our favorite shows and films. My expectation going into this project was that, since people are consuming more media than ever, film permit acquisitions must have increased. I also expected the rise to extend beyond television and film permits to web series. However, the data showed that my hypothesis was wrong.


  • Question 1: What is the most popular Category type for each borough?

  • Television was the most requested film permit type in all five boroughs. It was surprising to see that the Bronx and Staten Island were two of the least popular locations for film production work of any type.
  • In addition, Theater and Film permit requests were more popular in Manhattan and Brooklyn than in the other three boroughs. A possible reason is the number of production and theater spaces in these boroughs compared to other locations. These parts of the city may also be attractive because of the number of historical landmarks they contain.
  • Another surprising insight was that commercial film permits were being requested even on Staten Island.
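
The borough-by-category comparison above can be sketched in a few lines of base R. This is a minimal illustration rather than the code behind the app: the rows are made up, and the column names `Borough` and `Category` are assumptions about how the permit data is laid out.

```r
# Toy stand-in for the permit data. The rows are made up, and the column
# names Borough and Category are assumptions about the dataset's layout.
permits <- data.frame(
  Borough  = c("Manhattan", "Manhattan", "Manhattan",
               "Brooklyn", "Brooklyn", "Brooklyn", "Bronx"),
  Category = c("Television", "Television", "Theater",
               "Television", "Television", "Film", "Commercial"),
  stringsAsFactors = FALSE
)

# Cross-tabulate permits: one row per borough, one column per category.
counts <- table(permits$Borough, permits$Category)

# For each borough, find the column index of the largest count,
# then map that index back to the category name.
top_idx <- apply(counts, 1, which.max)
top_category <- setNames(colnames(counts)[top_idx], rownames(counts))

print(counts)
print(top_category)
```

Note that `which.max` returns the first maximum, so a tie between two categories would be resolved alphabetically by column order; real data may call for explicit tie handling.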


  • Question 2: What is the most popular subcategory type for each borough?

  • Consistent with the Category results above, television and film subcategories were the most popular within each borough. More specifically, these were cable episodes, episodic series, TV premieres, and feature projects.
  • One particularly intriguing subcategory is “Not Applicable.” It occurs with high frequency in both Brooklyn and Manhattan. Delving into what constitutes “Not Applicable” might be the key to determining which subcategories are not traditionally represented and are worth pivoting to.
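
One way to start looking into “Not Applicable” is to measure how often it appears in each borough and which Categories it falls under. The sketch below uses base R on made-up rows; the column name `SubCategoryName` and the example Category values are assumptions, not findings from the actual data.

```r
# Toy stand-in for the permit data. Rows are made up; the column names
# (Borough, Category, SubCategoryName) are assumptions.
permits <- data.frame(
  Borough = c("Manhattan", "Manhattan", "Manhattan", "Manhattan",
              "Brooklyn", "Brooklyn", "Queens", "Queens"),
  Category = c("Television", "Theater", "Still Photography", "Film",
               "Theater", "Television", "Television", "Film"),
  SubCategoryName = c("Episodic series", "Not Applicable", "Not Applicable",
                      "Feature", "Not Applicable", "Cable-episodic",
                      "Episodic series", "Feature"),
  stringsAsFactors = FALSE
)

# Flag the rows whose subcategory is the "Not Applicable" label.
is_na_label <- permits$SubCategoryName == "Not Applicable"

# Share of each borough's permits that carry the label
# (the mean of a logical vector is the proportion of TRUE values).
na_share <- tapply(is_na_label, permits$Borough, mean)

# Which Categories the labelled rows belong to.
na_by_category <- table(permits$Category[is_na_label])

print(na_share)
print(na_by_category)
```

If the labelled rows cluster under a small number of Categories, that would suggest the label simply marks Categories that have no subcategories, rather than hiding an unrepresented type of production.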


  • Question 3: Was there a considerable change in film permit requests in 2020?

  • In 2020 there was a drastic decrease in film permit acquisitions. Episodic series, under Television, remained a highly popular subcategory within the entertainment industry.
  • Cable episodic, commercials, and features decreased as well, though they remained prominent on the list.
  • One subcategory worth a closer look is “Variety.” Though there is no data describing what constitutes “Variety,” it might be worth exploring for future entertainment endeavors.
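
The size of the 2020 drop can be expressed as a year-over-year percent change in permit counts. The following base R sketch shows the calculation on made-up rows; the column name `StartDateTime` and its month/day/year format are assumptions about the dataset, and the resulting figure says nothing about the real data.

```r
# Toy stand-in for the permit data. Rows are made up; the StartDateTime
# column name and its month/day/year format are assumptions.
permits <- data.frame(
  StartDateTime = c("01/15/2019 07:00:00 AM", "03/02/2019 06:00:00 AM",
                    "06/20/2019 08:00:00 AM", "10/05/2019 07:00:00 AM",
                    "09/12/2020 07:00:00 AM"),
  stringsAsFactors = FALSE
)

# Extract the calendar year; text after the date portion is ignored.
start_date <- as.Date(permits$StartDateTime, format = "%m/%d/%Y")
permits$Year <- as.integer(format(start_date, "%Y"))

# Count permits per year as a plain named vector.
per_year <- table(permits$Year)
n_by_year <- setNames(as.vector(per_year), names(per_year))

# Percent change relative to the previous year, named by the later year.
pct_change <- 100 * diff(n_by_year) / head(n_by_year, -1)

print(n_by_year)
print(pct_change)
```

Running the same calculation within each subcategory would show which ones held up best in 2020, which is the comparison the bullets above describe.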

About the Data – Conclusion

The data for this project came from NYC Open Data. Although film permit data was used to determine which Category or Subcategory of the entertainment industry in NYC would be best to pivot into, there are many other factors to consider when making such a decision. The dataset also contains many variables that were not explored, which may provide more insight into where the industry is moving. That said, it was a pleasure to work on a project that aligned with my media and performance background, and I am eager to continue building this project and refining my data points.

Each graph involved restructuring the dataset in R and incorporating it into Shiny, a web application framework for R. You can see the code here.

For comments, questions, or suggestions, please feel welcome to send me an e-mail.

Moving forward, I plan to:

  • Add interactivity to my Shiny App. 
  • Include a more robust introduction tab.
  • Incorporate movie/television revenue data to add more dimensionality to my graphs. 
  • Delve into the variables in my dataset that were not showcased. 
  • Find ways to refine my project objective and update the dataset with current data points.

About Author

Juan R. Vasquez Jr.

Juan is a recent graduate of NYC Data Science Academy where he studied dashboard creation, machine learning, and statistical analysis. His background of three years in the hospitality and commercial art industry allowed him to hone his organization...