Data Scientist: Scraping Glassdoor Interview Reviews

Posted on Feb 16, 2020


Congratulations! You have been selected to interview for...


It’s a situation that we’ve all experienced before: the excitement of securing an interview for a promising job opportunity, followed by the preparation that comes with it. In such situations, there are many resources that job candidates can use, but in the humble opinion of this data science fellow, one stands out above the rest: Glassdoor. With this in mind, I sought to scrape interview reviews for three of the FAANG (Facebook, Amazon, Apple, Netflix, and Google) companies. For the purposes of this exercise, Amazon and Apple were omitted because their reviews for logistics roles (e.g. warehousing at Amazon) and retail roles (e.g. Apple Genius at Apple) are commingled with reviews for their other roles.

As an aspiring data scientist, the questions that I sought to answer were the following:

  • How difficult is it to interview at Facebook, Netflix, and Google (FNG)?
  • What is Data Science recruitment like at FNG?
  • Is there any seasonality in hiring?
  • How did successful interviewees perceive their experience? Unsuccessful interviewees?

Please join me in this data analysis to uncover insights about the recruitment process at FNG.

Scraping Method

To collect the data used in this analysis, I used Selenium to scrape the components of each interview review on Glassdoor. The scraped dataset totals over 20,000 rows, and the key elements of each review are as follows:

  • Company: Company Name
  • Title: Role that is being interviewed for
  • Date: Date of interview
  • Offer Status: Whether an offer was given (offers given can be accepted or declined)
  • Interview Difficulty: Categorical variable indicating interview difficulty (1: Easy, 3: Average, 5: Difficult)
  • Interview Experience: Categorical variable indicating overall experience (1: Negative, 3: Neutral, 5: Positive)
  • Application Overview: Review written by the interviewee
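
A minimal sketch of how one page of reviews might be collected with Selenium and normalized into the fields listed above. The CSS selectors, the raw label strings (for example "Difficult Interview"), and the helper names are assumptions made for illustration. They are not Glassdoor's actual markup, nor necessarily the code used for this project.

```python
# Numeric encodings follow the 1/3/5 scheme described in the list above.
DIFFICULTY = {"easy": 1, "average": 3, "difficult": 5}
EXPERIENCE = {"negative": 1, "neutral": 3, "positive": 5}

# Hypothetical CSS selectors: the real page markup differs and changes often.
SELECTORS = {
    "title": ".review-title",
    "date": ".review-date",
    "offer_status": ".offer-status",
    "difficulty": ".difficulty",
    "experience": ".experience",
    "overview": ".application-overview",
}


def encode(label, mapping):
    """Map a raw label such as 'Difficult Interview' to its numeric code."""
    words = label.lower().split()
    first = words[0] if words else ""
    return mapping.get(first)  # None when the label is missing or unrecognized


def normalize(company, raw):
    """Turn a dict of raw strings from one review into a clean row."""
    offer = raw.get("offer_status", "").strip().lower()
    return {
        "company": company,
        "title": raw.get("title", "").strip(),
        "date": raw.get("date", "").strip(),
        # Accepted and declined offers both count as an offer being given.
        "offer": offer in {"accepted offer", "declined offer"},
        "difficulty": encode(raw.get("difficulty", ""), DIFFICULTY),
        "experience": encode(raw.get("experience", ""), EXPERIENCE),
        "overview": raw.get("overview", "").strip(),
    }


def scrape_page(driver, company, review_selector="li.interview-review"):
    """Collect every review on the page currently loaded in `driver`."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By

    rows = []
    for review in driver.find_elements(By.CSS_SELECTOR, review_selector):
        raw = {}
        for field, selector in SELECTORS.items():
            found = review.find_elements(By.CSS_SELECTOR, selector)
            raw[field] = found[0].text if found else ""
        rows.append(normalize(company, raw))
    return rows
```

Keeping the normalization separate from the browser calls means the encoding logic can be checked on plain dictionaries, and only the selectors need updating when the page layout changes.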


How difficult is it to interview at FNG?

To best visualize the difficulty, this violin plot shows how interviewees perceived the difficulty of their interviews. For Facebook and Netflix, the ratings appear to be roughly normally distributed, with most interviewees agreeing that their interviews were of average difficulty.

For Google, we see a different story: a left skew, indicating that more candidates rated their interviews as average or difficult.

Having said this, one thing to note is the thinner width of the Netflix violin. Juxtaposed with its peers, it suggests a smaller sample size, so the findings of this exercise as they relate to Netflix may not be significant.
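
The shape and sample-size observations above can also be checked numerically rather than by eye. The sketch below uses made-up ratings and hypothetical helper names, not the scraped data. It reports the sample size, the rating counts, and the sample skewness per group, where a negative skewness corresponds to the left skew described for Google.

```python
from collections import Counter


def skewness(values):
    """Fisher-Pearson sample skewness: negative means a longer left tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def summarize(ratings_by_company):
    """Return sample size, rating counts, and skewness for each company."""
    summary = {}
    for company, ratings in ratings_by_company.items():
        summary[company] = {
            "n": len(ratings),
            "counts": dict(sorted(Counter(ratings).items())),
            "skew": skewness(ratings),
        }
    return summary


# Made-up difficulty ratings, for illustration only.
toy = {
    "Symmetric example": [1, 3, 3, 3, 5],
    "Left-skewed example": [1, 3, 3, 5, 5, 5],
}

for company, stats in summarize(toy).items():
    print(company, stats)
```

Reporting `n` next to each shape statistic makes it easier to judge how much weight a small group, such as Netflix here, can bear.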


What is Data Science recruitment like at FNG?

For self-serving purposes, the next logical question is what data science recruitment is like at FNG. The quick answer from analyzing interviews with “Data Scientist” as the title is that yes, it is more challenging to get hired as a Data Scientist. At all three companies, the offer rate for data science interviews is lower than the offer rate across all interviews, while average difficulty and average overall experience are a mixed bag.

The most interesting figure shown above is the average overall experience for Netflix (1.89), which is noticeably lower than that for Facebook (3.55) and Google (4.11). However, this figure may not be indicative of the truth, as Netflix’s dataset is much smaller than those of its peers.
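
One way to make the comparison above concrete is to compute, for each company, the offer rate across all reviews and the offer rate for reviews whose title mentions the role. This is a sketch under assumptions: the row layout, the substring match on the title, and the function names are illustrative, and the rows are made up.

```python
def offer_rate(rows):
    """Share of reviews in which an offer was given; None for an empty group."""
    if not rows:
        return None
    return sum(r["offer"] for r in rows) / len(rows)


def compare_offer_rates(rows, keyword="data scientist"):
    """Per company: overall offer rate, rate for matching titles, group size."""
    result = {}
    for company in sorted({r["company"] for r in rows}):
        company_rows = [r for r in rows if r["company"] == company]
        # Assumption: a case-insensitive substring match identifies the role.
        role_rows = [r for r in company_rows if keyword in r["title"].lower()]
        result[company] = {
            "all": offer_rate(company_rows),
            "role": offer_rate(role_rows),
            "role_n": len(role_rows),
        }
    return result


# Made-up rows, for illustration only.
toy = [
    {"company": "Google", "title": "Software Engineer", "offer": True},
    {"company": "Google", "title": "Software Engineer", "offer": False},
    {"company": "Google", "title": "Data Scientist", "offer": False},
    {"company": "Google", "title": "Senior Data Scientist", "offer": False},
]

print(compare_offer_rates(toy))
```

Returning `role_n` alongside the rates keeps the sample size visible, which matters when one company contributes far fewer reviews than the others.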


Is there any seasonality in hiring?

Moving along, the next mystery to solve is whether there is any seasonality in hiring. By grouping interviews by month, we can analyze the offer rates to determine whether there is a “hot” time for hiring at FNG.
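
The monthly grouping can be sketched as follows. The ISO date strings and the row layout are assumptions, since the scraped dates may be formatted differently, and the rows are made up. The same calendar month is pooled across years.

```python
from collections import defaultdict
from datetime import date


def monthly_offer_rates(rows):
    """Return {month number: offer rate}, pooling each month across years."""
    offers = defaultdict(int)
    totals = defaultdict(int)
    for r in rows:
        month = date.fromisoformat(r["date"]).month
        totals[month] += 1
        offers[month] += int(r["offer"])
    return {m: offers[m] / totals[m] for m in sorted(totals)}


# Made-up rows, for illustration only.
toy = [
    {"date": "2019-09-03", "offer": True},
    {"date": "2018-09-20", "offer": False},
    {"date": "2019-12-11", "offer": True},
]

print(monthly_offer_rates(toy))  # {9: 0.5, 12: 1.0}
```

Months with very few reviews can produce extreme rates, as December does in this toy example, so it helps to check the monthly counts before reading much into a spike.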

For Facebook, we see that hiring stays relatively constant across the year with hiring rates oscillating between 20% and 35%.


For Netflix, we see a similar story, albeit with hiring rates oscillating within a slightly larger range.


With Google, we see a different trend, with hiring spiking in September and December. Perhaps this can be explained by the theory that hiring peaks before the holidays and the new year.




How did successful interviewees perceive their experience? Unsuccessful interviewees?

The final question to ponder is whether candidates who did and did not receive offers viewed their experiences differently. We can see that for FNG, with the exception of Netflix, overall experience ratings lie primarily in the positive range. What we can conclude from this is subjective, but it is my view that the interview process is a two-way street. Not only are companies interviewing candidates, but candidates are also interviewing companies. It is thus in the best interest of both parties to put forth their best impressions.
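
A comparison like this one can be sketched by grouping experience scores by company and by whether an offer was received. The row layout and function name are assumptions, and the rows are made up for illustration.

```python
from statistics import mean


def experience_by_outcome(rows):
    """Mean experience score and group size per (company, offer) pair."""
    groups = {}
    for r in rows:
        if r["experience"] is None:  # skip reviews without a rating
            continue
        key = (r["company"], r["offer"])
        groups.setdefault(key, []).append(r["experience"])
    return {key: (mean(scores), len(scores)) for key, scores in sorted(groups.items())}


# Made-up rows, for illustration only.
toy = [
    {"company": "Facebook", "offer": True, "experience": 5},
    {"company": "Facebook", "offer": True, "experience": 3},
    {"company": "Facebook", "offer": False, "experience": 1},
    {"company": "Facebook", "offer": False, "experience": 3},
    {"company": "Facebook", "offer": False, "experience": None},
]

for (company, offer), (avg, n) in experience_by_outcome(toy).items():
    print(company, "offer" if offer else "no offer", avg, n)
```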



All in all, the insights gleaned from this web scraping exercise can be summarized as follows:

  • Most candidates find interviews to be of average difficulty (Google is an exception)
  • Securing a job as a data scientist is generally more challenging.
  • For Google, it may be advantageous to interview in September and December.
  • Interview experiences tend to be positive in general.

About Author

Richard Choi

Richard Choi is a Data Science Fellow in the January 2020 cohort at NYC Data Science Academy. He has 6 years of experience working for Global 500 companies such as HSBC and Unilever and is proficient in Python,...
