Data Study on Employee Performance Improvement

Posted on Jun 12, 2020


Human Resource Background:

Human Resources is a department that has become vital to any organization with more than a few employees. One of its purposes is to hold all employee data, from personal information, to the number of absences, to performance reviews. This data is not normally viewed by anyone other than managers and higher-level employees, which makes it hard to come by. It is normally used when deciding on promotions and layoffs, but what if a company could also tell which factors make its employees perform better at their jobs?

Human Resource App:

Many companies are always striving to find which factors they can tweak to allow their employees to be the best they can be. I set out to find these factors and to create an app that would allow any company to input its Human Resource data and see how these factors influence performance at that company. To do this, I obtained a dataset of random Human Resource data and created a list of metrics to gauge the employees by. The dataset had three very interesting features that I decided to use as metrics:

  • Engagement Survey - Survey of how motivated and engaged the employee was (out of 5)
  • Performance Score - Employee’s performance score from their recent performance review (out of 4)
  • Employee Satisfaction - Survey of how satisfied the employee is at the job (out of 5)

I compared these three metrics against the factors I came up with (manager, department, pay-rate, and age) to determine which factor, if any, most influenced the metrics.
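
Because the three metrics use different scales (two out of 5, one out of 4), it can help to put them on a common footing before comparing them against any factor. The following is a minimal sketch of one way to do that, not the method used in the app. The column names (`engagement_survey`, `performance_score`, `employee_satisfaction`, and the factor names) are hypothetical, since the post does not give the dataset's headers, and the example record is made up for illustration.

```python
# Hypothetical column names: the original dataset's headers are not given in the post.
# Each metric maps to its maximum possible score, as described in the list above.
METRIC_MAX = {
    "engagement_survey": 5.0,      # survey score out of 5
    "performance_score": 4.0,      # review score out of 4
    "employee_satisfaction": 5.0,  # survey score out of 5
}

# The four candidate factors examined in the post (names are assumptions).
FACTORS = ["manager", "department", "pay_rate", "age"]


def normalize_metrics(record):
    """Rescale each metric to the 0-1 range so the three can be compared directly."""
    return {name: record[name] / top for name, top in METRIC_MAX.items()}


# Made-up record for illustration only; it is not taken from the dataset.
example_record = {
    "manager": "Manager A",
    "department": "IT",
    "pay_rate": 30.0,
    "age": 41,
    "engagement_survey": 4.0,
    "performance_score": 3.0,
    "employee_satisfaction": 5.0,
}

normalized = normalize_metrics(example_record)
```

Rescaling does not change any ranking within a metric; it only makes a 3 out of 4 and a 4 out of 5 easier to read side by side.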



I started with managers. I felt that working under a great manager, or one who inspires the people underneath them, would normally lead employees to perform better. To test this, I ran an analysis that compared the employees of each manager against the metrics above. An example of the results is shown below.


As one can see, employees under the same manager varied both in how well they performed on their performance reviews and in how satisfied they were. This was the case for essentially every manager.

To compare all the managers against one another, I ran an analysis of each manager versus their employees’ Engagement Survey scores, as shown in the graph above. There was no clear “great manager” or “bad manager”. The comparison was also somewhat skewed because some managers had only a few employees while others had many. I decided to move on to the next factor.
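
The skew from uneven team sizes can be made visible by reporting each manager's headcount next to the average score. The sketch below is one possible way to do this, not the code behind the graph. The function name `summarize_by_group`, the `min_size` threshold of 3, the column names, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_group(rows, group_key, metric_key, min_size=3):
    """Return headcount, mean score, and a small-group flag for each group."""
    scores_by_group = defaultdict(list)
    for row in rows:
        scores_by_group[row[group_key]].append(row[metric_key])

    summary = {}
    for name, scores in scores_by_group.items():
        summary[name] = {
            "n": len(scores),
            "mean": mean(scores),
            # Averages over very few employees are unreliable, so flag them.
            "small_group": len(scores) < min_size,
        }
    return summary


# Made-up rows for illustration only; they are not taken from the dataset.
rows = [
    {"manager": "Manager A", "engagement_survey": 4.5},
    {"manager": "Manager A", "engagement_survey": 2.0},
    {"manager": "Manager A", "engagement_survey": 3.5},
    {"manager": "Manager B", "engagement_survey": 5.0},
]

manager_summary = summarize_by_group(rows, "manager", "engagement_survey")
```

In this made-up sample, Manager B has the higher average but only one employee, so the flag signals that the comparison should be read with caution.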


The next factor I analyzed was department. Employees in certain departments often tend to perform very well, while other departments as a whole do not. I set out to see whether this was the case for this dataset. An example of the results is shown below.

Once again, employees within the same department varied both in how well they performed on their performance reviews and in how satisfied they were with their jobs. This suggests that, at least for this dataset, department wasn’t a determining factor.

[Figure: Employee by Department]

To see how each department’s employees did against the others, I ran an analysis of each employee’s Engagement Survey score by department; the results are shown above. Once again, there is no real relationship between this dataset’s departments and how well the employees performed. I then moved on to the third factor, pay-rate.
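
One way to put a number on "variance within the department" is to ask what share of the total spread in a metric is explained by group membership. The sketch below computes eta squared (between-group sum of squares divided by total sum of squares). It is an assumed technique offered for illustration, not something the post reports using, and the column names and sample rows are made up.

```python
from collections import defaultdict
from statistics import mean


def eta_squared(rows, group_key, metric_key):
    """Share of a metric's total variance explained by the grouping (0 to 1)."""
    values = [row[metric_key] for row in rows]
    grand_mean = mean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # every score is identical, so there is nothing to explain

    scores_by_group = defaultdict(list)
    for row in rows:
        scores_by_group[row[group_key]].append(row[metric_key])

    # Between-group sum of squares: group size times squared distance
    # of the group mean from the grand mean.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in scores_by_group.values()
    )
    return ss_between / ss_total


# Made-up rows for illustration only: both departments have the same mean,
# so all of the spread sits inside the departments.
no_effect_rows = [
    {"department": "IT", "engagement_survey": 2.0},
    {"department": "IT", "engagement_survey": 4.0},
    {"department": "Sales", "engagement_survey": 2.0},
    {"department": "Sales", "engagement_survey": 4.0},
]

# Made-up rows where department fully determines the score.
full_effect_rows = [
    {"department": "IT", "engagement_survey": 1.0},
    {"department": "IT", "engagement_survey": 1.0},
    {"department": "Sales", "engagement_survey": 5.0},
    {"department": "Sales", "engagement_survey": 5.0},
]

no_effect = eta_squared(no_effect_rows, "department", "engagement_survey")
full_effect = eta_squared(full_effect_rows, "department", "engagement_survey")
```

A value near 0 matches the pattern described above, where scores vary as much inside a department as across departments; a value near 1 would mean department largely determines the score.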

One would assume that the more an employee is paid, the better that employee performs. To test this, I ran an analysis of each employee’s pay-rate against the three separate metrics. Below is an example of the results.

Surprisingly, pay-rate also didn’t seem to matter when it came to employees’ performance scores or employee satisfaction.


The scatter-plot above shows the different pay-rates versus the employees’ Engagement Survey scores. It shows no clear relationship, suggesting that pay-rate had no real effect on how well the employees in this dataset performed. The final factor I looked into was the age of an employee.
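
A scatter-plot with no visible trend can be backed up with a correlation coefficient. The sketch below computes Pearson's r by hand using only the standard library. It is an illustration under assumptions, not the analysis from the post: the variable names and the sample values are made up, and Pearson's r only captures linear relationships.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))

    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / (spread_x * spread_y)


# Made-up values for illustration only; they are not taken from the dataset.
pay_rates = [20.0, 30.0, 40.0, 50.0]
engagement_scores = [3.0, 5.0, 2.0, 4.0]

r = pearson_r(pay_rates, engagement_scores)
```

A value of r close to 0 is consistent with the "no real effect" reading of the scatter-plot, while values near 1 or -1 would indicate a strong linear relationship.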

Someone who is younger and just getting into the workforce can be seen in one of two ways: either they are hungry and ready to work, or they really don’t want to be at the job. The same can be said about older workers. They can be seen as experienced and likely to perform better because of that experience, or, because they have been at a job for so long, as a little more relaxed. I decided to find out by running age through the three metrics. Below is an example of my results.

Interestingly enough, once again it seems that an employee’s age did not affect their performance score or their satisfaction as an employee.

As with the other factors, I ran an analysis of employee age versus Engagement Survey scores, and once again there was a large amount of variance. This suggests that age was not a factor in this dataset when it came to employee performance.
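
Since the question about age is really about "younger" versus "older" workers, grouping ages into bands can make the comparison easier to read than individual ages. The sketch below is one assumed way to do that, not the post's own code. The 10-year band width, the function names, the column names, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def age_band(age, width=10):
    """Label the band an age falls into, e.g. 24 -> '20-29' for width 10."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def mean_metric_by_age_band(rows, metric_key, width=10):
    """Average a metric within each age band."""
    scores_by_band = defaultdict(list)
    for row in rows:
        scores_by_band[age_band(row["age"], width)].append(row[metric_key])
    return {band: mean(scores) for band, scores in sorted(scores_by_band.items())}


# Made-up rows for illustration only; they are not taken from the dataset.
rows = [
    {"age": 24, "engagement_survey": 4.0},
    {"age": 29, "engagement_survey": 2.0},
    {"age": 35, "engagement_survey": 3.0},
    {"age": 52, "engagement_survey": 3.0},
]

band_means = mean_metric_by_age_band(rows, "engagement_survey")
```

If the band averages are close to one another while individual scores within a band are spread out, that matches the "large amount of variance" described above.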

Data Analysis & Conclusion:

In conclusion, after running the factors through the three metrics, none of them turned out to be an actual factor in determining employee performance. That said, this doesn’t necessarily mean the same approach would not work for another company’s Human Resource dataset. Given more time and resources, I would have liked to work with several datasets in order to have more variety in the data. With more data, I would have had a lot more to compare, and a larger sample size never hurts.

About Author

Anthony Ali

Anthony Ali has a background in Cyber-Security with a Bachelors degree in Network Forensics and Intrusion Investigation. He is a CISSP certified security analyst with industry skills in threat management, security risk identification and mitigation, and security infrastructure....