Employee Attrition Analysis
Employee productivity and efficiency are major areas of focus for many of today’s top corporations. A company with highly efficient employees gains the benefit of greater output, better products, and superior customer relations. Due to these advantages, vast resources are often used to ensure rapid deployment of new hires into the roles they were brought on to fill. However, even with excellent training programs in place, management will always see a transitional loss in efficiency due to the new employee needing time to acclimate and learn the ins and outs of their new environment.
Due to that inevitable loss of productivity when onboarding a new employee, to maintain a consistent level of performance, employers should aim to minimize the necessity of new hires, and do whatever they can to retain the current ones who already know the ropes. But how can this be achieved? By looking at data provided by IBM data scientists, this project attempts to find trends that could help provide the insight employers need not only to hold on to the productive employees they do have, but to identify the traits of individuals that would provide the most stability for the company.
The dataset analyzed for this project was found on Kaggle, and was developed as a study resource by IBM data engineers. The data is built on both personal employee information and information about the position each employee held, covering nearly 1,500 individuals. The individuals surveyed represent numerous industries and roles to provide broad insights that could be applied to most industries. For this project, the data columns were separated into three buckets: employee background, employment information, and employee satisfaction.
The breakdown is as follows:
- Education Level
- Commute Length
- Business travel
- Overtime Y/N
- Job Satisfaction (Composite Score)
The initial hypothesis for this investigation is that an employee’s wage will be the greatest factor in determining the likelihood of attrition. Making money to support themselves, their family, and their lifestyle is the main purpose of most people’s employment. As such, the logical assessment would be that wage carries the greatest weight in predicting whether or not an individual will leave their role. Beyond this initial hypothesis, other interesting questions arise: Will an employee’s background or overall satisfaction have as great an impact in their choice to remain with or leave a company as income? If so, what factors will prove to be the most important?
Figure 1 showcases the distribution of employee ages for both the attrition and non-attrition groups. The two distributions follow a similar trend, as the majority of employees surveyed were in the 20-50 year old range. This curve is expected given that younger people may still be in school, while older people will begin to retire past a certain age. A clear gap can be observed between the two groups: the attrition distribution is centered around 30 years old, while the non-attrition distribution is centered roughly 5 years later, at 35. This seems to indicate that younger people tend to leave their companies more often than somewhat older people.
Figure 2 provides more insight into the quantity and percentage of attrition by age group. Similar to Figure 1, Figure 2 shows that the percentage of attrition starts high for the youngest group, bottoms out in the 35-55 year old range, and starts increasing again in the final age group.
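The per-age-group percentages described above can be reproduced with a simple binning step. The following is a minimal sketch, not the code used for Figure 2: the function name, the 5-year bin width, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict

def attrition_rate_by_age_group(employees, bin_width=5):
    """Return {(low, high): (headcount, attrition_pct)} for each age bin.

    `employees` is an iterable of (age, left_company) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [headcount, leavers]
    for age, left in employees:
        low = (age // bin_width) * bin_width  # lower edge of the age bin
        counts[low][0] += 1
        counts[low][1] += int(left)
    return {
        (low, low + bin_width): (total, 100.0 * leavers / total)
        for low, (total, leavers) in sorted(counts.items())
    }

# Hypothetical (age, left_company) pairs, not taken from the dataset.
sample = [(22, True), (24, False), (31, True), (33, False), (34, False), (41, False)]
rates = attrition_rate_by_age_group(sample)
for (low, high), (total, pct) in rates.items():
    print(f"{low}-{high}: n={total}, attrition={pct:.1f}%")
```

Returning the headcount alongside the percentage makes it easier to spot bins where a small sample size may be exaggerating the attrition rate.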
The next classifier to consider is gender. Figure 3 showcases the distribution of male and female employees surveyed, with respect to age. It appears the IBM data scientists did a decent job getting a near matching distribution of males and females by age for this survey. This is key to deriving a one-to-one comparison of attrition between the genders at any age group.
Figure 4 gives clarity on the proportion of men and women surveyed for this dataset. While Figure 3 shows that the age distribution of each gender is very similar, Figure 4 shows that significantly more men were surveyed than women. However, the overall attrition percentages for men and women are about the same, with the attrition rate for men only about 2 percentage points higher than for women around the time of the survey.
Figure 5 showcases the slight difference in attrition pattern by gender. Both genders follow a declining attrition pattern in the earliest years and are nearly identical until age group 35-40. It is worth noting that the sample size is much smaller in the first two age groups (as seen in Figure 2), which may account for the disparity in data between genders. Once the data reaches the 35-40 age group, the gender data no longer follows the same pattern. Males continue to follow a smooth decline in attrition that is only interrupted once the earliest retirement ages are reached in age group 45-50 and on.
On the other hand, females follow a much more complicated pattern, where attrition percentage is lowest in age groups 35-40 and 50-55 but bumps up to levels more similar to the men in the other groups. Without more specific data to back any claim of reasoning for this, one may attribute the low attrition among women aged 35-40 to traditional family roles in which the mother is the primary caregiver for young children. Taking on this role, many women may look for stability in their careers and are therefore dissuaded from leaving their position.
The dataset contains 5 levels of education representing the highest level of education achieved by each employee:
- 1: Below College
- 2: Associate’s Degree
- 3: Bachelor's Degree
- 4: Master’s Degree
- 5: Doctorate
As can be seen in Figure 6, attrition is mostly even across all education levels. There is just a slight decline as education level increases. The only level that stands out from the rest is the doctorate education level. The first pie chart in Figure 6 shows that only 3% of employees surveyed had a doctorate. It is possible that roles that require doctorate degrees are so rare, or perhaps so specialized, that an individual would seek to acquire their doctorate for the specific purpose of qualifying for a particular role. Thus once the individual reaches their end goal and is employed in the position or field of their choosing, they are more likely to remain there for the long term.
Figure 7 provides slightly more insight into why attrition slowly decreases with education level. It is not surprising to see that the lowest level of education, below college, has the earliest attrition peak. One can surmise that as uneducated individuals age, they are more likely to return to finish their education and thus elevate themselves to higher status positions. This holds true for each subsequently higher level of education (except associate’s degree).
Figure 8 and Table 1 provide information on how commute length affects an employee’s decision to leave their role. Both show a clear trend where employees with higher commute lengths are more likely to leave their role. This is not surprising, as a longer commute would mean more travel time as well as a higher cost of travel for the employee, which cuts into both income and free time for the employee. Employers looking to open additional office space may want to seriously consider their next location to find a spot close to large population centers or easily accessible via major highways or public transport systems. Such locations are more appealing to employees than those that involve a more arduous commute.
Figure 9 does not show any convincing trend that the requirement of rare or even frequent business travel has a strong impact on an employee’s likelihood to leave their position. On first consideration, one might think that frequent travel would have an impact on a person’s decision to remain with a company, as having to be always on the move could be considered a negative aspect of a job by many people. Taken together with the results of Figure 9, this reasoning would suggest there might be another hidden factor at play, such as additional compensation. However, Figure 10 shows that the frequent travel category actually has the lowest average monthly income. It appears the trend in Figure 9 can be trusted, and that there is no discernible connection between the requirement of business travel and employee attrition.
Figure 11 exhibits a strong relationship between the requirement of overtime and employee attrition. As a healthy work-life balance is very important to many, it is not surprising that being required to work beyond the standard 40-hour work week is a motive for many employees to leave a role.
The original hypothesis prior to beginning this project was that income would be the single greatest factor in determining an employee’s likelihood of remaining in their role. Figure 12 and Table 2 back up the prediction that lower wages would lead to more attrition. Every percentile of wage for employees who left their role is lower than their non-attrition counterparts. Clearly there is a correlation between wages and attrition.
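A percentile-by-percentile comparison like the one summarized above can be sketched with the standard library. The income values below are hypothetical placeholders rather than figures from Table 2, and the helper name `income_quartiles` is an assumption for illustration.

```python
import statistics

def income_quartiles(incomes):
    """Return the 25th, 50th, and 75th percentiles of a list of incomes."""
    return statistics.quantiles(incomes, n=4, method="inclusive")

# Hypothetical monthly incomes, not taken from the dataset.
stayed = [3000, 4500, 5200, 6100, 8000]
left = [2100, 2600, 3100, 4000, 5500]

stayed_q = income_quartiles(stayed)
left_q = income_quartiles(left)

# True only if the attrition group is lower at every quartile.
all_lower = all(l < s for l, s in zip(left_q, stayed_q))
print(stayed_q, left_q, all_lower)
```

Comparing several percentiles rather than a single average guards against a few very high earners masking the difference between the two groups.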
The position outlook feature was created by combining the scores of five other categories related to an employee’s perspective on their role: environment satisfaction, job involvement, job satisfaction, relationship satisfaction, and work-life balance. Each of these factors was scored on a 1-5 scale, with 1 being the lowest score and 5 being the highest. The position outlook feature is simply an average of these five scores.
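As a minimal sketch of that averaging step, the snippet below computes a position outlook score for one employee. The dictionary keys and the sample scores are hypothetical; they stand in for whatever column names the dataset actually uses.

```python
# Hypothetical field names for the five component scores.
OUTLOOK_FIELDS = (
    "environment_satisfaction",
    "job_involvement",
    "job_satisfaction",
    "relationship_satisfaction",
    "work_life_balance",
)

def position_outlook(employee):
    """Average the five satisfaction-related scores into one composite value."""
    scores = [employee[field] for field in OUTLOOK_FIELDS]
    return sum(scores) / len(scores)

# Hypothetical employee record, not taken from the dataset.
employee = {
    "environment_satisfaction": 4,
    "job_involvement": 3,
    "job_satisfaction": 4,
    "relationship_satisfaction": 2,
    "work_life_balance": 3,
}
outlook = position_outlook(employee)
print(f"position outlook: {outlook:.2f}")
```

An unweighted mean treats all five components as equally important, which is the simplest choice; a weighted average would be a natural variation if some components turned out to matter more.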
Figure 13 and Table 3 support the hypothesis that an employee’s outlook on the non-monetary aspects of their role has a discernible impact on their decision to remain at or leave their position. This should not be surprising for anyone who has had the chance to work for companies such as Google or Facebook, which are famous for creating fun, comfortable working environments for their employees. If an employee feels happier at their place of work, they will develop more loyalty there.
The insights that came to light from this project support the hypothesis that wage is a critically impactful feature. Employees who left their role had a much lower average income than their counterparts. Additionally, position outlook and the requirement of overtime proved to be strongly correlated with attrition. All of these features seem to be more important than most of the personal information categories. Age, gender, and education level either had no discernible correlation with attrition or showed a correlation too weak or inconsistent to be considered a major factor. The only category investigated in the personal information group that showed a significant correlation with attrition was commuting time.
Employers can use the information found here to better prevent unnecessary loss in employee efficiency by taking a look at their own employment practices and comparing themselves to their competitors. By offering more competitive wages, as well as better employee amenities or services, a company can not only expect to retain more of their high level employees but should also expect to continue to make hires that will remain loyal to the company moving forward.
While the hypothesis was partially confirmed, it could not be fully upheld using the methods employed in this project. Additionally, many of the categories analyzed in this project may be susceptible to multicollinearity, such as the requirement of overtime and job satisfaction. To properly cross-examine all of the features in this dataset by hand using the methods of this project would be impractical. If the multicollinearity could be accounted for, machine learning techniques could be utilized to discover much more information about this dataset, such as predicting an employee’s likelihood of attrition or the salary level required to attain a desired level of certainty of retaining an employee. A feature importance list could also be generated to give an answer to the second half of the original hypothesis.
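One simple first screen for the kind of overlap mentioned above is the pairwise Pearson correlation between two features. The sketch below is only that, a pairwise check on hypothetical values; it is not a full multicollinearity diagnostic, and the encoding of overtime as 0/1 is an assumption made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values: overtime encoded as 1 (yes) / 0 (no), job satisfaction as a score.
overtime = [1, 1, 0, 0, 1, 0]
job_satisfaction = [2, 1, 4, 3, 2, 4]

r = pearson(overtime, job_satisfaction)
print(f"correlation between overtime and job satisfaction: {r:.3f}")
```

A coefficient near -1 or 1 would suggest the two features carry overlapping information, so their individual effects on attrition would be hard to separate without a model that accounts for both.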