Python EDA Project: Salary Prediction
Introduction:
Salary prediction is an application of machine learning and statistical modeling that estimates an individual's salary from factors such as age, education, and experience. Accurate salary predictions are valuable for job seekers, employers, and policymakers alike. The objective of this project is to analyze the factors that affect salary. To do this, I explored the dataset described below by posing two key questions:
- Which features are strongly associated with salary?
- How do different features interact to affect salary?
By addressing these questions, I aim to uncover meaningful insights that can help improve the accuracy and usefulness of salary predictions in various contexts.
Data:
This dataset contains information about the salaries of employees at a company. Each row represents one employee, and the six columns are age, gender, education level, job title, years of experience, and salary.
It is important to note that this dataset is intended solely for educational use. Moreover, it was generated by large language models and not collected from actual data sources. Consequently, any commercial use of the dataset is strictly prohibited. For reference, here is a link to the dataset: https://www.kaggle.com/datasets/rkiattisak/salary-prediction-for-beginner
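As a minimal sketch of the row-and-column layout described above, the snippet below parses a few rows with Python's standard `csv` module. The column names are an assumption about how the Kaggle CSV is labeled, and the sample rows are invented for illustration; they are not taken from the dataset.

```python
import csv
import io

# Hypothetical sample in the assumed layout of the CSV file.
# Column names are an assumption; the values are invented for illustration.
SAMPLE_CSV = """Age,Gender,Education Level,Job Title,Years of Experience,Salary
32,Male,Bachelor's,Software Engineer,5,90000
28,Female,Master's,Data Analyst,3,65000
45,Male,PhD,Senior Manager,15,150000
36,Female,Bachelor's,Sales Associate,7,60000
"""

NUMERIC_COLUMNS = ("Age", "Years of Experience", "Salary")


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        for name in NUMERIC_COLUMNS:
            row[name] = float(row[name])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
print(len(rows), "rows;", len(rows[0]), "columns:", list(rows[0]))
```

To run this against the real file, the same `csv.DictReader` call can be pointed at an opened file handle instead of the in-memory string, after checking that the header matches the names assumed here.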
Exploratory Data Analysis:
Which features are strongly associated with salary?
Based on the data analysis, Age, Years of Experience, Education Level, and Job Title are strongly associated with salary. Specifically, more years of experience and a higher education level typically translate into higher salaries. The graphs below illustrate these relationships.
Fig.1. Salary increases with years of experience and education level
How do different features interact to affect salary?
All features are positively correlated with each other as well as with salary.
Fig.2. More years of experience accumulated correlate with higher income
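One way to quantify a statement such as "positively correlated" is the Pearson correlation coefficient. The sketch below computes it by hand for each pair of numeric columns. The numbers are invented for illustration, and Pearson's r is an assumption about the measure used; the original analysis may have used a different one.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, invented for illustration only.
columns = {
    "Age": [25, 30, 35, 40, 50],
    "Years of Experience": [1, 5, 8, 14, 22],
    "Salary": [40000, 65000, 80000, 110000, 160000],
}

# Compare every pair of numeric columns once.
names = list(columns)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"{a} vs {b}: r = {pearson(columns[a], columns[b]):.2f}")
```

A value of r close to 1 indicates a strong positive linear relationship. Categorical columns such as job title or gender would need to be encoded numerically before a coefficient like this applies to them.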
Fig.3a. Distribution of Salary by Education Level
Fig.3b. Distribution of Salary by Education Level
Fig.3a and Fig.3b show the effect of education level on earnings. In general, higher degrees correlate with higher salaries, but there are clear exceptions. As the violin plot shows, some individuals with only an undergraduate degree earn more than those with higher degrees. This is likely due to differences in years of experience and area of study.
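The overlap between education levels can also be checked numerically by summarizing salary per group. The following is a minimal sketch using invented (education level, salary) pairs rather than the actual data; the level names are assumptions.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (education level, salary) pairs, invented for illustration.
records = [
    ("Bachelor's", 45000), ("Bachelor's", 70000), ("Bachelor's", 130000),
    ("Master's", 80000), ("Master's", 100000), ("Master's", 120000),
    ("PhD", 110000), ("PhD", 140000), ("PhD", 170000),
]

# Collect salaries per education level.
by_level = defaultdict(list)
for level, salary in records:
    by_level[level].append(salary)

# Minimum, median, and maximum salary for each level.
summary = {
    level: {"min": min(vals), "median": median(vals), "max": max(vals)}
    for level, vals in by_level.items()
}

for level, stats in summary.items():
    print(level, stats)

# Overlap check: does the top bachelor's salary exceed the median master's salary?
overlap = summary["Bachelor's"]["max"] > summary["Master's"]["median"]
print("Bachelor's max above Master's median:", overlap)
```

In this toy sample the medians rise with degree level while the ranges overlap, which is the pattern a violin plot makes visible.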
Fig.4. Correlation Between Years of Experience, Age and Salary
Generally, older workers have more experience, which typically correlates with higher salaries. People who spend more years on a job are assumed to have acquired greater skills and responsibilities. That’s why older workers are often the ones found in higher positions, such as management roles, which typically come with increased pay.
Fig.5. Job Title vs. Salary
Fig.5 shows that higher-level roles (e.g. CEO, CTO) typically command higher salaries. Roles that carry greater responsibility, as indicated by C-level rank, or that require in-demand skills generally pay more.
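A comparison like this can be reproduced by ranking job titles by their mean salary. The sketch below assumes a list of (job title, salary) pairs; the helper name `rank_titles_by_mean_salary` and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rank_titles_by_mean_salary(records):
    """Return (job title, mean salary) pairs, highest mean first."""
    grouped = defaultdict(list)
    for title, salary in records:
        grouped[title].append(salary)
    return sorted(
        ((title, mean(salaries)) for title, salaries in grouped.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Hypothetical (job title, salary) pairs, invented for illustration.
records = [
    ("CEO", 250000), ("CTO", 230000), ("CTO", 250000),
    ("Software Engineer", 90000), ("Software Engineer", 110000),
    ("Sales Associate", 50000), ("Sales Associate", 60000),
]

ranking = rank_titles_by_mean_salary(records)
for title, avg in ranking:
    print(f"{title}: {avg:,.0f}")
```

Titles with only a handful of rows give unstable means, so it may be worth reporting the group size alongside each average.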
Conclusion:
Employees aiming to maximize their earning potential should focus on gaining experience, acquiring relevant skills, and considering role changes. They can also use predictive models and data insights to support salary negotiations by demonstrating how their experience and skills align with higher salary ranges. Industry benchmarks and networking opportunities can further strengthen their position in salary discussions.
Employers, on the other hand, should ensure that salary structures reflect experience, job responsibilities, and market trends, and should regularly review and adjust compensation to remain competitive. To retain employees rather than lose them to competitors offering better salaries, employers should create clear career paths and development opportunities that align with salary increases. That way, employees can see how to progress in their careers without having to leave the company.
Future work:
Another factor that can influence salaries is the location of the job. I had to leave it out of this analysis because the dataset does not include a location feature. Ideally, I would have analyzed how salaries vary across locations and how those differences affect businesses. This is worth pursuing in future work.
In particular, I would seek to answer the following questions:
- Which locations correlate with the highest salaries?
- Which locations correlate with the lowest salaries?
- Are the higher salaries consistent only with the increased cost of living in those areas, or are there other factors at play?
Exploring these questions would give a more comprehensive understanding of what drives salaries and lead to more informed conclusions.
Featured Image by Freepik