Finding Data Trends in the Finance & Real Estate Sector
The skills I demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Introduction
Over the past year or so, we have seen many changes in our way of living due to the Covid-19 pandemic. In our work life, where we once commuted, however long it took, to reach the office, we have now adjusted to working from the comfort of our own four walls.
With the world slowly beginning to reopen, I was curious to understand the trends in job occupations in the Finance & Real Estate sector over the past several years. From a business perspective, a company may need to make changes in order to put itself in the best possible position. By analyzing a data set of job occupations by share, I would like to determine whether any changes need to be made to the structure of a company as offices begin to reopen.
Data Set
This data set, found on datausa.io, contained extensive information on Major, Minor and Broad Occupations, along with Average Wages and Total Populations for each occupation branch. At 1,234 rows by 24 columns, it offered a lot of data to work through.
Data Exploration & Cleaning
At first glance, the data set contains a lot of information that we don't need. My initial struggle was figuring out what to keep and what to discard. Originally, my intention was only to remove the columns that didn't seem relevant to the analysis. After looking at the dimensions of the Occupation columns and taking into account the time I had to analyze the data, however, it was clear that I had to group by Broad Occupations rather than dissect each facet of the Major and Minor Occupations.
Whereas Broad Occupation had 21 unique values, Detailed Occupation had 301. So after grouping by Broad Occupation, removing the columns we didn't need, and then aggregating the mean of the Total Population and Average Wage for every Broad Occupation, we had a data set ready to work with.
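The cleaning steps described above could be sketched in pandas roughly as follows. This is a minimal illustration, not the exact code used for the project: the column names, the extra "Slug" column, the helper name summarize_by_broad_occupation, and all of the values in the toy table are assumptions. Grouping by year as well as by Broad Occupation is also an assumption, made so that trends over time can be examined later.

```python
import pandas as pd

# Toy rows standing in for the real datausa.io export.
# Column names and all values are assumptions made for illustration.
raw = pd.DataFrame({
    "Year": [2018, 2018, 2018, 2019, 2019, 2019],
    "Broad Occupation": ["Management", "Management", "Sales",
                         "Management", "Management", "Sales"],
    "Detailed Occupation": ["Financial managers", "Property managers",
                            "Real estate brokers", "Financial managers",
                            "Property managers", "Real estate brokers"],
    "Total Population": [100, 50, 80, 90, 70, 60],
    "Average Wage": [90000.0, 60000.0, 50000.0, 94000.0, 62000.0, 52000.0],
    "Slug": ["a", "b", "c", "a", "b", "c"],  # example of a column we drop
})

# Compare the granularity of the two occupation levels.
n_broad = raw["Broad Occupation"].nunique()
n_detailed = raw["Detailed Occupation"].nunique()


def summarize_by_broad_occupation(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the needed columns, then average population and wage
    for each Broad Occupation in each year."""
    keep = ["Year", "Broad Occupation", "Total Population", "Average Wage"]
    return (
        df[keep]
        .groupby(["Broad Occupation", "Year"], as_index=False)
        .mean()
    )


summary = summarize_by_broad_occupation(raw)
print(summary)
```

Selecting the columns to keep before grouping means the mean is only computed over the two numeric columns of interest, which avoids errors from text columns.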
Data Analysis
The first observation concerned the population trend for each occupation. From 2014 to 2019, the average population of the largest occupations tended to decline, while most of the smallest occupations grew. The mean average wage, by contrast, generally rose, which was expected, as wages typically increase with inflation.
In general, as wages went up, the population went down, and that was also expected. That said, some interesting outliers raised questions. For example, why did the mean average wage decline over the years for Education, Training, & Library Occupations and for Construction & Extraction Occupations? Why did Healthcare Practitioners & Technical Occupations decrease in population, especially when healthcare became so crucial as we entered 2019? These were just some of the many questions that came to mind.
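One simple way to put a number on "declining" or "inclining" is to fit a straight line through each occupation's yearly values and look at the sign of the slope. The sketch below assumes a summary table with one row per Broad Occupation and year; the helper name yearly_slope, the column names, and the toy values are all hypothetical, and a least-squares line is just one possible choice of trend measure.

```python
import numpy as np
import pandas as pd

# Toy yearly summary; values are invented for illustration only.
summary = pd.DataFrame({
    "Broad Occupation": ["Management"] * 3 + ["Sales"] * 3,
    "Year": [2017, 2018, 2019, 2017, 2018, 2019],
    "Total Population": [100.0, 90.0, 80.0, 20.0, 25.0, 30.0],
    "Average Wage": [70000.0, 72000.0, 74000.0, 50000.0, 51000.0, 52000.0],
})


def yearly_slope(df: pd.DataFrame, value_col: str) -> pd.Series:
    """Least-squares slope of value_col per year, for each Broad Occupation.

    A negative slope suggests a decline over the period, a positive one growth.
    """
    slopes = {}
    for name, grp in df.groupby("Broad Occupation"):
        years = grp["Year"].to_numpy(dtype=float)
        values = grp[value_col].to_numpy(dtype=float)
        # Center the years so the fit is numerically well conditioned.
        slope = np.polyfit(years - years.mean(), values, deg=1)[0]
        slopes[name] = slope
    return pd.Series(slopes, name=f"{value_col} change per year")


population_trend = yearly_slope(summary, "Total Population")
wage_trend = yearly_slope(summary, "Average Wage")
print(population_trend)
print(wage_trend)
```

Sorting the resulting series would make it easy to list the occupations that are shrinking or growing fastest.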
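The relationship between wage and population can also be checked per occupation with a correlation coefficient, which makes outliers easier to spot. The following is a rough sketch under assumed column names and invented values; wage_population_correlation is a hypothetical helper, and with only a handful of yearly points per occupation the coefficients should be read as a hint rather than as evidence.

```python
import pandas as pd

# Toy yearly summary; values are invented for illustration only.
summary = pd.DataFrame({
    "Broad Occupation": ["Management"] * 3 + ["Sales"] * 3,
    "Year": [2017, 2018, 2019, 2017, 2018, 2019],
    "Total Population": [100.0, 90.0, 80.0, 20.0, 25.0, 30.0],
    "Average Wage": [70000.0, 72000.0, 74000.0, 50000.0, 51000.0, 52000.0],
})


def wage_population_correlation(df: pd.DataFrame) -> pd.Series:
    """Pearson correlation between wage and population across years,
    computed separately for each Broad Occupation."""
    corr = {
        name: grp["Average Wage"].corr(grp["Total Population"])
        for name, grp in df.groupby("Broad Occupation")
    }
    return pd.Series(corr, name="wage vs population correlation")


correlations = wage_population_correlation(summary)

# Occupations where wage and population move in the same direction
# run against the general pattern and may deserve a closer look.
against_pattern = correlations[correlations > 0].index.tolist()
print(correlations)
print(against_pattern)
```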
Conclusion
This analysis produced many insightful takeaways. Both common and less common trends emerge when viewing the overall picture. What is causing those trends can't be determined from this information, but it does show where we should start keeping an eye out.
From a business perspective, conditions change over the years, and adapting to those changes is essential to staying relevant and thriving. There are evident signs of trends taking place, and those trends may call for restructuring so that a company can run as efficiently and smoothly as possible. Although we can't make definitive decisions, these observations can inform possible considerations, and with newer data I believe beneficial changes can be made to how a company structures itself during the post-pandemic reopening.
Challenges & Future Research
One of the biggest obstacles in this project was the lack of data for the most recent years. Because the data set only covered 2014 to 2019, it was difficult to make any further assessment regarding post-pandemic culture. With time and more recent data, I believe a more accurate and decisive approach can be taken to how a company might structure itself when reopening after the pandemic. If time and resources allow, revisiting this analysis with newer data will be my next step toward a more solid inference.