Web Scraping Data on the Apple Mac App Store
The skills the author demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Apple devices are increasingly used in the workplace. From tech companies to graphics and design fields, data suggests that Apple devices have increased productivity and made employees' workloads easier. In 1982, Apple's former CEO, Steve Jobs, bought 19% of Adobe's shares, an investment that became a resource for individuals who work and study in the design field. Today, Apple offers thousands of apps that serve as useful tools for people in professional fields.
Today's laptop market is divided between Windows and Apple. The Mac App Store provides users with a variety of apps that can benefit them in their professional field. To build my knowledge of market research, I explored seven categories (Education, Business, Medical, Photography, Graphics & Design, Music, Video) in the Mac App Store. The objective of this web scraping project was to visualize which categories are likely to keep driving this market.
Tools: Scrapy using Python
Using Scrapy, I scraped each feature I needed and formatted the extracted information into a dataset.
Information and Data Extracted:
- Name
- Size (GB, KB, MB)
- App Category
- Languages
- App Rating (0-5)
- Price
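The fields listed above map naturally onto one record per app. As a minimal sketch (the class name `AppRecord`, the field names, and the sample values are illustrative assumptions, not the author's actual code), the scraped values could be held in a dataclass, which Scrapy also accepts as an item type, and written out as CSV rows:

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class AppRecord:
    """One scraped app; all field names here are hypothetical."""
    name: str
    size: str        # raw text such as "1.2 GB", cleaned in a later step
    category: str
    languages: str
    rating: str      # raw text on a 0-5 scale
    price: str       # raw text such as "Free" or "$4.99"


def records_to_csv(records):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(AppRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up sample row, for illustration only.
sample = [AppRecord("Example App", "1.2 GB", "Education", "English, French", "4.5", "Free")]
csv_text = records_to_csv(sample)
```

Keeping the raw strings at this stage leaves unit conversion and price parsing to the cleaning step.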
Cleaning:
Using tools from the pandas library in Python, I wrote functions to clean the raw dataset.
Data Analysis:
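The cleaning functions themselves are not shown, so the following is only a sketch of what two of them might look like. It assumes raw sizes look like "1.2 GB", that sizes use decimal units (1 GB = 1000 MB), and that prices are either "Free" or a currency string such as "$4.99". The function names are hypothetical.

```python
import re

# Assumption: decimal units, so 1 GB = 1000 MB and 1 KB = 0.001 MB.
_UNIT_TO_MB = {"KB": 0.001, "MB": 1.0, "GB": 1000.0}
_SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(KB|MB|GB)", re.IGNORECASE)


def parse_size_mb(raw):
    """Convert text like '512 KB' or '1.5 GB' to megabytes; None if unparseable."""
    match = _SIZE_PATTERN.search(raw or "")
    if match is None:
        return None
    value, unit = match.groups()
    return float(value) * _UNIT_TO_MB[unit.upper()]


def parse_price(raw):
    """Convert 'Free' to 0.0 and text like '$4.99' to 4.99; None if unparseable."""
    text = (raw or "").strip()
    if text.lower() == "free":
        return 0.0
    # Drop thousands separators before looking for the number.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None
```

With pandas, functions like these could be applied column-wise, for example `df["size_mb"] = df["size"].map(parse_size_mb)`, assuming a DataFrame `df` with a raw `size` column.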
How many apps are in each category?
- Educational apps are the most common. One possible reason is that Apple sells its laptops and iPads at a discounted rate to current students.
Size (GB, MB, KB):
Which category takes up the most storage space (GB)?
- Graphics & Design apps take up the most storage space, while business applications take up the least.
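Counting apps per category is a simple tally. A minimal sketch using only the standard library, with made-up rows standing in for the cleaned dataset (the `category` key is an assumed column name):

```python
from collections import Counter

# Made-up rows for illustration; real rows would come from the cleaned dataset.
apps = [
    {"name": "App A", "category": "Education"},
    {"name": "App B", "category": "Education"},
    {"name": "App C", "category": "Business"},
]

category_counts = Counter(app["category"] for app in apps)

# most_common() lists categories from most to fewest apps.
for category, count in category_counts.most_common():
    print(f"{category}: {count}")
```

In pandas, the equivalent tally would typically be `df["category"].value_counts()`.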
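The post does not say whether categories were compared by total or by average size. The sketch below assumes the average and assumes sizes have already been converted to megabytes under a hypothetical `size_mb` key; the sample values are made up.

```python
from collections import defaultdict
from statistics import mean


def mean_size_by_category(apps):
    """Return {category: mean size in MB}, skipping apps with unknown size."""
    sizes = defaultdict(list)
    for app in apps:
        if app["size_mb"] is not None:
            sizes[app["category"]].append(app["size_mb"])
    return {category: mean(values) for category, values in sizes.items()}


# Made-up sizes, for illustration only.
apps = [
    {"category": "Graphics & Design", "size_mb": 1500.0},
    {"category": "Graphics & Design", "size_mb": 500.0},
    {"category": "Business", "size_mb": 40.0},
    {"category": "Business", "size_mb": None},
]
averages = mean_size_by_category(apps)
largest = max(averages, key=averages.get)  # category with the highest mean size
```

Because a few very large apps can pull a mean upward, a median (`statistics.median`) may be a more robust choice for skewed size data.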
Let's continue to explore a few of the categories.
- Each of the seven categories contains applications that vary in size.
- Depending on the storage available on their device, users can choose which application is best for them.
Price (Free / Paid) Data:
Paid Apps:
- The majority of applications are not free.
- I picked the ten most expensive apps in each category and compared their price ranges across categories.
- The majority of the paid apps are Graphics & Design applications. It appears that the applications that take up the most storage space are not free.
- These apps are also the most expensive ones.
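Selecting the ten most expensive apps in each category can be sketched as a group-then-rank step. The version below assumes prices are already numeric under a hypothetical `price` key; the function name and sample rows are illustrative.

```python
import heapq
from collections import defaultdict


def top_priced_by_category(apps, n=10):
    """Return {category: the n most expensive paid apps, priciest first}."""
    by_category = defaultdict(list)
    for app in apps:
        if app["price"] > 0:  # keep paid apps only
            by_category[app["category"]].append(app)
    return {
        category: heapq.nlargest(n, items, key=lambda app: app["price"])
        for category, items in by_category.items()
    }


# Made-up rows, for illustration only.
apps = [
    {"name": "App A", "category": "Graphics & Design", "price": 49.99},
    {"name": "App B", "category": "Graphics & Design", "price": 9.99},
    {"name": "App C", "category": "Graphics & Design", "price": 0.0},
    {"name": "App D", "category": "Music", "price": 19.99},
]
top = top_priced_by_category(apps, n=1)
```

Categories with fewer than `n` paid apps simply return all of their paid apps.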
Free Apps:
- We all love free items. The majority of educational applications are free, giving individuals free resources to build knowledge in their field of study.
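One way to check a claim like "the majority are free" is to compute the share of free apps per category. This is a sketch under the assumption that prices are numeric and that free apps have a price of 0; the key names and sample rows are made up.

```python
from collections import Counter


def free_share_by_category(apps):
    """Return {category: fraction of its apps whose price is 0}."""
    totals = Counter()
    free = Counter()
    for app in apps:
        totals[app["category"]] += 1
        if app["price"] == 0:
            free[app["category"]] += 1
    return {category: free[category] / totals[category] for category in totals}


# Made-up rows, for illustration only.
apps = [
    {"category": "Education", "price": 0.0},
    {"category": "Education", "price": 0.0},
    {"category": "Education", "price": 2.99},
    {"category": "Business", "price": 14.99},
]
shares = free_share_by_category(apps)
```

A share above 0.5 means most apps in that category are free.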
Ratings:
- Now that we have looked at price and size, ratings are also essential in deciding which applications are worth buying and downloading.
- Six of the seven categories consist mostly of highly rated applications; Business is the exception.
Summary:
In conclusion, users within these seven professional fields have a variety of applications to choose from. Graphics & Design appeared to be the category that takes up the most storage space, tends to be the most expensive, and has average ratings.
On the other hand, educational applications cost little or nothing and take up less storage space. After visualizing all of this information, it is fair to predict that both Education and Graphics & Design applications are benefiting the Apple market. As new applications are released, an increase in profits is likely.
Future Work:
- Scraping the total ratings
- Scraping the Windows application store and comparing its features to Apple's