Scraping Carousell.ph

Ira Villar
Posted on Nov 6, 2019

The Philippines is a beautiful country. I may have been born in the US, but this is still my country. Its beaches are vast, its valleys pristine, and its mountain ranges so beautiful they'll make you cry. The same can't be said about Manila.

The heart of the city is full of the urban poor. There's a great class divide, and the majority of the population lives in poverty and hopelessness. Buying things at full price just isn't an option.

So what option is left?

Carousell.ph is the premier buy-and-sell website across Southeast Asia, with branches in Malaysia, Singapore, Taiwan, Hong Kong, and even Australia. For my scraping project, I decided to scrape this site so I could compare listing prices and resale values for various items.

My original hypothesis was that the site is popular because of its collection of high-quality, affordable items. So I took a deep dive into the website.

 

My first step was to compute mean prices, by category, across the thousands of results I was able to scrape from the site. From there I noticed obvious outlier categories such as Real Estate, Cars, and even Antiques, so I took them out and kept a smaller group of categories I could compare.
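A minimal sketch of that first step, using only the Python standard library. The rows, prices, and function name below are made up for illustration and are not the scraped data or the code I actually ran; the outlier categories are removed by name, as described above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scraped rows: (category, price). Values are illustrative only.
listings = [
    ("Real Estate", 2_500_000.0),
    ("Real Estate", 4_000_000.0),
    ("Cars", 650_000.0),
    ("Motorbikes", 60_000.0),
    ("Motorbikes", 80_000.0),
    ("Photography", 15_000.0),
    ("Photography", 25_000.0),
    ("Health & Beauty", 500.0),
    ("Health & Beauty", 1_500.0),
]

def mean_price_by_category(rows):
    """Group prices by category and return {category: mean price}."""
    prices = defaultdict(list)
    for category, price in rows:
        prices[category].append(price)
    return {category: mean(values) for category, values in prices.items()}

# Categories set aside by hand (Antiques has no rows in this toy sample).
OUTLIER_CATEGORIES = {"Real Estate", "Cars", "Antiques"}

averages = mean_price_by_category(listings)
comparable = {c: p for c, p in averages.items() if c not in OUTLIER_CATEGORIES}

# Print the remaining categories from most to least expensive on average.
for category, price in sorted(comparable.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category}: {price:,.2f}")
```

Dropping whole categories by name is the simplest filter. A median or a trimmed mean per category would be another way to keep a few extreme listings from dominating the comparison.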

This gives a better look at the remaining categories. Motorbikes lead on price, but surprising followers included car and business services, as well as the assistive category, which includes wheelchairs, canes, and the like.

 

I then looked at three categories chosen at random: health and beauty, car parts, and photography. Interestingly, some of the top sellers seem to be groups or companies rather than individuals, as seen in usernames such as "facebeauty.shop" and "snycustoms".
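One way to surface the top sellers in a category is sketched below. It assumes "top seller" means the user with the most listings, which is only one possible definition. The two usernames come from the post, but the other usernames, the listing counts, and the `top_sellers` helper are hypothetical.

```python
from collections import Counter

# Hypothetical rows: (category, seller username). Counts are illustrative only.
listings = [
    ("Health & Beauty", "facebeauty.shop"),
    ("Health & Beauty", "facebeauty.shop"),
    ("Health & Beauty", "some_individual"),
    ("Car Parts", "snycustoms"),
    ("Car Parts", "snycustoms"),
    ("Car Parts", "another_user"),
]

def top_sellers(rows, category, n=3):
    """Return the n sellers with the most listings in one category."""
    counts = Counter(seller for cat, seller in rows if cat == category)
    return counts.most_common(n)

for category in ("Health & Beauty", "Car Parts"):
    print(category, top_sellers(listings, category, n=1))
```

A username ending in ".shop" or "customs" is only a hint that the account is a business; confirming it would take a look at the profile itself.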

 

As expected, newer cars are generally more expensive than used cars, although used cars make up the majority of the car listings (and even include the most expensive one).

 

I then focused my analysis on mobile phones in particular. In general, the prices seemed appropriate, but there were certain trends that I found interesting.

One 256 GB iPhone 11 cost $1,400 on Carousell, while getting one straight from the Apple Store would cost only $1,200.
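To put that gap in relative terms, here is a small worked calculation using the two iPhone 11 prices quoted above. The `markup` helper is a hypothetical name used only for this illustration.

```python
def markup(resale_price, retail_price):
    """Return the absolute and percentage premium of resale over retail."""
    difference = resale_price - retail_price
    percent = difference / retail_price * 100
    return difference, percent

# Figures from the post: $1,400 on Carousell vs. $1,200 at retail.
difference, percent = markup(1400, 1200)
print(f"Premium: ${difference} ({percent:.1f}% over retail)")
# prints "Premium: $200 (16.7% over retail)"
```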

An older model like the iPhone X was appropriately priced, but I found the resale value inflated again when it came to Samsung phones.

Samsung Galaxy Note10+ and Note10+ 5G phones were overpriced by a few hundred dollars as well.

 

While the general trend is that used items are cheaper and new items are more expensive, this doesn't always hold, especially for higher-end items such as latest-model mobile phones and luxury vehicles, which still cost a pretty penny even when used. There are resellers who possibly get early access to products only to resell them at higher prices (as seen with the iPhone example). What used to be a simple person-to-person market has changed, as discussed in the categorical analysis, with various small groups creating profiles for business transactions.

 

Future studies would benefit from point-of-sale and time data recorded upon deal completion, as well as more precise item classification in place of user-entered product titles.
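To illustrate the product-title problem, the sketch below maps free-text titles to canonical model names with regular expressions. It is not something I did in this project: the patterns, the sample titles, and the `canonical_model` helper are all hypothetical, and the matching is deliberately coarse (for example, an "iPhone 11 Pro" title would also be labeled "iPhone 11").

```python
import re

# Hypothetical patterns; more specific models are listed before general ones.
MODEL_PATTERNS = [
    ("iPhone 11", re.compile(r"\biphone\s*11\b", re.IGNORECASE)),
    ("iPhone X", re.compile(r"\biphone\s*x\b", re.IGNORECASE)),
    ("Galaxy Note10+ 5G", re.compile(r"\bnote\s*10\s*(\+|plus)\s*5g\b", re.IGNORECASE)),
    ("Galaxy Note10+", re.compile(r"\bnote\s*10\s*(\+|plus)", re.IGNORECASE)),
]

def canonical_model(title):
    """Return the first canonical model whose pattern matches, else None."""
    for model, pattern in MODEL_PATTERNS:
        if pattern.search(title):
            return model
    return None

# Made-up titles in the style a user might type.
for title in ["Brand new iphone11 256GB", "SAMSUNG note10+ 5g 256gb", "Galaxy Note 10 Plus"]:
    print(title, "->", canonical_model(title))
```

Hand-written patterns like these break down quickly as the catalog grows, which is why structured item fields from the site would be preferable.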

About Author

Ira Villar

Ira is currently a Data Science Fellow at the NYC Data Science Academy. He has a bachelor's degree in Biology and Chemistry.