Data Analysis of Used Car Listings on Cars.com
The skills the author demonstrates here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Project Summary:
Cars can be a big-ticket purchase, but they are not investments. Data shows that a car’s value drops the moment it is driven off the dealership lot and keeps falling steadily afterward. Many variables factor into a car’s resale value; however, mileage and age stand out above all others.
To examine the magnitude of these two factors’ effects on a car’s resale value and rate of depreciation, I used Selenium to collect used car listing data from cars.com. To filter out high-end and sports car makes, I set a maximum price of $60,000 and limited the model year to 2000 through 2020. Listings in major US cities were scraped, and the data collected from each included: Title of listing (Year and Make), Mileage, Exterior Color, Interior Color, Transmission, and Drivetrain.
By examining the used car market, we can derive information on the habits of a car’s first owner and on the perceptions and sensitivities of the secondhand buyer. The data is further broken down by car make, categorized as “Luxury Brand” or “Non-Luxury Brand”, and by the listing’s city, to identify differences between the groups. Findings from the analysis could help car dealerships refine their marketing techniques to increase sales.
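The scraper itself is not shown in this post, so the following is only a minimal sketch of how the scraped text fields could be cleaned and filtered. It assumes titles begin with a four-digit year followed by the make, and that prices arrive as whole-dollar strings such as "$23,995". The function names (parse_title, parse_number, keep_listing) are hypothetical, and the price and year limits may equally have been applied through the site's search filters rather than in code.

```python
import re

MAX_PRICE = 60_000               # price ceiling stated in the post
MIN_YEAR, MAX_YEAR = 2000, 2020  # model-year range stated in the post


def parse_title(title):
    """Split a title such as '2017 Honda Accord EX' into (year, make).

    Assumption: the make is the first word after the year, so two-word
    makes (for example 'Land Rover') would need a lookup table instead.
    """
    match = re.match(r"\s*(\d{4})\s+(\S+)", title)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def parse_number(text):
    """Extract an integer from text like '$23,995' or '41,250 mi.'.

    Assumption: values are whole numbers with no decimal part.
    """
    digits = re.sub(r"[^\d]", "", text)
    return int(digits) if digits else None


def keep_listing(title, price_text):
    """Return True if the listing falls inside the year and price limits."""
    parsed = parse_title(title)
    price = parse_number(price_text)
    if parsed is None or price is None:
        return False
    year, _make = parsed
    return MIN_YEAR <= year <= MAX_YEAR and price <= MAX_PRICE
```

Keeping the parsing separate from the browser automation makes it possible to check the cleaning rules without launching Selenium.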
Initial Data Findings:
Looking at the data as a whole, the modal age is about 3 years and the modal mileage is about 40,000 miles. A multiple regression of Price on Age and Mileage shows that, on average, each additional year lowers a car’s price by roughly $590 and every five thousand miles driven lowers it by roughly $645.
Price = 29,303 - 590.4 * Age - 645.3 * Mileage (in units of 5,000 miles)
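To make the units in the fitted equation concrete, here is a small sketch that evaluates it for a given car. The coefficients are the ones reported above; the function name predict_price is hypothetical. It treats the relationship as strictly linear, which is how the regression models it, not necessarily how real prices behave.

```python
INTERCEPT = 29_303.0     # fitted intercept, in dollars
PER_YEAR = -590.4        # change in price per year of age
PER_5K_MILES = -645.3    # change in price per 5,000 miles driven


def predict_price(age_years, mileage_miles):
    """Evaluate the fitted line; mileage is converted to 5,000-mile units."""
    return (INTERCEPT
            + PER_YEAR * age_years
            + PER_5K_MILES * (mileage_miles / 5_000))
```

For a car at the modal age and mileage (3 years, 40,000 miles), predict_price(3, 40_000) gives 29,303 - 1,771.2 - 5,162.4, or about $22,369.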
After categorizing each listing’s car make as either “Luxury Brand” or “Non-Luxury Brand”, we see that the used car market is about 34% Luxury Brands and 66% Non-Luxury Brands. Splitting the data by city makes it clear that these proportions vary from city to city, with high-cost-of-living cities (New York City and San Francisco) having a greater proportion of Luxury Brands.
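The categorization and per-city split described above could be sketched as follows. The post does not list which makes were treated as luxury, so the LUXURY_MAKES set here is a purely illustrative assumption, and the function names are hypothetical.

```python
from collections import Counter

# Illustrative assumption only: the post does not say which makes
# the author counted as luxury.
LUXURY_MAKES = {"Audi", "BMW", "Lexus", "Mercedes-Benz"}


def brand_category(make):
    """Label a make with one of the two categories used in the post."""
    return "Luxury Brand" if make in LUXURY_MAKES else "Non-Luxury Brand"


def luxury_share_by_city(listings):
    """Return {city: fraction of listings that are Luxury Brand}.

    listings: iterable of (city, make) pairs.
    """
    totals = Counter()
    luxury = Counter()
    for city, make in listings:
        totals[city] += 1
        if brand_category(make) == "Luxury Brand":
            luxury[city] += 1
    return {city: luxury[city] / totals[city] for city in totals}
```

Comparing the returned fractions across cities is what surfaces the pattern noted above for New York City and San Francisco.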
Data on Differences Between Luxury and Non-Luxury
While the average price of a used Luxury Brand car is $10,000 more than that of a Non-Luxury Brand car, the average age and mileage of the two categories are similar. This suggests that new car buyers tend to behave similarly regardless of the type of vehicle they purchase. For car dealerships, this can be interpreted to mean that the optimal time to market to a new car owner is just over 4 years after their purchase, regardless of whether they own a Luxury Brand or not.
However, the depreciation rate does differ between car types. A Luxury Brand car loses $1,027 in value for each year that passes, while a Non-Luxury Brand car loses only $676. For every five thousand miles driven, a Luxury Brand car loses $716, compared to $438 for a Non-Luxury Brand car. Car dealerships can relay this information to their salespeople to help them tailor their pitch on why a potential buyer should trade in their used car now for a new one.
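As a worked illustration of what these rates imply, the sketch below adds up the age and mileage effects for each category. The rates are the ones reported above; the function name estimated_loss and the ownership scenario in the note that follows are assumptions chosen only for the example.

```python
# Dollar loss per year of age and per 5,000 miles, as reported in the post.
RATES = {
    "Luxury Brand": {"per_year": 1027, "per_5k_miles": 716},
    "Non-Luxury Brand": {"per_year": 676, "per_5k_miles": 438},
}


def estimated_loss(category, years, miles):
    """Linear estimate of value lost after `years` of age and `miles` driven."""
    rate = RATES[category]
    return rate["per_year"] * years + rate["per_5k_miles"] * miles / 5_000
```

For a hypothetical car sold after 4 years and 50,000 miles, the estimate is $11,268 for a Luxury Brand (4,108 + 7,160) and $7,084 for a Non-Luxury Brand (2,704 + 4,380), a gap of $4,184 under the linear model.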
Data on Differences Across Major Cities
When the data is grouped by the city in which each listing is located, we see that the average mileage and age of a used car listing vary from city to city. Car owners in Seattle tend to wait longer before selling their vehicle, while owners in Miami and San Francisco sell earlier. The data also shows that a vehicle’s mileage increases the longer the owner waits to sell. Houston is an exception, however, which suggests that the average owner there drives more each year than owners in other cities (this is definitely an area for more exploration).
After running a multiple linear regression on each city’s subset of the data, we can see from the differences between the coefficients that buyers in different major cities have different levels of sensitivity to age and mileage. A car’s value in Boston decreases by $939 every year, while it decreases by only $160 in San Francisco. However, every five thousand miles driven lowers a car’s value by $833 in San Francisco but only $601 in Boston. Overall, vehicles in San Francisco appear to depreciate more slowly than anywhere else. Factors responsible for these differences may include weather conditions, road structures, and the economic status of the residents.
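The per-city fits can be sketched without any external libraries by solving the least-squares normal equations directly. This is a minimal illustration, not the author's code: the function names are hypothetical, and the post does not say which regression tool was used.

```python
from collections import defaultdict


def fit_ols(rows):
    """Least-squares fit of price = b0 + b1*age + b2*mileage_5k.

    rows: iterable of (age, mileage_5k, price). Returns (b0, b1, b2).
    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination.
    """
    # Accumulate the augmented matrix [X^T X | X^T y].
    a = [[0.0] * 4 for _ in range(3)]
    for age, miles, price in rows:
        x = (1.0, float(age), float(miles))
        for i in range(3):
            for j in range(3):
                a[i][j] += x[i] * x[j]
            a[i][3] += x[i] * price
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear or too few rows")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                for c in range(col, 4):
                    a[r][c] -= factor * a[col][c]
    return tuple(a[i][3] / a[i][i] for i in range(3))


def fit_by_city(listings):
    """Group (city, age, mileage_5k, price) rows by city and fit each group."""
    groups = defaultdict(list)
    for city, age, miles, price in listings:
        groups[city].append((age, miles, price))
    return {city: fit_ols(rows) for city, rows in groups.items()}
```

Each city's result is an (intercept, age coefficient, mileage coefficient) tuple, so the age and mileage sensitivities can be compared across cities side by side, as done above for Boston and San Francisco.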