Using Auction House Data to Evaluate Classic Cars
The skills I demonstrate here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
The used-car market has recently seen record-breaking sales. According to the Manheim U.S. Used Vehicle Value Index, used-car prices rose 5.3 percent in September 2021 and 27.1 percent from a year earlier.
Is this inflation also present in the classic car market?
If you want to know your car's value, you might use something like Kelley Blue Book (KBB), an online tool that appraises cars based on many factors. However, KBB does not cover cars older than 1992.
Seeing this gap, I set out to scrape auction data from the largest auction house in North America. The group sells thousands of cars a year, most of them more than 30 years old. The scraped features included year, make, model, origin, auction date, auction location, and whether the car sold. I collected data on nearly 13,000 cars from auctions held between 2016 and 2021.
My first goal was to group the data by manufacturer. The luxury sports car maker Porsche had a large sales volume every year, so I picked it first. To accurately assess a car's value, I had to target a specific year and model.
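As a rough illustration of that grouping step, here is a minimal Python sketch that counts sold lots per manufacturer and auction year. The rows and field names (make, auction_year, sold) are made up for the example and are not the actual scraped schema or data.

```python
from collections import Counter, defaultdict

# Hypothetical rows: field names and values are assumptions for illustration,
# not records from the scraped dataset.
lots = [
    {"make": "Porsche", "auction_year": 2019, "sold": True},
    {"make": "Porsche", "auction_year": 2019, "sold": True},
    {"make": "Porsche", "auction_year": 2020, "sold": True},
    {"make": "Porsche", "auction_year": 2021, "sold": False},
    {"make": "Austin", "auction_year": 2019, "sold": True},
    {"make": "Ferrari", "auction_year": 2020, "sold": True},
]


def sales_by_make_and_year(rows):
    """Count sold lots for each make, broken down by auction year."""
    counts = defaultdict(Counter)
    for row in rows:
        if row["sold"]:
            counts[row["make"]][row["auction_year"]] += 1
    return counts


counts = sales_by_make_and_year(lots)

# Total sold lots per make, used to pick the highest-volume manufacturer.
totals = {make: sum(by_year.values()) for make, by_year in counts.items()}
top_make = max(totals, key=totals.get)

print(top_make, dict(counts[top_make]))
```

Keeping the per-year breakdown, rather than only a total, makes it possible to check that a manufacturer sells consistently every year and not just in one large auction.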
Porsche Data
I chose a 1973 Porsche 911 Carrera RS 2.7 Touring. The spread in prices was wide: cars at the high end sold for $750K and those at the low end for $400K. This variation made it hard to say how accurate my predictions would be.
Austin Mini Cooper Data
Next, I chose an Austin Mini Cooper. These had a narrower spread in dollar terms, with prices ranging from $5K to $40K. After analyzing more cars, I concluded that the more affordable cars saw less variation in price and so were easier to predict.
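To make the comparison concrete, the sketch below summarizes the dollar spread of two price lists. The individual prices are invented for illustration; only the low and high ends match the ranges quoted above. The helper name price_spread is also hypothetical.

```python
from statistics import mean, stdev


def price_spread(prices):
    """Summarize how widely a list of sale prices varies, in dollars."""
    return {
        "low": min(prices),
        "high": max(prices),
        "range": max(prices) - min(prices),
        "mean": mean(prices),
        "stdev": stdev(prices),  # sample standard deviation
    }


# Invented prices: only the endpoints come from the ranges in the text.
porsche_prices = [400_000, 520_000, 610_000, 750_000]
mini_prices = [5_000, 12_000, 22_000, 40_000]

porsche_spread = price_spread(porsche_prices)
mini_spread = price_spread(mini_prices)

print("Porsche range:", porsche_spread["range"])
print("Mini range:", mini_spread["range"])
```

This measures spread in absolute dollars, which is the sense in which the cheaper car varies less. A relative measure, such as the standard deviation divided by the mean, could rank the two cars differently.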
Data on the Origin of the Cars
Going further, I computed the average selling price of cars by the country in which they were made. In the graph below, we can see that all groups have rebounded strongly from 2020, similar to the consumer used-car market.
After analyzing more cars, I concluded that accurate predictions would require much more data. Further, I would need to build the code into an app (with a framework like Shiny) to set up a full user-facing experience. Accurate valuation of classic cars is possible, but it would take much more data and many more features to distinguish a car's value.
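One way such an average could be computed is sketched below in Python. The rows, the price field, and the function names are assumptions made for the example, and the numbers are invented rather than taken from the auction data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sold lots: field names and prices are invented for illustration.
sales = [
    {"origin": "Germany", "auction_year": 2020, "price": 100_000},
    {"origin": "Germany", "auction_year": 2020, "price": 120_000},
    {"origin": "Germany", "auction_year": 2021, "price": 150_000},
    {"origin": "Germany", "auction_year": 2021, "price": 130_000},
    {"origin": "UK", "auction_year": 2020, "price": 20_000},
    {"origin": "UK", "auction_year": 2021, "price": 25_000},
    {"origin": "UK", "auction_year": 2021, "price": 35_000},
]


def average_price_by_origin_and_year(rows):
    """Average sale price for each (origin, auction year) pair."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["origin"], row["auction_year"])].append(row["price"])
    return {key: mean(prices) for key, prices in grouped.items()}


def percent_change(averages, origin, start_year, end_year):
    """Percent change in average price for one origin between two years."""
    start = averages[(origin, start_year)]
    end = averages[(origin, end_year)]
    return (end - start) / start * 100


averages = average_price_by_origin_and_year(sales)
uk_rebound = percent_change(averages, "UK", 2020, 2021)

print(averages[("UK", 2021)], uk_rebound)
```

Averages like these are sensitive to which models happen to sell in a given year, so a single expensive lot can move a small group considerably.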
Collecting this dataset was the first step in creating such a tool.