Bookmakers vs Gamblers: How accurate are their predictions?

Posted on Nov 2, 2019
The skills I demonstrate here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.

Bookmakers tend to fare a lot better than gamblers, but just how accurate are their own predictions? Do they skew their odds to take advantage of certain natural human tendencies? In making this Shiny App, I set out to explore betting data from 10 betting agencies across 10 European football leagues from 2008 to 2016. My aims were to find out how far their predictions deviate from reality and which agencies are most successful in this respect. I also sought to steer punters towards bookmakers with more generous odds and, ideally, to find a chink in their armor for punters to exploit!

In my initial exploratory data analysis, these two pie charts were what really piqued my interest. There is a clear disparity between how often a match ends in a draw (25.3%) and how often the bookmakers actually predict one (0.132%). The odds offered for an away win are roughly in line with reality, so it is home wins that bookmakers tend to back instead of draws. Presumably they want to take advantage of punters' tendency to favour an outright win, and thus offer less generous odds on it. So I was intrigued to find out whether targeting draws might give better results.
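As a rough illustration of the comparison behind those charts, here is a minimal sketch in Python (not the App's own code). It assumes that a bookmaker's "prediction" is the outcome with the shortest decimal odds; the sample rows and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical sample rows: (home_odds, draw_odds, away_odds, actual_result).
# Results use "H" (home win), "D" (draw), "A" (away win).
matches = [
    (1.80, 3.50, 4.50, "H"),
    (2.60, 3.20, 2.70, "D"),
    (3.10, 3.00, 2.40, "A"),
    (1.50, 4.00, 6.50, "D"),
]

def implied_probabilities(home, draw, away):
    """Convert decimal odds to probabilities, normalised to sum to 1."""
    raw = [1 / home, 1 / draw, 1 / away]
    total = sum(raw)  # exceeds 1 by the bookmaker's margin
    return [p / total for p in raw]

def predicted_outcome(home, draw, away):
    """Assumption: the outcome with the shortest odds is the prediction."""
    odds = {"H": home, "D": draw, "A": away}
    return min(odds, key=odds.get)

# Count how often each outcome is predicted and how often it actually occurs.
predicted = Counter(predicted_outcome(h, d, a) for h, d, a, _ in matches)
actual = Counter(result for *_, result in matches)

for outcome in "HDA":
    print(outcome,
          f"predicted {predicted[outcome] / len(matches):.0%},",
          f"actual {actual[outcome] / len(matches):.0%}")
```

Because draw odds are rarely the shortest of the three, a rule like this almost never "predicts" a draw, which is one plausible reading of the gap between the two charts.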

I next looked into the spread of how much bookmakers lose on each successful bet. Each successful bet on a draw is on average more expensive for the bookmaker, but the spread is much narrower than for successful bets on a home or away win. Bookmakers also lose the least on successful bets on the home team winning, which suggests they are minimising their losses on the most popular option for punters.
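One way to reproduce this kind of comparison, sketched under the assumption that the bookmaker's loss on a winning unit stake equals the decimal odds minus 1, and that "spread" can be summarised by a standard deviation. The data and names below are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical sample rows: (home_odds, draw_odds, away_odds, actual_result).
matches = [
    (1.80, 3.50, 4.50, "H"),
    (2.60, 3.20, 2.70, "D"),
    (3.10, 3.00, 2.40, "A"),
    (1.50, 4.00, 6.50, "D"),
    (2.20, 3.30, 3.40, "H"),
]

def bookmaker_losses(matches):
    """Group the loss per winning unit stake (odds - 1) by actual outcome."""
    losses = defaultdict(list)
    for home, draw, away, result in matches:
        winning_odds = {"H": home, "D": draw, "A": away}[result]
        losses[result].append(winning_odds - 1)
    return losses

losses = bookmaker_losses(matches)
# Mean loss and population standard deviation per outcome.
summary = {
    outcome: (mean(values), pstdev(values))
    for outcome, values in losses.items()
}

for outcome, (avg, spread) in sorted(summary.items()):
    print(f"{outcome}: mean loss {avg:.2f}, spread {spread:.2f}")
```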

This next plot shows a tool which is interactive within the actual App. You can pick an agency and a scenario to bet on (for example, home wins), and the chart shows the percentage you would have won or lost had you bet on that outcome in every single match in a given league over 8 years.

This chart, for instance, shows how much you would lose if you bet consistently on draws with Bet365. It seems silly as a realistic betting strategy, but I was interested to see whether it would fare any better than betting on home wins or away wins. It turns out there doesn't seem to be any advantage in targeting draws.
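The percentage in question can be computed as follows. This is a minimal sketch that assumes a flat one-unit stake on the same outcome in every match; the function name and the toy data are hypothetical, and the toy numbers say nothing about the real result.

```python
# Hypothetical sample rows: (home_odds, draw_odds, away_odds, actual_result).
matches = [
    (1.80, 3.50, 4.50, "H"),
    (2.60, 3.20, 2.70, "D"),
    (3.10, 3.00, 2.40, "A"),
    (1.50, 4.00, 6.50, "D"),
    (2.20, 3.30, 3.40, "H"),
]

def flat_stake_return_pct(matches, outcome):
    """Percentage won or lost when staking 1 unit on `outcome` every match."""
    index = {"H": 0, "D": 1, "A": 2}[outcome]
    staked = len(matches)
    # A winning bet returns stake * decimal odds; a losing bet returns nothing.
    returned = sum(row[index] for row in matches if row[3] == outcome)
    return (returned - staked) / staked * 100

for outcome in "HDA":
    print(outcome, f"{flat_stake_return_pct(matches, outcome):+.1f}%")
```

A negative percentage means the strategy loses money overall, which is what the chart shows for real odds.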

Having determined that such an approach didn't yield the desired results, I looked closer to see if there were any trends within each league to suggest parts of the season where the home or away team has more of an advantage. This was averaged across 8 seasons, so I didn't expect obvious trends. This was true of most leagues. However, the plot above shows the Scottish Premier League where there does appear to be a mild trend in the likelihood of an away win over the course of the season.
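A sketch of how such a within-season trend could be measured, assuming each match row carries a matchweek number (called `stage` here) and that rates are pooled across seasons. The rows and names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (season, stage, result); "stage" is the matchweek number.
rows = [
    ("2008/2009", 1, "H"), ("2008/2009", 1, "A"), ("2008/2009", 2, "A"),
    ("2009/2010", 1, "D"), ("2009/2010", 2, "A"), ("2009/2010", 2, "H"),
]

def away_win_rate_by_stage(rows):
    """Share of away wins at each stage, pooled across all seasons."""
    counts = defaultdict(lambda: [0, 0])  # stage -> [away wins, matches]
    for _season, stage, result in rows:
        counts[stage][0] += result == "A"
        counts[stage][1] += 1
    return {stage: wins / total
            for stage, (wins, total) in sorted(counts.items())}

rates = away_win_rate_by_stage(rows)
for stage, rate in rates.items():
    print(f"stage {stage}: away win rate {rate:.0%}")
```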

Finally, by way of a guide to any punters deciding which bookmaker to use, I traced how successful each bookmaker's predictions had been over the 8 years, along with the average odds each agency offered each year. There is clearly a trend towards agencies offering more generous odds in order to attract punters away from competing agencies. The most competitive odds are offered by Pinnacle and VC Bet.
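A per-agency, per-year summary of this kind might be built as below. This is a sketch under two assumptions: "average odds" is taken as the mean of the home, draw and away odds for each match, and a prediction counts as successful when the shortest-odds outcome matches the result. The agency names and rows are placeholders.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (agency, year, home_odds, draw_odds, away_odds, result).
rows = [
    ("Agency A", 2008, 1.80, 3.50, 4.50, "H"),
    ("Agency A", 2008, 2.60, 3.20, 2.70, "D"),
    ("Agency B", 2008, 1.85, 3.60, 4.60, "H"),
    ("Agency B", 2008, 2.65, 3.25, 2.75, "D"),
]

def summarise(rows):
    """Average odds and prediction accuracy for each (agency, year) pair."""
    grouped = defaultdict(lambda: {"odds": [], "hits": []})
    for agency, year, home, draw, away, result in rows:
        odds = {"H": home, "D": draw, "A": away}
        favourite = min(odds, key=odds.get)  # assumed "prediction"
        grouped[(agency, year)]["odds"].append(mean(odds.values()))
        grouped[(agency, year)]["hits"].append(favourite == result)
    return {
        key: {
            "avg_odds": mean(group["odds"]),
            "accuracy": sum(group["hits"]) / len(group["hits"]),
        }
        for key, group in grouped.items()
    }

summary = summarise(rows)
for (agency, year), stats in sorted(summary.items()):
    print(agency, year,
          f"avg odds {stats['avg_odds']:.2f},",
          f"accuracy {stats['accuracy']:.0%}")
```

Higher average odds for the same matches indicate a more generous agency, so plotting `avg_odds` by year for each agency would show the trend described above.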

This is very much an unfinished project. I hope to find data to look at this from the bookmaker's point of view and see which scenarios are most profitable and the extent to which punters prefer to bet on the favourites or underdogs. 

If you would like to check out the App to explore some of the interactive plots, please do: I have placed links below to the App, the dataset, my GitHub account and LinkedIn. 

Beat the Bookies App

My LinkedIn

My Github

Kaggle Dataset


About Author

William Ponsonby

William Ponsonby is a data scientist currently studying at the NYC Data Science Academy. Prior to that he studied Russian, Czech and Slovak at Oxford University and did internships in Investment Analysis, Accounting, Advertising and Self-Storage in London,...