Secret to Winning a League of Legends Game

Posted on Aug 22, 2016

Introduction

As a League of Legends fan and a data scientist, I never stop trying to combine the two things I love. In this project, I collected League of Legends game data with a well-structured scraping framework to support further analysis and exploration. About 35,000 rows of match data from more than 400 players were scraped. The dataset consists of the original data plus features engineered from my gaming experience, and it covers each player's information, her or his game performance statistics, and so on. With this informative dataset, I built not only a model that predicts game results, but also a pre-game strategy analysis function and a dating reminder function (an automatic break-up warning system) that helps users build the romantic relationship they want and find the right person sooner.

The data source website: na.op.gg

Data Structure and Scraping

There are 10 players in each game, split into 2 teams that fight each other. The website provides almost all player-level information for each game a player has played over a 2-month period.

Figure 1 - game information sample from na.op.gg

Don't panic if you are not familiar with the game's terms. Just assume that we are predicting the result of a fight between 2 teams of people. For each person in this fight, let's say we examine the following features (speed, strength and intelligence):

Figure 2 - table of a fighter's tests

If the number of tests is large enough, we can confidently use the average value of each feature to indicate a person's 'fight power', then derive the 'fight power' of each team, and finally use forecasting models to predict the result of a League of Legends game.
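The averaging idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not say how individual 'fight power' is combined into team 'fight power', so the sketch simply averages again at team level and uses the difference between the two teams as the model input. The fighter names and numbers are made up.

```python
from statistics import mean

FEATURES = ("speed", "strength", "intelligence")


def fighter_power(tests):
    """Average each feature over all of one fighter's tests."""
    return {f: mean(t[f] for t in tests) for f in FEATURES}


def team_power(fighters):
    """Average the per-fighter averages (assumed aggregation rule)."""
    powers = [fighter_power(tests) for tests in fighters]
    return {f: mean(p[f] for p in powers) for f in FEATURES}


def matchup_features(team_a, team_b):
    """Difference of team profiles: one input row for a forecasting model."""
    a, b = team_power(team_a), team_power(team_b)
    return {f: a[f] - b[f] for f in FEATURES}


# Illustrative numbers only (not taken from the scraped dataset).
alice = [{"speed": 6, "strength": 4, "intelligence": 8},
         {"speed": 8, "strength": 6, "intelligence": 8}]
bob = [{"speed": 5, "strength": 9, "intelligence": 3}]
carol = [{"speed": 4, "strength": 5, "intelligence": 6}]

row = matchup_features([alice, bob], [carol])
print(row)
```

A positive value means the first team is ahead on that feature, a negative value means the second team is.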

To scrape the website, I used the Python package Selenium, because the website requires visitors to click the 'game details' button to display the full information for a game. Selenium lets us simulate a real user's behavior, so we can access content that could not otherwise be scraped with packages like Beautiful Soup.

The scraping process starts by visiting a player's page, and then:

1) Refresh the information by clicking the 'renew' button.

2) Choose the game type 'normal' by clicking the drop-down list.

3) Expand all game tabs so that all the information in the table can be scraped.

4) Open another web page, from which we can scrape each player's champion preferences.
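Steps 1 to 3 above could look roughly like the sketch below. It is not the script used for this project: the CSS selectors are hypothetical placeholders (the real na.op.gg markup is not shown in this post), and the function only assumes it receives an already-started Selenium WebDriver. Step 4 would repeat the same pattern on the champion page.

```python
# Hypothetical CSS selectors: placeholders to be replaced after inspecting
# the real page. None of them is taken from na.op.gg.
SELECTORS = {
    "renew": "button.renew",
    "game_type_normal": "li.game-type-normal",
    "game_details": "button.game-details",
    "game_row": "div.game-row",
}

CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def scrape_player_page(driver, player_url):
    """Follow steps 1-3 on one player's page and return the raw row texts.

    `driver` is an already-started Selenium WebDriver.
    """
    driver.get(player_url)

    # 1) Refresh the player's information.
    driver.find_element(CSS, SELECTORS["renew"]).click()

    # 2) Restrict the match list to 'normal' games.
    driver.find_element(CSS, SELECTORS["game_type_normal"]).click()

    # 3) Click every 'game details' button so the full tables are displayed.
    for button in driver.find_elements(CSS, SELECTORS["game_details"]):
        button.click()

    # Collect the visible text of each expanded game; parsing comes later.
    rows = driver.find_elements(CSS, SELECTORS["game_row"])
    return [row.text for row in rows]
```

With Selenium installed, a driver can be created with `webdriver.Chrome()` and passed in. A real run would most likely also need explicit waits (for example `WebDriverWait`) after each click, because the page updates asynchronously.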

Figure 3 - the tricky parts of scraping

For each player, about 80 game records are collected to estimate her or his skill level. For a 10-player game, it takes about 35 to scrape, after I optimized the script to collect data efficiently and made it flexible enough to cope with the problems caused by the website's complex HTML structure.

Feature Engineering

To make better predictions, I created four features from the original dataset, based on my understanding of the game. They are as follows:

Champion Win Rate: The player's win rate when she or he plays a certain champion, which brings the analysis down to the champion level.

Champion Top Number: The website has a list for each player indicating how well the player does with each champion. By reordering the list, I got a feature that depicts how good a player is with a certain champion.

Game Frequency: How many games a player played during a period. This feature was made on the belief that a player has to keep playing to stay familiar with a champion, just as you have to keep programming to stay good at it.

Champion Frequency: How many games a player played with a certain champion during a period.
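As a rough illustration, the four features could be computed from one player's match records as follows. The record layout, the field names and the sample data are assumptions made for this sketch, and Champion Top Number is approximated as the champion's 1-based position in the player's list, since the exact reordering is not described here.

```python
from collections import Counter

# Illustrative match records for one player (not real scraped data).
# The field names are assumptions about how a scraped row might be stored.
games = [
    {"champion": "Ashe", "win": True},
    {"champion": "Ashe", "win": False},
    {"champion": "Ashe", "win": True},
    {"champion": "Garen", "win": False},
]

# Hypothetical per-player champion list from the website, best champion first.
champion_ranking = ["Garen", "Ashe", "Lux"]


def champion_features(games, champion_ranking, champion):
    """Build the four engineered features for one player and one champion."""
    played = Counter(g["champion"] for g in games)
    won = Counter(g["champion"] for g in games if g["win"])
    n = played[champion]
    return {
        # Share of games won on this champion (None if never played).
        "champion_win_rate": won[champion] / n if n else None,
        # 1-based position in the player's champion list (None if absent).
        "champion_top_number": (champion_ranking.index(champion) + 1
                                if champion in champion_ranking else None),
        # All games in the observed period.
        "game_frequency": len(games),
        # Games on this champion in the observed period.
        "champion_frequency": n,
    }


features = champion_features(games, champion_ranking, "Ashe")
print(features)
```

Returning `None` for a champion the player has never played keeps "no data" distinct from a 0% win rate; how to fill such gaps before modeling is a separate decision.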

Modeling

Figure 4 - summary of the logistic regression model

Figure 4 illustrates the importance of each feature. We can see that Champion Win Rate is the most significant feature for predicting the game result, beating all of the original features. Another engineered feature, Champion Frequency, also has strong predictive power. We can conclude that predicting the game result at the champion level (drilling features down to the champion level) produces a strong model. What's more, applying this model to the test data gives 86.1% accuracy.
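For readers who have not seen logistic regression before, here is a minimal pure-Python sketch of the technique, fitted by batch gradient descent. It is a stand-in for illustration, not the model summarized in Figure 4: the two input columns (team differences in Champion Win Rate and in a scaled Champion Frequency) and the eight training rows are invented, and the sketch makes no attempt to reproduce the reported accuracy.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: predicted probability - label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Step against the average gradient.
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(w, b, x):
    """Estimated probability that the first team wins."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Invented rows: [win-rate difference, scaled champion-frequency difference].
X = [[0.20, 0.5], [0.10, 0.3], [0.15, -0.1], [-0.05, 0.2],
     [-0.20, -0.4], [-0.10, -0.3], [-0.15, 0.1], [0.05, -0.2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = first team won

w, b = fit_logistic(X, y)
print(predict_proba(w, b, [0.20, 0.5]))    # first team clearly ahead
print(predict_proba(w, b, [-0.20, -0.4]))  # first team clearly behind
```

In practice a library implementation would be the natural choice, since it also reports the coefficient summary that a figure like Figure 4 is built from.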

Have More Fun

Pre-game strategy analysis

With the model, we know what is important for winning a League of Legends normal game, and by scraping we can get our opponents' information before the real game starts. So we will know which opponent to focus on, and we can make wiser decisions when we know our enemies better.

Dating reminder

It is noteworthy that there is a button named 'Living Game' next to the refresh button, which allows a user to check the status of a player, and we scrapers can play with it.

By using the Python googlemaps package, we can find out how long it takes to go from place to place. So we can automatically remind our date, using another package that sends e-mail for us.

Let's say a girl is going to have dinner with her game-addict boyfriend at 5:00, and it takes 1 hour to get to the restaurant. She tells the function the time and destination of the date. At 3:20, my function finds that her boyfriend is still in a game, so an e-mail is sent as follows:

Dear John,

It's 3:20, and YOU START A GAME!

the average length of lol game is 40 mins

you have 50% chance to be dumped. so quit.

or date a girl who doesnt know data science next time 🙂
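The timing check behind a reminder like this one can be sketched with the standard library alone. This is an assumed reconstruction, not the original function: the travel time is passed in directly (in the project it would come from the googlemaps package), the in-game flag stands for what the 'Living Game' button reports, the calendar date is arbitrary, and `margin_minutes` is a hypothetical threshold.

```python
from datetime import datetime, timedelta

AVERAGE_GAME_LENGTH = timedelta(minutes=40)  # figure quoted in the e-mail


def minutes_of_slack(now, date_time, travel_time, in_game,
                     game_length=AVERAGE_GAME_LENGTH):
    """Minutes between the earliest possible arrival and the date.

    If the player is in a game, assume it has just started and will run
    for the average length before travel can begin.
    """
    ready_at = now + game_length if in_game else now
    arrival = ready_at + travel_time
    return (date_time - arrival).total_seconds() / 60


def should_remind(now, date_time, travel_time, in_game, margin_minutes=10):
    """Remind only when the player is in a game with little or no slack."""
    slack = minutes_of_slack(now, date_time, travel_time, in_game)
    return in_game and slack < margin_minutes


# The scenario from the text, on an arbitrary day.
now = datetime(2016, 8, 22, 15, 20)    # 3:20 pm
dinner = datetime(2016, 8, 22, 17, 0)  # 5:00 pm
travel = timedelta(hours=1)            # 1 hour to the restaurant

print(minutes_of_slack(now, dinner, travel, in_game=True))
print(should_remind(now, dinner, travel, in_game=True))
```

In this scenario an average-length game ends at 4:00 and the trip takes until 5:00, so there is no slack left at all, which is why the reminder fires.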

We can have tons of fun with scraping, can't we?

About Author

Shuheng Li

With the intention of becoming a great Data Scientist, Shuheng is an out-of-the-box thinker and self-motivator who focuses on Machine Learning and Statistics, and seeks challenging projects and competitions that push his skills to an advanced level....

