Scraping Mobafire for the best champion.

Posted on May 21, 2017

Introduction to Mobafire

E-sports have gained a lot of popularity in the last few years. League of Legends (LOL) is one of the most played multiplayer online battle arena (MOBA) games in the world. LOL has a community following where the best players (called summoners) compete in world championships across the globe. This has resulted in community-based champion build sites where ranked summoners post ways to build a champion. Mobafire is one of the most popular and most referenced build sites on the internet, and it collects data from players around the world.

Background

LOL includes a number of maps, but the most played one is Summoner's Rift. A match on Summoner's Rift is played between two teams, the blue team and the red team, each of which sets out to destroy the opponent's building called the Nexus.

Fig 1: The map on the left shows an aerial view of Summoner's Rift, while the map on the right is a simplified version of the same map.

Each team selects 5 out of 136 champions. Every champion has a type that can be played in one of 3 lanes or the jungle.

  • Top lane: Played by champions with a large amount of health or a combination of health and damage.
  • Mid lane: Played by champions with high burst damage.
  • Bot lane: Populated by two types of champions. Champions with crowd control (cc) and damage, but low health, and support champions. Support champions usually have high levels of health and utility. Their prime role is to give health to the bot champ, help him/her escape from ambushes and make plays for the bot champs.
  • Jungle: Between the three lanes there is a vast amount of space where the jungle champion kills jungle monsters to gain gold and buffs. The jungle champion's additional role is to harass opponent champions in each lane so that friendlies can farm more and make purchases.

Scraped data source and format

The data for each champ was scraped from Mobafire, a community-driven LOL champion build site where summoners share the builds they have tested for their champions. The best champion builds are usually upvoted by the community. Mobafire has collected stats about each champ that include:

  1. Positions played
  2. How popular that champion is for each position
  3. The win rate for that position.
  4. A 3-point scoring system based on five abilities of the champion. This is Mobafire's own scoring system for each champion.

The scraped data includes:

  1. Name
  2. Alias
  3. Position 1
  4. Pick rate 1
  5. Win rate 1
  6. Position 2
  7. Pick rate 2
  8. Win rate 2
  9. Damage
  10. Toughness (health)
  11. Cc
  12. Mobility
  13. Utility

*The colored points are Mobafire's stats for each champion.

How the data was scraped
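The thirteen fields listed above map naturally onto one record per champion. The following is a minimal sketch of such a record and a CSV export, not the code used for this post: the class name `ChampionRow`, the field names, the helper `rows_to_csv`, and the example values are all hypothetical, and it assumes the secondary-role fields are empty for champions that only have a primary role.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class ChampionRow:
    """One scraped champion. Field names are illustrative, not Mobafire's."""
    name: str
    alias: str
    position_1: str
    pick_rate_1: float
    win_rate_1: float
    position_2: Optional[str]       # None when no secondary role is listed
    pick_rate_2: Optional[float]
    win_rate_2: Optional[float]
    damage: int                     # the five ratings use the 3-point scale
    toughness: int
    cc: int
    mobility: int
    utility: int


def rows_to_csv(rows):
    """Serialize rows to CSV text with one column per field."""
    buffer = io.StringIO()
    column_names = [f.name for f in fields(ChampionRow)]
    writer = csv.DictWriter(buffer, fieldnames=column_names)
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))  # None values become empty cells
    return buffer.getvalue()


# Placeholder values only; these are not real Mobafire numbers.
example = ChampionRow("Example", "the Placeholder", "Top", 10.0, 50.0,
                      None, None, None, 2, 3, 1, 1, 2)
csv_text = rows_to_csv([example])
```

Keeping the secondary-role fields optional makes it easy to separate single-role champions from versatile ones later in the analysis.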

The stats for each champion were not in the form of an HTML table; rather they were embedded in HTML attributes.

mobafire champion

Scraping the champion name was the easier part, using the following XPath code:
name = Selector(text=row).xpath('//div[@class="champ-list__item__name"]/b/text()').extract()

However, the Mobafire stats were embedded in attributes of the HTML elements. As seen in the picture below, each slice of the circle represents one of the five properties of a champion.

mobafire stats

Luckily, Selector also supports CSS selectors, which can read an attribute value directly:
damage = Selector(text=row).css('div[class="radial-stats"] ::attr(rating)').extract()[0]
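To make the two extraction steps concrete without depending on the live page, here is a rough sketch that uses Python's standard-library `xml.etree.ElementTree` in place of `Selector`. It is a stand-in, not the scraper used for this post. The HTML fragment is invented: only the class names `champ-list__item__name` and `radial-stats` and the `rating` attribute come from the code above, while the surrounding tags, the rating values, and the order of the five stats are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. Only the two class names and the
# `rating` attribute come from the post; everything else is assumed.
ROW = """
<div class="champ-list__item">
  <div class="champ-list__item__name"><b>Example Champion</b></div>
  <div class="radial-stats">
    <span rating="2"></span>
    <span rating="3"></span>
    <span rating="1"></span>
    <span rating="1"></span>
    <span rating="2"></span>
  </div>
</div>
"""

# Assumed order of the five slices; the real order on the page is not confirmed.
STAT_NAMES = ["damage", "toughness", "cc", "mobility", "utility"]


def parse_row(row_html):
    """Return the champion name and its five ratings from one row of markup."""
    root = ET.fromstring(row_html.strip())
    # Text node of <b> inside the name div (the role of .../b/text() above).
    name = root.find(".//div[@class='champ-list__item__name']/b").text
    # Attribute values inside the stats div (the role of ::attr(rating) above).
    stats_div = root.find(".//div[@class='radial-stats']")
    ratings = [int(el.get("rating")) for el in stats_div.iter()
               if el.get("rating") is not None]
    return {"name": name, **dict(zip(STAT_NAMES, ratings))}


champion = parse_row(ROW)
```

ElementTree only handles well-formed markup, so for real pages a tolerant HTML parser such as the `Selector` used above remains the better tool.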


Hypothesis

What is winning a match dependent on? Is it the champion's popularity or versatility?

Basic Stats

Distribution of champions

Pos 1 pie chart
Pos 2 pie chart

Fig 2. The pie chart on the left shows the distribution of champions that only have a primary role. The one on the right shows champions with additional, secondary roles.

Versatility

Fig. 3. The stacked bar plot shows which positions have the champions with the most versatile roles.

Based on Fig. 3, we can see that champions that have the highest versatility are top lane champions followed by mid lane champions.
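The counts behind a plot like Fig. 3 could be tabulated along the following lines. This is only a sketch under one assumption: a champion is treated as "versatile" when it has a secondary position listed. The function name `versatility_by_position`, the dictionary keys, and the three toy rows are hypothetical and do not come from the scraped data.

```python
from collections import Counter


def versatility_by_position(rows):
    """Count, per primary position, champions with and without a secondary role."""
    counts = {}
    for row in rows:
        bucket = counts.setdefault(row["position_1"], Counter())
        # Assumption: "versatile" means a secondary position is present.
        key = "versatile" if row.get("position_2") else "single_role"
        bucket[key] += 1
    return counts


# Toy rows for illustration only; not real champions or real roles.
toy_rows = [
    {"name": "A", "position_1": "Top", "position_2": "Mid"},
    {"name": "B", "position_1": "Top", "position_2": None},
    {"name": "C", "position_1": "Jungle", "position_2": None},
]
versatility = versatility_by_position(toy_rows)
```

Each inner counter holds the two segments of one stacked bar, so the result can be passed to any plotting library.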

Popularity

Fig. 4 The boxplots show the popularity (pickrate1 and pickrate2) for a champion's first and second role (pos1 and pos2).

Pick rates indicate popularity: the higher a champion's pick rate, the more popular it is among users. Fig. 4 shows that most LOL champions are selected mainly for their primary roles (left); secondary roles were not that popular (right).

Popularity vs. versatility

To determine whether games were won based on champion popularity or versatility, heat maps were generated to see if the popular champions regularly won the match in both positions. Fig. 4 already gives a subtle hint that versatility might not contribute to winning a match.

Fig. 5 The heat map shows the popularity of champions and their win rates for their primary and secondary roles.

Fig. 5 clearly indicates that the popularity or versatility of a champion has little to do with winning a game. The highest win rate recorded across both positions is 57%. Even for that champion, whose popularity was 80% for its primary role and 13% for its secondary role, the win rate did not exceed 57%.
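The heat maps are a visual check. A numeric complement, which was not part of the original analysis, is the Pearson correlation between pick rate and win rate: a value near 0 would be consistent with popularity having little to do with winning. The sketch below implements the coefficient with the standard library only; the function name `pearson` and the toy numbers are placeholders, not scraped values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy percentages for illustration only; not Mobafire data.
toy_pick_rates = [10.0, 20.0, 30.0, 40.0]
toy_win_rates = [52.0, 48.0, 51.0, 49.0]
r = pearson(toy_pick_rates, toy_win_rates)
```

Running the same function once on the primary-role columns and once on the secondary-role columns would mirror the two panels of the heat map.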


Conclusion

In an ideal situation, a highly popular champion (90% and above) would have a very high win rate (80% and above) for both positions. We can clearly see that the reality is not even close. These analyses show that picking a popular or versatile champion does not lead to winning the game. There are other variables, such as team composition and individual champion stats, that contribute to winning.

LOL expects all champions to keep leveling up until they reach level 18. It is also expected that, in the process of reaching level 18, a user will acquire the items needed to make their champion stronger than the enemy team's. According to Mobafire, there are five categories that can help scale the champion up. Based on these factors, the faster a user can increase the damage output, the better the team's chances of winning.

If Mobafire's 3-point scale is formulated such that damage is the most important determinant, we can derive the following formula:

The formula was applied to all champions, and the results can be seen in the Shiny app. The app can potentially be used to determine which champion would win in a 1 vs 1 fight. The results were fairly accurate and were extended to team match-ups.

About Author

Tariq Khaleeq

Tariq Khaleeq has a background in Bioinformatics and completed his master's at Saarland University, Germany. In his master's thesis, he worked on the prediction of non-coding genes in breast cancer. After his master's, he co-founded a company where...