Visualizing FIDE chess rating list

Posted on Feb 1, 2016

Contributed by Aravind Kolumum Raja. He is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program, taking place from January 11th to April 1st, 2016. This post is based on his first class project, R visualization (due in the 2nd week).

FIDE (Fédération Internationale des Échecs, or World Chess Federation) is an international organization and one of the largest sporting bodies in the world, connecting 158 national federations. FIDE ranks players according to the Elo system, devised by Arpad Elo, a professor of physics and a chess master. FIDE adopted the system in 1970, and it has remained the gold standard for rating players ever since.

The rating system is designed so that a player's performance is measured relative to the ratings of the opponents played, and each rating reflects the cumulative results achieved over a period of time against various opponents.

The expected score of a player (the probability of winning plus half the probability of drawing) is calculated from a logistic function of the rating difference. Roughly, between two players with an Elo difference of 400, the higher-rated one has an expected score of about 90%. FIDE publishes its Elo rating list every month for players from all countries, and this post analyses data from January 2016, the most recent list.
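To make the 400-point rule of thumb concrete, here is a minimal Python sketch of the commonly quoted logistic form of the Elo expected score. The function name is hypothetical, and the formula is the textbook version; FIDE's published conversion tables may differ slightly from it.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (win = 1, draw = 0.5).

    Textbook logistic Elo curve with a 400-point scale; this is an
    illustrative assumption, not FIDE's exact lookup table.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# A 400-point favourite has an expected score of 10/11.
print(round(expected_score(2400, 2000), 3))  # 0.909
```

The two expected scores always add up to 1, so the lower-rated player in this example expects about 0.09 points per game.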

rating distribution overall

Women constitute only 11% of the active player population, and the ratings seem to be approximately normally distributed with a skew to the left. Below is a rating density plot by sex for all players.

rating density plots across sex

The difference in mean ratings between male and female active players appears to be significant. A Welch two-sample t-test on the means confirmed a significant difference between the two groups. On average, there is a rating difference of about 200 points between the sexes (1771 versus 1575 in the output below).

Welch Two Sample t-test

data:  female and male
t = -65.427, df = 15707, p-value < 2.2e-16
alternative hypothesis: true difference in means is less than 0
sample estimates:
mean of F mean of M 
 1574.693  1771.426
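For readers who want to see what sits behind the `t` and `df` values in that output, the following is a small Python sketch of the Welch statistic and its Welch-Satterthwaite degrees of freedom. It is not the author's R code; the function name and the toy ratings are made up for illustration and are not FIDE data.

```python
from statistics import mean, variance


def welch_t(x, y):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    nx, ny = len(x), len(y)
    # Squared standard errors of the two sample means
    se2_x = variance(x) / nx
    se2_y = variance(y) / ny
    t = (mean(x) - mean(y)) / (se2_x + se2_y) ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_x + se2_y) ** 2 / (se2_x ** 2 / (nx - 1) + se2_y ** 2 / (ny - 1))
    return t, df


# Toy ratings invented for this example (NOT taken from the FIDE list)
female = [1500, 1620, 1580, 1490, 1700]
male = [1750, 1820, 1690, 1900, 1780, 1760]
t, df = welch_t(female, male)
print(t < 0)  # True: the first group has the lower mean
```

A negative `t` matches the one-sided alternative in the output ("true difference in means is less than 0"). Turning `t` and `df` into a p-value needs the Student t distribution, which the Python standard library does not provide.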

FIDE awards lifetime titles to approximately the top 7% of all its players, of which Grandmaster (GM) is the most coveted.

masters

Similar titles (WGM, WIM, WFM and WCM) are awarded to women. The following chart shows the distribution of top FIDE chess masters across the various titles. You will see some overlap between titles, especially between IMs and GMs around the 2400-2500 mark and between IMs and FMs (2300-2400).

Distribution of Masters

The Russians without doubt reign supreme when it comes to the number of titled players in the world. Germany is another strong chess-playing nation, followed by Spain and the USA.

Rplot11

An interesting observation from the data is the relationship between age and rating and its distribution across the (x, y) plane. The following plot shows that ratings are spread across all levels and all ages, making chess quite unusual among sports in this regard. It is indeed very hard to determine a player's rating from the player's age alone.

Universal across age

However, we do see a significant negative relationship between rating and age when we look at the set of Grandmasters. You can make an approximate guess of a GM's rating by subtracting 3.8 times the GM's age from 2676. At the highest level, there are no GMs participating in top tournaments after the age of 50.
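The rule of thumb above is simply a fitted straight line, and can be written as a short Python sketch. Only the intercept (2676) and slope (3.8 points per year) come from the post; the function name is hypothetical, and the result is a rough average trend, not a prediction for any individual player.

```python
INTERCEPT = 2676.0  # rating at age 0 implied by the fitted line in the post
SLOPE = -3.8        # rating points lost per year of age, from the post


def approx_gm_rating(age: float) -> float:
    """Rough GM rating implied by the linear fit: 2676 - 3.8 * age."""
    return INTERCEPT + SLOPE * age


# Compare the implied ratings of a 25-year-old and a 45-year-old GM.
for age in (25, 45):
    print(age, round(approx_gm_rating(age)))
```

Under this fit, a twenty-year age gap corresponds to a difference of about 76 rating points.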

Grandmaster ratings across age

A cause for concern is the stark difference between the age densities of male and female players. The plot below indicates that there are hardly any women players above the age of 25. It raises interesting questions that could possibly even explain the ratings difference between the sexes. It is possible that women are retiring too early and not pursuing the sport competitively as men do. Another possible explanation for the sharp drop in participation with age among women could be societal in nature.

Demands on time due to cultural and societal expectations may result in this low participation rate. There is also the other explanation, that women lose interest in the game after the age of 25, which seems highly unlikely given how the population is spread across various countries.

age distribution discrepancy

When it comes to the percentage of females among the playing population, it may come as a surprise that East Asian countries like Vietnam, Mongolia and China have the highest percentage of females among federations. Denmark and Switzerland are among the countries with the lowest female ratios.

percentage of females

femaleratiotop

The spread of players by age mirrors the demographic distribution of countries. Older players are found among the aging populations of Europe, whereas the youngest players are emerging from countries with more dynamic population growth such as Sri Lanka, the UAE and Korea. Denmark and Switzerland again feature near the top of the list, this time for the oldest active chess populations.

Average Player age across the World

olderagelist

It was interesting to look into the number of GMs per capita in each country or, roughly, the chance that a random person you run into is a Grandmaster. Iceland and Armenia are near the top of the list. Armenia has long been known as one of the strongest chess-playing nations in Europe.


Probability of running into a GM

GM_prob

Another observation was the confirmation of ratings inflation over the decades among Grandmasters. Below are two plots showing rating densities from 1975, 1985, 1995, 2005 and the present. The density curve has been shifting towards the right slowly but surely. The accompanying box plot shows the increase in outlier points and the shift of the whiskers (1.5 times the interquartile range) towards higher ratings over time.

density comparison of ratings data across years

box plot variance increase across Rating

Variance tests were conducted for the pairs (1985, 1995), (1995, 2005) and (2005, 2016), with the alternative hypothesis that the ratio of the variances is less than one. The p-values were 0.04, 0.04 and 4.3 × 10^-5 respectively, indicating that the variance of ratings has grown from one period to the next. This points to the need to consider deflationary factors when comparing ratings of top players across eras.
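As a rough illustration of the statistic behind such a test, the Python sketch below computes the ratio of two sample variances and its degrees of freedom. The post does not say which routine was used or which year was placed in the numerator; putting the earlier year on top is an assumption made here, and the function name and toy numbers are hypothetical.

```python
from statistics import variance


def variance_ratio(earlier, later):
    """Return (F, df1, df2), where F = var(earlier) / var(later).

    Assumption: the earlier sample goes in the numerator, so F < 1
    means the later sample is more spread out.
    """
    f_stat = variance(earlier) / variance(later)
    return f_stat, len(earlier) - 1, len(later) - 1


# Toy ratings invented for this example (NOT taken from the FIDE lists)
ratings_then = [2480, 2500, 2510, 2530, 2550]
ratings_now = [2450, 2500, 2540, 2600, 2680, 2750]
f_stat, df1, df2 = variance_ratio(ratings_then, ratings_now)
print(f_stat < 1)  # True: the later toy sample has the larger variance
```

The one-sided p-value is the lower tail of the F distribution with `df1` and `df2` degrees of freedom evaluated at `f_stat`; the Python standard library has no F distribution, so a statistics package such as SciPy would be needed for that last step.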

About Author

Aravind Kolumum Raja

Aravind obtained his Master's degree in Statistics from Columbia University in 2012 and is presently an Analyst with a global investment management firm based in New York. His primary interests are in Mathematics, Statistics & Machine Learning. He...