Visualizing Data Trends in Primary Education

Posted on Aug 5, 2017


Data shows that primary education is a fundamental requirement for success. Regardless of how one might define the term "success", the skills attained in primary schooling are vital. Those of us who grew up in a first-world country with universal access to primary school may take for granted how important the early years of education are.

The problem:

As of 2013, nearly 60 million children of primary school age were not in school [1]. While this figure is down from 99 million in the year 2000, clearly more action needs to be taken.


Given that the non-poor essentially have full access to primary education, we can safely assume that those who aren't enrolled are also poor. At a regional level, the gaps are found in the poorer parts of the globe, like Sub-Saharan Africa and South Asia. These areas tend to be the poorest and the most affected by regional conflicts.

Dropout Data

[Figure: cumulative dropout rate to the last grade of primary education]

Using the data visualization tools available in R, we can graph specific indicators provided by the World Bank (link) to better portray our findings. The graph above shows the cumulative dropout rate to the last grade of primary education. A dropout rate is the proportion of pupils enrolled in a given grade who are no longer enrolled the following school year.
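To make the indicator concrete, here is a minimal Python sketch (the analysis in this post was done in R) of how a per-grade dropout rate and a cumulative dropout rate could be computed. The enrollment counts are invented for illustration and the function names are hypothetical. The cumulative figure assumes no grade repetition, which simplifies how the published indicator is actually constructed.

```python
def dropout_rate(enrolled, still_enrolled_next_year):
    """Share of pupils enrolled in a grade who are no longer enrolled a year later."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    return (enrolled - still_enrolled_next_year) / enrolled


def cumulative_dropout(per_grade_rates):
    """Share of a starting cohort that leaves school across the given grades.

    Simplifying assumption: no repetition, so every pupil who stays
    enrolled moves up exactly one grade per year.
    """
    surviving = 1.0
    for rate in per_grade_rates:
        surviving *= 1.0 - rate
    return 1.0 - surviving


# Hypothetical cohort: (pupils enrolled, pupils still enrolled a year later)
cohort = [(1000, 900), (900, 855), (855, 820)]
rates = [dropout_rate(enrolled, stayed) for enrolled, stayed in cohort]

print([round(r, 3) for r in rates])
print(round(cumulative_dropout(rates), 3))
```

Under the no-repetition assumption, the cumulative rate is simply the share of the original 1,000 pupils who are no longer enrolled after the final transition.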

Enrollment Data

Comparing GNI per capita, measured using the PPP (Purchasing Power Parity) method, we find Sub-Saharan Africa and South Asia lagging behind the rest of the world. In the next graph, we look at the net enrollment rate and find these two regions at the bottom end of the scale as well. This supports our assumption that poorer countries and regions are the most likely to lack access to primary education. While trends have shown some improvement, there is much more work to be done.

What can we do?

Now that we've identified where the problem areas are geographically, we can do a little more digging to understand some of the factors leading to the success (or failure) of primary education.

Using Pearson's method of correlation, we can use R to calculate the covariance of x and y divided by the product of the standard deviation of x and the standard deviation of y. The resulting coefficient ranges from -1 to 1 and shows how various indicators correlate with one another. Essentially, we are quantifying the interdependence between two indicators in order to identify what drives primary education success.
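As a rough illustration of that calculation, the following Python sketch computes Pearson's coefficient directly from its definition. It is not the code behind this post, which used R, and the two short data series are invented for illustration rather than taken from the World Bank data. The function name `pearson` is a hypothetical helper.

```python
import math


def pearson(x, y):
    """Pearson's r: cov(x, y) / (sd(x) * sd(y)), using sample (n - 1) estimates."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov_xy / (sd_x * sd_y)


# Invented values for illustration only, not World Bank figures.
pupils_per_teacher = [20, 25, 30, 40, 50]
literacy_rate = [95, 90, 85, 70, 60]

r = pearson(pupils_per_teacher, literacy_rate)
print(round(r, 3))
```

In R, the built-in `cor(x, y, method = "pearson")` returns the same quantity, so the manual version is only useful for seeing what the coefficient is made of.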

Identifying Success

For the sake of the data we have available, we can define success of primary education as the literacy rate in the adult population. In addition to serving as an indicator of primary education success, literacy is an important factor in reducing poverty. A study by the World Bank linking education and poverty found that "in all cases where detailed analysis of household data has been carried out, poverty rates are highest for households headed by illiterate people and decline with increased education of the household head." [2]

The correlation plot identifies the drivers of increased (or decreased) literacy rates in the population aged 25-64. Enrollment ratios and GPI (Gender Parity Index) are positively correlated with literacy rates, while pupil-to-teacher ratios are negatively correlated. The Gender Parity Index expresses females' access to education relative to males'. The closer it is to 1, the more equal access is. Gender parity is especially helpful when looking at certain countries and regions that may not prioritize female education.
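The Gender Parity Index can be read as a simple ratio, which the short Python sketch below illustrates. It assumes the usual definition of the index (the female value of an indicator divided by the male value). The region names and enrollment rates are made up for the example, and `gender_parity_index` is a hypothetical helper rather than part of any library.

```python
def gender_parity_index(female_value, male_value):
    """Ratio of the female to the male value of an indicator (1.0 means parity)."""
    if male_value <= 0:
        raise ValueError("male_value must be positive")
    return female_value / male_value


# Hypothetical enrollment rates in percent: (female, male)
regions = {
    "Region A": (72.0, 90.0),
    "Region B": (88.0, 89.0),
    "Region C": (95.0, 95.0),
}

for name, (female, male) in regions.items():
    gpi = gender_parity_index(female, male)
    print(f"{name}: GPI = {gpi:.2f}")
```

A value below 1 means girls are enrolled at a lower rate than boys, which is the pattern the index is meant to surface.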

Additionally, we note that dropout rates rise with the pupil-to-teacher ratio. In other words, as the number of students per teacher grows, students are less likely to succeed and stay in school for the following year. This leads to another cycle of illiteracy and, in turn, poverty.


If we want to get serious about tackling the lack of primary schooling for those in need, we must use data and data visualization tools to shed light on the issues. It's time to end the cycle of poverty, and we can do so by providing basic primary education to all children. Ensuring that students do not vastly outnumber their teachers appears to be an important factor in keeping students enrolled and improving literacy rates. If we can collect more data, we can answer the questions we have and raise additional ones we should be asking.

Further research shows that the highest returns in less developed countries come from primary education [3]. We must prioritize our future generations and give them a fighting chance to attain something better.

A child's success in life should not be dictated by the region in which they are born.


Link to my ShinyApp.

Link to my GitHub repository.






About Author

Mike Ghoul

Mike is a strategic analyst with 5 years of financial services experience coupled with data science skills and an insatiable drive to solve problems. While at Morgan Stanley, he built predictive compensation models forecasting future costs and presented...