Clicks and Grades: Relationship Between Student Interactions With A Virtual Learning Environment and Outcomes

Posted on Oct 22, 2016


The recent growth in online courses has expanded the learning options for both traditional and nontraditional students. In these courses, students use a Virtual Learning Environment (VLE) to simulate the experience of a real-world classroom. Aside from the obvious benefits for students, one benefit that is only now being explored is the ability to assess, in real time, each student's level of interaction with the course material. Recent work has shown that such information can be used to spot students in danger of failing and to intervene in a timely manner. Such interventions, however, require human resources, which are costly and do not scale. To that end, analytics data from the Open University in the United Kingdom was analyzed to see whether there are any clear differences in usage between passing and failing students. The hope is that such information can be used to create an automated tool that helps students improve their overall grades.

The Data

The OULAD dataset (Open University Learning Analytics Dataset), as it's known, consists of 7 CSV files totaling more than 450 MB. It covers seven courses, with presentations starting in February or October, and two years' worth of data is available. Student demographics and time-coded student interactions with the VLE make up the bulk of the data and were the focus of the analysis.


[Figures: student access clicks histograms, course AAA, 2013J and 2014J]

The histograms above show the distribution of student access, measured in clicks, for the same class held a year apart. It is clear from the plots that most students register between 0 and 2,000 clicks. So are there differences between the number of clicks of the best students and those who failed?
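The per-student click totals behind histograms like these can be sketched roughly as follows. This is a minimal illustration, not the author's code: it assumes the interaction rows follow the layout of OULAD's studentVle.csv (columns code_module, code_presentation, id_student, id_site, date, sum_click), and the sample rows, the helper names and the bin width are made up for the example.

```python
import csv
import io
from collections import Counter

# Made-up rows in the assumed shape of OULAD's studentVle.csv.
SAMPLE_VLE = """code_module,code_presentation,id_student,id_site,date,sum_click
AAA,2013J,1001,546652,-10,4
AAA,2013J,1001,546652,3,6
AAA,2013J,1002,546614,0,1
AAA,2014J,1001,546652,5,2
AAA,2013J,1002,546614,12,2
"""

def total_clicks_per_student(rows, module, presentation):
    """Sum sum_click per student for one presentation of one module."""
    totals = Counter()
    for row in rows:
        if row["code_module"] == module and row["code_presentation"] == presentation:
            totals[row["id_student"]] += int(row["sum_click"])
    return dict(totals)

def bin_counts(values, width):
    """Count how many values fall into each [k*width, (k+1)*width) bin."""
    bins = Counter((v // width) * width for v in values)
    return dict(sorted(bins.items()))

rows = list(csv.DictReader(io.StringIO(SAMPLE_VLE)))
clicks_2013j = total_clicks_per_student(rows, "AAA", "2013J")
histogram = bin_counts(clicks_2013j.values(), width=5)

print(clicks_2013j)
print(histogram)
```

With the real file, the same function could be fed `csv.DictReader(open("studentVle.csv"))`, and a wider bin (a few hundred clicks, say) would be more appropriate.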

Content Access Time


Not only is the number of clicks per student important, but so is the mean time students take to access content relative to each other. While most students take similar times to access content, a smaller percentage take either a very long or a very short time. The question is whether the students who take the least time do the best. Likewise, do the students who take the longest end up failing the course?
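The post does not spell out how mean access time was computed. One plausible reading, sketched below, is the average of the day on which each interaction happened, assuming a `date` field that counts days relative to the course start (negative values meaning before the start), as in OULAD's studentVle.csv. The records and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up (id_student, date, sum_click) records. `date` is assumed to be the
# day of the interaction relative to the course start (negative = before start).
interactions = [
    ("1001", -5, 3),
    ("1001", 15, 2),
    ("1002", 40, 7),
    ("1002", 60, 1),
    ("1003", 100, 4),
]

def mean_access_day(records):
    """Unweighted mean interaction day per student (an assumed definition)."""
    days = defaultdict(list)
    for student, day, _clicks in records:
        days[student].append(day)
    return {student: mean(values) for student, values in days.items()}

access = mean_access_day(interactions)
print(access)
```

Weighting each day by its click count would be an equally defensible definition, and it could shift the picture for students whose activity is concentrated in a few sessions.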

Clicks and Grades


The relationship between the number of clicks and the overall student outcome can be seen in the scatter plot above. Students who failed (green dots) show much less interaction with the VLE than those who passed. Unsurprisingly, the best students, on average, accessed the VLE more than anyone else.
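A comparison like this amounts to grouping per-student click totals by outcome and summarizing each group. The sketch below shows one way to do that. It assumes outcome labels like those in OULAD's studentInfo.csv final_result column ("Distinction", "Pass", "Fail"), and all the numbers are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up per-student click totals and outcomes, keyed by student id.
total_clicks = {"1001": 2400, "1002": 300, "1003": 1500, "1004": 100, "1005": 900}
final_result = {
    "1001": "Distinction",
    "1002": "Fail",
    "1003": "Pass",
    "1004": "Fail",
    "1005": "Pass",
}

def mean_clicks_by_group(clicks, groups):
    """Average click total for each group label (outcome, age band, ...)."""
    grouped = defaultdict(list)
    for student, n_clicks in clicks.items():
        if student in groups:  # skip students with no group label
            grouped[groups[student]].append(n_clicks)
    return {label: mean(values) for label, values in grouped.items()}

by_outcome = mean_clicks_by_group(total_clicks, final_result)
print(by_outcome)
```

The same helper works for any categorical column, so passing a mapping from student id to age band instead of outcome would give the age comparison.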

Mean Access Times


In the mean access time scatter plot, students who passed or did well have a much narrower range of access times than the students who failed. In fact, students who failed were accessing the content earlier than those who passed, which was a bit surprising. One would assume that the earlier a student accesses content, the better they would likely do.

Age and Gender

[Figures: studentscore_aaa_2013j-05 and studentscore_aaa_2013j-06]

When the level of access is compared by age category, the results are unexpected. One would assume that younger students (< 35 years), being more tech savvy, would access the course content more than older students. This isn't the case according to the data. Students in the 55-and-over category were the ones using the VLE the most. One possible explanation is that younger students save the content to their local devices, so they have less need to continually interact with the VLE. As expected, there seems to be little difference in interaction levels between men and women, though statistical testing might show otherwise.
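The post runs no statistical test, so the following is only a sketch of how the gender comparison could be checked. It uses a two-sided permutation test on the difference in mean clicks, which avoids assuming that click counts are normally distributed. The two samples, the function name and the number of permutations are all made up for the example.

```python
import random

def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical per-student click totals for two groups.
clicks_women = [1200, 950, 1800, 400, 2100, 760, 1330, 990]
clicks_men = [1100, 870, 1950, 520, 1700, 640, 1410, 1020]

p_value = permutation_test(clicks_women, clicks_men)
print(f"estimated p-value: {p_value:.3f}")
```

A large p-value would be consistent with the impression that gender plays little role, while a small one would suggest the difference deserves a closer look.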

Conclusion and Recommendation

Students who interact more with the VLE tend to perform better. In contrast, students who access the materials early tend to perform worse. Interaction level also seems to be age-related, but gender doesn't seem to play much of a role. From these results, a data-driven widget could be added to the VLE to encourage students to interact with it in meaningful ways based on their current grades and demographics. This approach would be both scalable and individualized.

About Author

Nathan Stevens

Nathan holds a Ph.D. in Nanotechnology and Materials Science from the City University of New York graduate school, and has worked on numerous software and scientific research projects over the last 10 years. Software projects have ranged from...

Leave a Comment

Analyst May 2, 2018
Nice piece of analysis for the data. Can you please advise which tool you used to create these graphs as they look great and easy to interpret.
