Clicks and Grades: Relationship Between Student Interactions With A Virtual Learning Environment and Outcomes

Nathan Stevens
Posted on Oct 22, 2016


The recent growth in online courses has led to an increase in learning options for both traditional and nontraditional students. In these courses, students use a Virtual Learning Environment (VLE) to simulate the experience of a real-world classroom. Aside from the obvious benefits for students, one that is only now being explored is the ability to assess, in real time, the level of interaction with the course material. Recent work has shown that such information can be used to spot students in danger of failing and to intervene in a timely manner. Such interventions, however, require human resources, which are costly and do not scale. To address this, analytics data from the Open University in the United Kingdom was analyzed to see whether there are clear differences in usage between passing and failing students. The hope is that such information can be used to create an automated tool that helps students improve their overall grades.

The Data

The OULAD dataset, as it's known, consists of 7 CSV files totaling more than 450 MB. It covers seven courses, each running from February to October, with two years' worth of data available. Student demographics and time-coded student interactions with the VLE make up the bulk of the data and were the focus of this analysis.
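A minimal sketch of how the two main tables can be joined for this kind of analysis. The column names (`id_student`, `sum_click`, `date`, `final_result`, `age_band`) follow the published OULAD schema; the values below are toy stand-ins, since in practice each table would come from `pd.read_csv` on the corresponding file.

```python
import pandas as pd

# Toy stand-ins for two of the seven OULAD CSV files. In practice:
# student_info = pd.read_csv("studentInfo.csv")
# student_vle  = pd.read_csv("studentVle.csv")
student_info = pd.DataFrame({
    "id_student": [101, 102, 103],
    "age_band": ["0-35", "35-55", "55<="],
    "final_result": ["Pass", "Fail", "Distinction"],
})
student_vle = pd.DataFrame({
    "id_student": [101, 101, 102, 103, 103, 103],
    "date": [0, 5, 3, 1, 2, 10],          # days relative to module start
    "sum_click": [12, 4, 2, 30, 25, 18],  # clicks in one VLE session
})

# One row per VLE session, with the student's demographics attached.
interactions = student_vle.merge(student_info, on="id_student", how="left")
print(interactions.shape)
```

Joining on `id_student` like this gives a single table from which all of the per-student aggregates below can be computed.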


[Histograms: student access (clicks) for course AAA, 2013J and 2014J presentations]

The histograms above show the breakdown of student access (clicks) for the same course held a year apart. It is clear from the plots that most students register between 0 and 2,000 clicks. So are there differences between the click counts of the best students and those who failed?
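The quantity binned in those histograms is the total number of clicks per student, which falls out of a simple group-and-sum over the session log. A sketch with toy values (column names assumed from the OULAD schema):

```python
import pandas as pd

# Illustrative per-session click log in the studentVle.csv layout.
vle = pd.DataFrame({
    "id_student": [101, 101, 102, 103, 103],
    "sum_click": [800, 600, 150, 1200, 900],
})

# Total clicks per student -- the quantity binned in the histograms.
clicks_per_student = vle.groupby("id_student")["sum_click"].sum()
print(clicks_per_student.to_dict())
# A histogram would then be e.g.: clicks_per_student.plot.hist(bins=50)
```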

Content Access Time


Not only is the number of clicks per student important, but so is the mean time students take to access content relative to one another. While most students take similar times to access content, a small percentage take either very long or very little time. The question is: do the students who take the least time do the best? Likewise, do the students who take the longest end up failing the course?
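One way to measure this, assuming the OULAD convention that `date` is the day of an interaction relative to the module start, is the mean of that column per student. The values below are made up for illustration:

```python
import pandas as pd

# Each row is one VLE session; "date" is days since the module start.
vle = pd.DataFrame({
    "id_student": [101, 101, 102, 102, 103],
    "date": [0, 10, 40, 60, 5],
})

# Mean access time per student: on average, how far into the course
# each student's interactions fall.
mean_access = vle.groupby("id_student")["date"].mean()
print(mean_access.to_dict())  # {101: 5.0, 102: 50.0, 103: 5.0}
```

A low mean marks a student who front-loads their access; a high mean marks a student who engages late in the course.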

Clicks and Grades


The relationship between the number of clicks and the overall student outcome can be seen in the scatter plot above. Students who failed (green dots) exhibit much less interaction with the VLE than those who passed. Unsurprisingly, the best students were, on average, accessing the VLE more than anyone else.
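The pattern in the scatter plot can be summarized numerically by averaging click totals within each outcome group. A sketch with toy per-student totals (the `final_result` labels `Pass`/`Fail`/`Distinction` follow the OULAD coding; the click counts are invented):

```python
import pandas as pd

# Toy per-student click totals joined with final outcomes.
df = pd.DataFrame({
    "total_clicks": [2500, 1800, 300, 150, 3200],
    "final_result": ["Pass", "Pass", "Fail", "Fail", "Distinction"],
})

# Mean interaction level per outcome group -- failing students
# cluster at the low end of the click distribution.
avg_clicks = df.groupby("final_result")["total_clicks"].mean()
print(avg_clicks)
```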

Mean Access Times


The mean access time scatter plot shows that students who passed or did well have a much narrower range of access times than students who failed. In fact, students who failed were accessing the content earlier than those who passed, which was a bit surprising: one would assume that the earlier a student accesses content, the better they would be likely to do.
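Both observations, the wider spread and the earlier mean for failing students, can be checked with a grouped aggregation. The numbers below are toy values chosen only to illustrate the computation:

```python
import pandas as pd

# Toy per-student mean access days with outcomes.
df = pd.DataFrame({
    "mean_access_day": [20, 22, 25, 2, 40, 6],
    "final_result":    ["Pass", "Pass", "Pass", "Fail", "Fail", "Fail"],
})

# Range and center of access times within each outcome group:
# failing students span a wider range and, on average, access earlier.
spread = df.groupby("final_result")["mean_access_day"].agg(["min", "max", "mean"])
print(spread)
```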

Age and Gender

[Charts: student VLE access by age category and by gender, course AAA 2013J]

When the level of access was compared by age category, the results were unexpected. One would assume that younger students (< 35 years), being "more" tech savvy, would access the course content more than older students. According to the data, this is not the case: students in the 55-and-over category were the ones using the VLE the most. One possibility is that younger students save the content to their local devices and so have less need to continually interact with the VLE. As expected, there seems to be little difference in interaction levels between men and women, though statistical testing might show otherwise.
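The age comparison is another grouped mean, this time over the OULAD `age_band` column (whose coding uses `"55<="` for the oldest group). Again the click totals are invented for illustration:

```python
import pandas as pd

# Toy per-student click totals with OULAD-style age bands.
df = pd.DataFrame({
    "age_band":     ["0-35", "0-35", "35-55", "55<=", "55<="],
    "total_clicks": [900, 1100, 1500, 2600, 2400],
})

# Average interaction level per age band ("55<=" is the oldest group).
by_age = df.groupby("age_band")["total_clicks"].mean()
print(by_age)
```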

Conclusion and Recommendation

Students who interact more with the VLE tend to perform better. In contrast, students who access the materials early tend to perform worse. Interaction level also appears to be related to age, but gender does not seem to play much of a role. Based on these results, a data-driven widget could be added to the VLE to encourage students to interact with it in meaningful ways based on their current grades and demographics. Such an approach would be both scalable and individualized.

About Author

Nathan Stevens


Nathan holds a Ph.D. in Nanotechnology and Materials Science from the City University of New York graduate school, and has worked on numerous software and scientific research projects over the last 10 years. Software projects have ranged from...

