Topics from TED Talks

Posted on Feb 4, 2019

TED Conferences LLC posts talks online for free distribution, and they have been watched billions of times worldwide since the site was launched. The purpose of this project is to study how the topics of these videos have changed over time.

This is a web scraping project: all information was scraped from ted.com/talks using Selenium. More than 3,000 videos were scraped, and the items extracted from each one are: title, description, keywords (or related topics), month and year in which the talk took place, number of views, number of transcript languages, and number of comments.
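As an illustration of the fields listed above, one scraped talk could be held in a record like the one below. This is only a sketch: the names `TalkRecord` and `parse_count` are hypothetical, and the assumption that counts arrive as strings with thousands separators is mine, not something taken from the project's code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TalkRecord:
    """One scraped talk, with the fields listed in the post (names are hypothetical)."""
    title: str
    description: str
    keywords: List[str]   # related topics
    month: int            # month the talk took place (1-12)
    year: int             # year the talk took place
    views: int
    languages: int        # number of transcript languages
    comments: int


def parse_count(text: str) -> int:
    """Convert a count such as '1,234,567' to an int.

    Assumes counts are scraped as strings with comma separators.
    """
    return int(text.replace(",", "").strip())


example = TalkRecord(
    title="Example talk",
    description="Placeholder description",
    keywords=["technology", "science"],
    month=2,
    year=2019,
    views=parse_count("1,234,567"),
    languages=parse_count("25"),
    comments=parse_count("310"),
)
```

Keeping the counts as integers from the start makes the later aggregation steps simpler.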

You may find the code file here.

Data Analysis

Each video has several related keywords, and there are more than 400 different topics in total. The scatter plot below shows, for each topic, the number of videos against the average number of views.

Graph 1: Number of videos vs average of views for each topic
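The two quantities plotted in Graph 1 can be computed with a simple aggregation per topic. The sketch below assumes each talk is a dictionary with `keywords` and `views` entries; the function name `topic_stats` and the sample data are illustrative only and are not taken from the project.

```python
from collections import defaultdict


def topic_stats(talks):
    """Return {topic: (number_of_videos, average_views)}."""
    counts = defaultdict(int)
    total_views = defaultdict(int)
    for talk in talks:
        # set() guards against a keyword listed twice on the same talk
        for topic in set(talk["keywords"]):
            counts[topic] += 1
            total_views[topic] += talk["views"]
    return {t: (counts[t], total_views[t] / counts[t]) for t in counts}


# Made-up sample data, for illustration only
talks = [
    {"keywords": ["technology", "science"], "views": 1000},
    {"keywords": ["technology"], "views": 3000},
    {"keywords": ["success"], "views": 9000},
]
stats = topic_stats(talks)
```

Because a talk has several keywords, its views count toward the average of every topic it is tagged with.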

Graph 1 shows, surprisingly, that the topics with the most videos posted are not the most watched ones. To study the evolution of the topics over time, these two extremes were analyzed separately.

The graph below shows how the topics with more than 400 videos posted evolved over time.

Graph 2: Topics with more than 400 videos posted
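A series like the one in Graph 2 can be built by counting videos per topic and per year, then keeping only the topics above a threshold. The following is a minimal sketch: the `yearly_counts` helper, the dictionary layout and the sample data are assumptions for illustration.

```python
from collections import Counter, defaultdict


def yearly_counts(talks, min_videos):
    """Return {topic: {year: n_videos}} for topics with more than min_videos talks."""
    totals = Counter()
    per_year = defaultdict(Counter)
    for talk in talks:
        for topic in set(talk["keywords"]):
            totals[topic] += 1
            per_year[topic][talk["year"]] += 1
    return {
        topic: dict(sorted(per_year[topic].items()))
        for topic, n in totals.items()
        if n > min_videos
    }


# Made-up sample data; the post uses a threshold of 400 videos
talks = [
    {"keywords": ["technology"], "year": 2015},
    {"keywords": ["technology", "society"], "year": 2016},
    {"keywords": ["society"], "year": 2016},
    {"keywords": ["design"], "year": 2016},
]
series = yearly_counts(talks, min_videos=1)
```

With the real data, calling `yearly_counts(talks, min_videos=400)` would apply the threshold used in the post.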

The most posted topics are, in descending order: technology, science, culture, global issues, design, business and society. Technology was the most posted topic until 2016, when society had a significant increase in the number of videos. This might suggest a change in TED's repertoire.

The graph below shows how the topics with more than 4 million views evolved over time.

Graph 3: Topics with more than 4 million views

The most viewed topics are, in descending order: body language, introvert, mindfulness, success, time and evil. Success and time are two topics that may be strongly related. Here, it is not possible to see a significant change over time in the videos with more views.

Another factor studied is the relation among the number of views, the number of transcript languages and the number of comments. The data for Graph 4 covers the last 5 years, and the size of each circle represents the number of comments.

Graph 4: Relation among number of views, number of transcript languages and number of comments

From Graph 4, the number of views appears to increase when a video has more transcript languages, but no clear conclusion can be drawn about the influence of the number of comments.
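One way to put a number on this visual impression is a Pearson correlation coefficient between two of the variables. The post does not say such a coefficient was computed, so the sketch below is only a suggestion; the `pearson` helper and the sample values are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values, for illustration only
languages = [10, 20, 30, 40]
views = [1_000_000, 1_500_000, 2_500_000, 3_000_000]
r = pearson(languages, views)
```

A value close to 1 would support the positive relation seen in the graph, although a correlation alone does not show that more transcript languages cause more views.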

Future Work

For future work, studying the network among the topics and analyzing the central ones would help to give a wider view of how the topics evolve.
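A possible starting point for such a network is a co-occurrence graph, in which two topics are linked when they appear on the same talk. The sketch below is one way to do it, not the author's method: the helper names and the sample data are hypothetical, and the number of distinct neighbours is used as a very simple centrality measure.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count how often each pair of topics appears on the same talk."""
    edges = Counter()
    for keywords in keyword_lists:
        # sorted() gives each pair a single canonical order
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Number of distinct neighbours per topic (a simple centrality measure)."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Made-up sample data, for illustration only
keyword_lists = [
    ["technology", "science"],
    ["technology", "design"],
    ["science", "technology"],
]
edges = cooccurrence(keyword_lists)
centrality = degree(edges)
```

Topics with the highest degree are the ones connected to the most other topics, which makes them candidates for the "central" topics mentioned above.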

Scraping more information about the comments (such as comment date, replies and helpful ratings) would make it possible to analyze how people interact with a video or topic, for example whether there is more interaction on new videos than on older ones.

Conclusion

TED posts more talks about technology, but it seems that people are more interested in videos related to success and career. In addition, the number of transcript languages appears to have a positive effect on the number of views.

About Author


Stella Oliveira

Data scientist with a background in financial services and demonstrated experience managing data and deploying predictive models. Highly motivated to combine the ability to thrive in a fast-paced work environment with the fascination for generating insights from complex...
