Python Survey 2017 Visualization with R and Shiny

Posted on Feb 4, 2019

Introduction

With the arrival of the Big Data era, data has become more and more important. People who understand data well can not only gain an advantage for their business but also understand their industries better. Because of this, data analysis tools such as Python, R, and SAS have become more and more popular. Since I have a background in computer science as well as applied mathematics and statistics, I decided to take a close look at the survey that JetBrains conducted among Python users in 2017. You can view my project via the link, and the code for the project is on GitHub.

Dataset

The data set I used comes from JetBrains, whose website provides reports on the Python community for both 2016 and 2017. Raw data is only available for 2017, which gives me a chance to analyze it in ways that differ from the published report. With this dataset, I hope to give my audience a better understanding of how Python is used around the world today. The dataset includes answers from 10,000 JetBrains users to the Python Developers Survey 2017. The survey asks 30 different questions, ranging from whether respondents use Python as their main language to the industries they work in.

Project

After viewing and cleaning the data set, I decided to build my project around six main questions: How much do respondents use Python? Which countries are these Python users from? What are their age ranges? What other languages do they use besides Python? What do they use Python for? What industries do they work in?

As you can see in the graph below, 85.3% of the users who finished the survey use Python: 67.5% use it as their main language, and 17.8% use it as a secondary language.
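The arithmetic behind these three figures can be sketched in a few lines of Python. This is only an illustration: the function name `usage_shares` is made up for this post, and the counts are hypothetical values per 1,000 respondents chosen to mirror the percentages above, not the real survey counts.

```python
def usage_shares(main, secondary, neither):
    """Return (main %, secondary %, combined %) rounded to one decimal."""
    total = main + secondary + neither
    main_pct = round(100 * main / total, 1)
    secondary_pct = round(100 * secondary / total, 1)
    combined_pct = round(100 * (main + secondary) / total, 1)
    return main_pct, secondary_pct, combined_pct

# Hypothetical counts per 1,000 respondents, chosen only to mirror the
# percentages quoted in the text; they are not the real survey counts.
main_pct, secondary_pct, combined_pct = usage_shares(675, 178, 147)
print(main_pct, secondary_pct, combined_pct)
```

The combined share is simply the sum of the main and secondary shares, because each respondent falls into exactly one of the three groups.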

In the Country section, I list the number of Python users in the 12 countries with the most Python users and plot them on a world map. The darker a country's fill color, the more Python users it has.

In the Age section, I list the age ranges of Python users, from under 17 to 60 or older. Although Python users span all ages, most are in their 20s and 30s, and the 21-29 range has the most Python users.

The Language section answers the question of what other languages Python users use. You can clearly see that almost 50% of developers also use JavaScript, and 49% use HTML as well.

That leads us to the question of what people use Python for. Since respondents could select multiple answers to this question, the percentages total more than 100%. Most of them use Python for either Data Analysis or Web Development (50% vs. 49%), followed by DevOps / System administration / Writing automation scripts, Programming of web parsers / scrapers / crawlers, etc.
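To see why the percentages for a multiple-answer question can add up to more than 100%, here is a minimal Python sketch. The five responses are made up for illustration and are not taken from the survey data, and the function name `multi_select_shares` is hypothetical.

```python
from collections import Counter

def multi_select_shares(responses):
    """Return the percentage of respondents who picked each option.

    Each response is a list of the options one respondent selected,
    so a single respondent can count toward several options.
    """
    # set(answer) guards against one respondent listing an option twice.
    counts = Counter(option for answer in responses for option in set(answer))
    total = len(responses)
    return {option: 100 * n / total for option, n in counts.items()}

# Toy responses (hypothetical, not survey data).
responses = [
    ["Data analysis", "Web development"],
    ["Data analysis"],
    ["Web development", "DevOps"],
    ["Data analysis", "DevOps"],
    ["Web development"],
]

shares = multi_select_shares(responses)
print(shares)
print(sum(shares.values()))
```

Each share is computed against the number of respondents rather than the number of selections, so every option is a percentage of the same group and the shares are free to overlap.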

As I mentioned at the beginning of this post, the world has entered the era of Big Data, so you can easily guess the answer to the question of which industries Python users work in: Information Technology / Software Development accounts for about 25% of the Python users among the 10,000 respondents.

Conclusion

In the future, I hope to find more datasets for other popular programming languages such as JavaScript, Java, C, and C# to continue this project and give people a better understanding of how different programming languages are used around the world today.

About Author

Weixing Yang

Data scientist with a background in big data analytics and intensive programming. I am currently seeking a position within a creative and dynamic work environment that gives me the opportunity to contribute my abilities and skill set gained...