Research and Development Performance in Select Countries

Yannick Kimmel
Posted on May 3, 2016

Contributed by Yannick Kimmel. He is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program, taking place from April 11th to July 1st, 2016. This post is based on his first class project, R visualization (due in the 2nd week of the program).

I. Introduction

Technology is one of the few ways to increase the productivity of a nation. New technology is produced by research & development (R&D). Understanding how technology and R&D are changing is important to policy makers, economists, and researchers. I personally became interested in how R&D is changing in the USA and other countries after I became aware of the lack of R&D jobs available in manufacturing while finishing my PhD. The code for this project can be found here.

World development indicators dataset from World Bank

The data was originally taken from the World Bank's World Development Indicators dataset. It is a fantastic dataset with statistics on up to 247 countries and regions for 1,344 development indicators. The yearly indicators range from as early as 1960 to as late as 2015. I used the transformed dataset available on Kaggle. The transformed dataset is clean, with columns for country name, country code, indicator name, indicator code, year, and indicator value.

I was interested in the R&D output of several countries. I selected the USA, the European Union, and Japan because they are established high research performers. I also selected China, India, (South) Korea, and Russia because they are at least perceived as growing.

II. Results

Data was organized using R's dplyr package and graphed using R's ggplot2 package. The code used in this post is shown below:
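As a minimal sketch of that workflow (not the original project code), the snippet below filters one indicator for the selected economies with dplyr and draws one line per economy with ggplot2. The column names (CountryName, CountryCode, IndicatorCode, Year, Value), the ISO3 country codes, and the indicator code GB.XPD.RSDV.GD.ZS are assumptions about the Kaggle file, and the helper names are hypothetical. The toy rows are made up so the example runs without the real data.

```r
library(dplyr)
library(ggplot2)

# Assumed ISO3 codes for the seven economies discussed in this post.
countries <- c("USA", "EUU", "JPN", "CHN", "IND", "KOR", "RUS")

# Keep one indicator for the selected economies, dropping missing values,
# and sort the rows by country and year.
select_indicator <- function(indicators, code, keep = countries) {
  indicators %>%
    filter(IndicatorCode == code, CountryCode %in% keep, !is.na(Value)) %>%
    arrange(CountryName, Year)
}

# Draw one line per economy over time.
plot_indicator <- function(df, y_label) {
  ggplot(df, aes(x = Year, y = Value, colour = CountryName)) +
    geom_line() +
    geom_point() +
    labs(x = "Year", y = y_label, colour = "Country")
}

# Toy rows standing in for the real file; the values are placeholders.
toy <- data.frame(
  CountryName = c("Korea, Rep.", "Korea, Rep.", "Brazil"),
  CountryCode = c("KOR", "KOR", "BRA"),
  IndicatorCode = "GB.XPD.RSDV.GD.ZS",
  Year = c(2001, 2000, 2000),
  Value = c(2, 1, 3)
)

rd <- select_indicator(toy, "GB.XPD.RSDV.GD.ZS")
p <- plot_indicator(rd, "R&D expenditure (% of GDP)")
```

With the real data, `toy` would be replaced by something like `read.csv("Indicators.csv")` (the file name is also an assumption), and the same two helpers could be reused for each indicator by changing the code and the axis label.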

Input: R&D expenditure as % of GDP

A good metric for a country's R&D input is its R&D expenditure as a percentage of GDP. This normalization allows countries of different economic size to be compared. The figure below shows that the R&D expenditures of Korea and China are rapidly increasing, while those of Japan and the USA are high and steady. It is interesting that Korea now spends the most of the selected countries. Business Insider attributes this to Korea's desire to compete against China in quality rather than quantity.

R&D expenditures

R&D output: Journal publications

My first metric for R&D output was the number of scientific and technical journal articles published per year. This metric can be considered mostly an academic measure, as publications are helpful for spreading ideas but have limited financial benefits. This metric is not perfect, as the quantity of publications does not necessarily reflect their quality. For this exploratory analysis, however, the quantity of journal articles is suitable. Both the European Union and the USA publish a large number of articles (with the EU increasing), while Japan's output is decreasing. The number of publications in China is increasing and ranks third.

Scientific journal articles

R&D output: Patent applications

The number of patent applications as an R&D output is complementary to the number of published journal articles. This is because, while published journal articles do not have direct financial benefits, patent applications do, and can be considered an industrial R&D measure. The rapid increase in the number of patents in China reflects the high interest in manufacturing in China from both foreign companies and Chinese nationals. It is also worth noting that the number of patents in the USA and Korea (a small country) is also high and increasing, while it is decreasing everywhere else.

patents
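One way to get a single patent count per country and year, assuming the dataset reports resident and nonresident applications as two separate indicators, is to sum the two. The sketch below does that with dplyr. The indicator codes IP.PAT.RESD and IP.PAT.NRES, the column names, and the helper name are assumptions, and this is not necessarily how the figure above was produced.

```r
library(dplyr)

# Assumed indicator codes for resident and nonresident patent applications.
patent_codes <- c("IP.PAT.RESD", "IP.PAT.NRES")

# Sum resident and nonresident applications for each country and year.
total_patents <- function(indicators) {
  indicators %>%
    filter(IndicatorCode %in% patent_codes) %>%
    group_by(CountryName, CountryCode, Year) %>%
    summarise(Value = sum(Value), .groups = "drop")
}

# Toy rows with placeholder values; the third row is a different indicator
# and should be ignored by the filter.
toy_patents <- data.frame(
  CountryName = "China",
  CountryCode = "CHN",
  IndicatorCode = c("IP.PAT.RESD", "IP.PAT.NRES", "NV.IND.MANF.ZS"),
  Year = 2000,
  Value = c(10, 5, 30)
)

patents <- total_patents(toy_patents)
```

Keeping the two indicators separate instead of summing them would show how much of the growth comes from domestic versus foreign applicants.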

Number of Researchers in R&D

The figure below shows how the number of researchers is changing over time. Korea's count is rapidly increasing and the European Union's is also increasing, while the rest of the countries are relatively flat. The increase in researchers matches the increase in R&D output in Korea.

Researchers

Value added manufacturing

The last metric I was interested in was the percentage of GDP attributed to manufacturing. The figure below shows that manufacturing is a major part of China's economy, but its share is decreasing. Manufacturing overall has increased in Korea, but has dropped in recent years. Manufacturing is only a small percentage of the economies of the USA, Japan, and the European Union, and it continues to decrease.

Manufacturing

World map of manufacturing in 2013

I wanted to get a better understanding of the importance of manufacturing in the world. I used the package rworldmap to visualize manufacturing across the world. From the map, it is easy to see that manufacturing is still very important in Asia and Central and Eastern Europe.

worldmapmanufacturing
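A map like this can be sketched with rworldmap's joinCountryData2Map and mapCountryData functions: filter one indicator for one year, join the values to the package's country polygons by ISO3 code, and plot the joined column. The indicator code NV.IND.MANF.ZS, the column names, and the helper names are assumptions, and the toy values are placeholders.

```r
library(dplyr)
library(rworldmap)

# Keep one indicator for a single year, one row per country.
manufacturing_in_year <- function(indicators, year = 2013,
                                  code = "NV.IND.MANF.ZS") {
  indicators %>%
    filter(IndicatorCode == code, Year == year) %>%
    select(CountryCode, Value)
}

# Join the values to the world map by ISO3 code and draw a choropleth.
plot_manufacturing_map <- function(df) {
  world <- joinCountryData2Map(
    df, joinCode = "ISO3", nameJoinColumn = "CountryCode"
  )
  mapCountryData(
    world, nameColumnToPlot = "Value",
    mapTitle = "Manufacturing, value added (% of GDP), 2013"
  )
}

# Toy rows with placeholder values; the 2012 row should be filtered out.
toy_map <- data.frame(
  CountryCode = c("USA", "CHN", "CHN"),
  IndicatorCode = "NV.IND.MANF.ZS",
  Year = c(2013, 2013, 2012),
  Value = c(1, 2, 3)
)

manufacturing <- manufacturing_in_year(toy_map)
# plot_manufacturing_map(manufacturing)  # uncomment to draw the map
```

Regional aggregates such as the European Union have no polygon of their own, so the join would report them as unmatched and they would not appear on the map.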

III. Conclusions

China's R&D expenditure and output are rapidly increasing. R&D in Korea is also increasing, while the USA shows strong R&D outputs. However, there is a discrepancy between research and manufacturing output in the USA and the EU. This could explain why I noticed a lack of research jobs in the USA.
