Breaking Down the Elements of a Kickstarter Project

Regan Yee
Posted on Nov 3, 2016

Got an idea for a great project but lack the funds to execute it? Want to spread the word about a project you think the world should know about? Want to invest in innovative new products or meaningful causes? If you answered yes to any of those questions, Kickstarter is the site for you. With thousands of ideas pitched on the site, how does one cut through the noise and present a project in a way that makes it likely to be funded? Why do some projects, such as the infamous Potato Salad Kickstarter, get over-funded while others never get any traction?

[Image: the Potato Salad Kickstarter]

How does this get more backers than your great idea?

To explore these questions, I built a Shiny app on a Kickstarter data set scraped by webrobots.io. The goal is to let end users explore whichever Kickstarter categories interest them through visual EDA (exploratory data analysis) tools. The data set is scraped monthly, and my app currently uses the scrape from October 15, 2016. Although the data set contains projects from many locations around the world, I limited the scope to projects within the United States in order to focus on finding meaningful insights within one geographic region. You can find the latest version of the Shiny app here.
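The scoping step described above can be sketched in a few lines. The app itself is an R Shiny app, so this Python snippet is only an illustration, not the app's code. The sample records and the field names `name` and `country` are assumptions made for the example, not a description of the actual scraped schema.

```python
import json

# Hypothetical JSON-lines sample; the records and the field names
# "name" and "country" are assumptions for illustration only.
raw = """{"name": "Project A", "country": "US"}
{"name": "Project B", "country": "GB"}
{"name": "Project C", "country": "US"}"""


def us_only(lines):
    """Parse one JSON record per non-empty line and keep the US projects."""
    records = (json.loads(line) for line in lines if line.strip())
    return [r for r in records if r.get("country") == "US"]


us_projects = us_only(raw.splitlines())
print([p["name"] for p in us_projects])
```

Filtering once, right after loading, keeps every later view (category shares, crosstabs, charts) working on the same US-only subset.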

The landing page of the app shows some high-level statistics about all US Kickstarter projects. The first thing it reveals is that broader, more generic categories contain more project ideas.

[Screenshot: app landing page showing the share of projects by category]

As we can see, the top five categories are music, film & video, publishing, art, and technology. Further down the list, the categories become more specific: comics and photography, with 2.6% and 1.7% of all projects respectively, could arguably be considered art, which has 10.2%. Journalism, at 1% of all projects, could arguably be folded into publishing, which has 12.5%. Broken down by category, most project ideas are for creative subjects (music, film, publishing, and art). Technology, although it makes the top five, trails these creative categories. One can speculate that users with tech ideas gravitate toward more traditional venture capital, or think that their ideas require too much funding to be raised on Kickstarter.
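To make the roll-up concrete, here is a small Python sketch that folds the specific categories into their broader parents using only the percentages quoted above. The parent mapping is my assumption for illustration; Kickstarter lists these categories separately.

```python
# Shares quoted in the post (percent of all US projects).
shares = {
    "art": 10.2,
    "comics": 2.6,
    "photography": 1.7,
    "publishing": 12.5,
    "journalism": 1.0,
}

# Assumed grouping for illustration; Kickstarter keeps these separate.
parent = {"comics": "art", "photography": "art", "journalism": "publishing"}


def roll_up(shares, parent):
    """Add each category's share to its parent (or to itself if it has none)."""
    rolled = {}
    for category, pct in shares.items():
        top = parent.get(category, category)
        rolled[top] = rolled.get(top, 0.0) + pct
    # Round to one decimal to hide floating-point noise.
    return {category: round(pct, 1) for category, pct in rolled.items()}


rolled = roll_up(shares, parent)
print(rolled)
```

Under that grouping, art would account for about 14.5% of projects and publishing for about 13.5%, which reinforces the creative skew.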

Now that we have a view into the overall usage of Kickstarter, let's break down the projects by geographic region. In the "Crosstab" section of the Kickstarter Explorer Shiny app, any user can break down the number of Kickstarter projects by state and category. Before doing any analysis, I had a preconceived notion that New York would have more art and film projects whereas California would have more tech project ideas. Let's look at the actual breakdown:

[Screenshots: Crosstab view of project counts by state and category]

Looking at the data from this angle, we can see that while technology isn't California's largest category (the state does have Hollywood, after all!), California has a relatively higher percentage of its ideas in tech. This also reaffirms the view that Kickstarter usage is skewed towards the film and music communities.
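A state-by-category crosstab like this boils down to counting pairs and converting each state's row to percentages. Below is a minimal Python sketch of that idea. The rows are invented for illustration and do not come from the real data set, and the app's own R implementation may differ.

```python
from collections import Counter, defaultdict

# Hypothetical (state, category) pairs; not taken from the real data set.
rows = [
    ("NY", "film & video"), ("NY", "art"),
    ("NY", "technology"), ("NY", "film & video"),
    ("CA", "film & video"), ("CA", "technology"),
    ("CA", "technology"), ("CA", "music"),
]


def crosstab_percent(rows):
    """Return {state: {category: percent of that state's projects}}."""
    counts = defaultdict(Counter)
    for state, category in rows:
        counts[state][category] += 1
    table = {}
    for state, categories in counts.items():
        total = sum(categories.values())
        table[state] = {
            category: round(100 * n / total, 1)
            for category, n in categories.items()
        }
    return table


table = crosstab_percent(rows)
for state, row in table.items():
    print(state, row)
```

Comparing row percentages rather than raw counts matters here, because states with more projects overall would otherwise dominate every category.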

So far, we have seen what communities Kickstarter project ideas come from. The next logical step would be to take a look at these places and see which projects are successful. By going into the Project Explorer module, a user can look at a bubble chart of projects compared by goals and amount pledged. This module is useful for viewing outliers in the Kickstarter communities. We can see which project ideas hit it big with large amounts of money pledged as well as the projects that asked for too much and never got going. One example of a community that I explored was Colorado's live and successful projects:

[Screenshot: bubble chart of Colorado's live and successful projects]

Here we can see a project idea called Fidget Cube, which gained a lot of backers and a much larger amount pledged than its goal of $15,000. By identifying past and current successful project ideas, one may be able to study and mimic their attributes to try to repeat their success. This view is also very useful for getting a high-level understanding of what is trending within a project category. Interestingly enough, when I was doing this project, I showed this Kickstarter idea to my friend and he instantly backed the project to get a Fidget Cube himself. This demonstrates that the module is useful for finding the nuggets of data that may be masked behind the masses of other projects.
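One simple way to surface such outliers without a chart is to compare each project's pledged amount with its goal. The Python sketch below does that for a few made-up projects. The names, amounts, and both thresholds are assumptions for illustration and are not values used by the app.

```python
# Hypothetical projects; names and amounts are made up for illustration.
projects = [
    {"name": "Project A", "goal": 15000, "pledged": 300000},
    {"name": "Project B", "goal": 5000, "pledged": 5200},
    {"name": "Project C", "goal": 250000, "pledged": 1200},
]


def classify(project, big_win=10.0, stalled=0.05):
    """Label a project by its pledged-to-goal ratio.

    The thresholds are arbitrary assumptions: 10x the goal counts as a
    big hit, and under 5% of the goal counts as never getting going.
    """
    ratio = project["pledged"] / project["goal"]
    if ratio >= big_win:
        return "hit it big"
    if ratio <= stalled:
        return "never got going"
    return "typical"


labels = {p["name"]: classify(p) for p in projects}
print(labels)
```

On a bubble chart of goal against pledged amount, the first group sits far above the diagonal and the second far below it, which is why both stand out visually.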

Lastly, this app has a 'Data' module which shows all the data in a searchable data table. This is the 'throw in the kitchen sink' module, as it allows the user to browse and sort the data in any way they want. One question that I wanted to explore with this module was whether people were low-balling their goals to get successful projects. If so, how many of these low-balled projects were extremely successful? By sorting the Goals column from lowest to highest, I observed a large number of projects with a goal of $1. More surprisingly, a lot of these projects were extremely successful when you consider the ratio of pledged amount to goal.

[Screenshot: data table sorted by goal, showing projects with a $1 goal]

Are these guys playing the system?

While it's a no-brainer that these projects are successful (they only require $1 to be...), they show a different side of Kickstarter. It almost appears that these creators are using a "pay what you want" method, since any amount pledged would hit the goal of the project. This may be useful if you do not know how much to set as a goal but feel that you don't need much money to succeed.
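The same check can be run outside the data table. The Python sketch below sorts projects by goal, keeps the ones with a goal of $1 or less, and ranks them by pledged-to-goal ratio. The project tuples are invented for illustration, and the `max_goal` cutoff is an assumption.

```python
# Hypothetical (name, goal in USD, pledged in USD) tuples; not real data.
projects = [
    ("Project X", 1, 4000),
    ("Project Y", 1, 0),
    ("Project Z", 500, 750),
    ("Project W", 1, 25),
]


def low_goal_ratios(projects, max_goal=1):
    """Return (name, pledged / goal) for low-goal projects, best ratio first."""
    ordered = sorted(projects, key=lambda p: p[1])  # lowest goal first
    low = [
        (name, pledged / goal)
        for name, goal, pledged in ordered
        if 0 < goal <= max_goal  # skip zero goals to avoid dividing by zero
    ]
    return sorted(low, key=lambda item: item[1], reverse=True)


ranked = low_goal_ratios(projects)
for name, ratio in ranked:
    print(f"{name}: pledged {ratio:.0f}x its goal")
```

With a $1 goal the ratio equals the pledged amount in dollars, so these projects will always top a ranking by ratio. That is worth keeping in mind before reading too much into their "success".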

All in all, this app lets end users explore whichever elements of Kickstarter projects interest them. It is great for identifying and examining outliers via the bubble chart and data table. It is also effective for examining the breakdown of communities and support within Kickstarter. In the future, I would like to add more high-level graphs to the landing page to provide an executive summary without going through all the tabs. I would also like to add more JavaScript functionality and maybe implement some statistical tests as modules.

 

About Author

Regan Yee

Regan is an aspiring data scientist who comes from a computer science background. He obtained his bachelor's degree in Computer Science from Northeastern University. After graduating, Regan worked at State Street Global Advisors on business intelligence systems, performing...