Kickstarter Project: How Do You Successfully Launch One?

Posted on May 12, 2019



Introduction

According to Kickstarter, only 36% of all launched projects have reached their funding goal. More than 450K projects have been launched so far, which means approximately 288K of them could not convince backers that they would succeed in business terms and perhaps change the way business is done in a specific field. Whichever side you are on, project owner or backer, it is important to learn about the factors that make a campaign successful. If you are curious, let's investigate those factors together.

Data

Kickstarter hides most completed projects (successful, failed, canceled, etc.) from its search. Luckily, another website, Kicktraq, keeps track of all completed projects. I scraped 8028 projects from Kicktraq's archived projects page and then followed the link to each project's original Kickstarter page. The Kickstarter page provides some additional attributes on top of those I had already scraped from Kicktraq, so I combined both into a single set of attributes for each project. If you want to conduct your own statistical analysis, please go ahead and download the CSV file from Kaggle.
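
Combining the two sources can be sketched roughly as follows. This is only an illustration, not the code used for the project: the field names (`project_url`, `backers`, `num_updates`) and the sample rows are assumptions, and the key is a placeholder rather than a real URL.

```python
def combine_attributes(kicktraq_rows, kickstarter_rows, key="project_url"):
    """Merge two lists of dicts on `key`.

    Every Kicktraq row is kept; attributes found on the matching
    Kickstarter row are added on top (and win if a name clashes).
    """
    kickstarter_by_key = {row[key]: row for row in kickstarter_rows}
    combined = []
    for row in kicktraq_rows:
        extra = kickstarter_by_key.get(row[key], {})
        combined.append({**row, **extra})
    return combined


# Hypothetical sample rows, for illustration only.
kicktraq_rows = [
    {"project_url": "project-a", "backers": 120},
    {"project_url": "project-b", "backers": 3},
]
kickstarter_rows = [
    {"project_url": "project-a", "num_updates": 7},
]

combined_rows = combine_attributes(kicktraq_rows, kickstarter_rows)
```

A project that is missing from the second source simply keeps its Kicktraq attributes, so no row is dropped during the merge.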

I focused only on projects that originated in the U.S. I also excluded suspended projects and projects canceled either by the owner or by Kickstarter. This reduced the total number of projects in the analysis to 3911.
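
As a minimal sketch of this filtering step, assuming each project is a record with `country` and `status` fields (both names and all sample values are hypothetical):

```python
# Hypothetical records; the real dataset's column names may differ.
projects = [
    {"name": "A", "country": "US", "status": "successful"},
    {"name": "B", "country": "US", "status": "canceled"},
    {"name": "C", "country": "GB", "status": "failed"},
    {"name": "D", "country": "US", "status": "failed"},
    {"name": "E", "country": "US", "status": "suspended"},
]

EXCLUDED_STATUSES = {"canceled", "suspended"}


def keep_for_analysis(project):
    """Keep U.S. projects that were neither canceled nor suspended."""
    return (
        project["country"] == "US"
        and project["status"] not in EXCLUDED_STATUSES
    )


analysis_set = [p for p in projects if keep_for_analysis(p)]
kept_names = [p["name"] for p in analysis_set]
```

Only projects that ended as successful or failed remain, which is what makes the later success-rate comparisons a clean two-way split.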

Before continuing to the details of the analysis, I would like to explain the variables I created manually and what they mean. Since the projects in this analysis constitute only 1% of the total number of completed projects, I grouped the numerical variables into categories to give clearer output for the audience. Here are the variables I created and their brief descriptions:

  • num_updates_range:
    • '<5' if number of updates is smaller than 5
    • '<10' if number of updates is at least 5 and smaller than 10
    • '>=10' if number of updates is greater than or equal to 10
  • num_faqs2:
    • 'With FAQs' if project has at least 1 FAQ
    • 'W/Out FAQs' if project does not have FAQ
  • funding_goal_range:
    • '<1000' if funding goal is smaller than $1,000
    • '<10000' if funding goal is at least $1,000 and smaller than $10,000
    • '<50000' if funding goal is at least $10,000 and smaller than $50,000
    • '<100000' if funding goal is at least $50,000 and smaller than $100,000
    • '>=100000' if funding goal is greater than or equal to $100,000
  • funding_percentage_range:
    • '<1% Funded' if funding percentage is less than 1%
    • '1-20% Funded' if funding percentage is between 1% and 20%
    • '20-40% Funded' if funding percentage is between 20% and 40%
    • '40-100% Funded' if funding percentage is between 40% and 100%
    • '<1.1X Funded' if funding percentage is at least 100% and less than 110%
    • '1.1X-2X Funded' if funding percentage is between 110% and 200%
    • '2X-10X Funded' if funding raised is between 2 and 10 times the funding goal
    • 'Greater 10X Funded' if funding raised is more than 10 times the funding goal
  • duration_range:
    • '<=15 Days' if duration is between 1 day and 15 days (including)
    • '<=30 Days' if duration is between 15 days (excluding) and 30 days (including)
    • '<=45 Days' if duration is between 30 days (excluding) and 45 days (including)
    • '<=60 Days' if duration is between 45 days (excluding) and 60 days (including)
  • project_video2:
    • 'With Video' if project has at least 1 video in its description
    • 'W/out Video' if project does not have video in its description
  • project_img2:
    • 'With Image' if project has at least 1 image in its description
    • 'W/out Image' if project does not have image in its description
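
The range variables above can be produced with a small binning helper. The following is a sketch under assumptions, not the original code: `make_binner` is a hypothetical helper built on the standard library's `bisect` module, while the labels and cut points are taken from the list above.

```python
from bisect import bisect_left, bisect_right


def make_binner(edges, labels, inclusive_upper=False):
    """Build a function that maps a number to one of `labels`.

    `edges` are ascending cut points and `labels` has one more entry than
    `edges`. With inclusive_upper=False a value equal to an edge falls into
    the next bin ('<5' excludes 5); with True it stays in the lower bin
    ('<=15 Days' includes 15).
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("labels must have exactly one more entry than edges")
    locate = bisect_left if inclusive_upper else bisect_right

    def binner(value):
        return labels[locate(edges, value)]

    return binner


updates_range = make_binner([5, 10], ["<5", "<10", ">=10"])
goal_range = make_binner(
    [1000, 10000, 50000, 100000],
    ["<1000", "<10000", "<50000", "<100000", ">=100000"],
)
duration_range = make_binner(
    [15, 30, 45],
    ["<=15 Days", "<=30 Days", "<=45 Days", "<=60 Days"],
    inclusive_upper=True,
)
faq_flag = make_binner([1], ["W/Out FAQs", "With FAQs"])
```

The `inclusive_upper` switch exists because the duration ranges include their upper bound while the update and funding goal ranges do not. The two-label case (`faq_flag`) covers the with/without variables in the same way.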

Analysis

Projects by Numbers

Out of the 3911 completed projects in the analysis, 2306 were successful, which accounts for around 59%. A basic EDA sheds more light on what the dataset looks like.

Comparing the mean values of some of the most important variables in the dataset reveals the characteristics of successful and failed projects. On average, failed projects had a funding goal 4.6 times higher than that of their successful counterparts, whereas successful projects raised 22 times more funding than failed campaigns. The average number of comments and updates is much higher for successful projects than for failed ones. Having more images and videos in the project description also seems to affect the fate of a project. Another interesting result is that the average duration of failed projects (37.71 days) was longer than that of successful projects (30.17 days). I think keeping the duration shorter (around 30 days) creates a sense of urgency and encourages people to spread the word.
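
A comparison of means like this can be computed along the following lines. The rows are made up purely for illustration (they are not values from the dataset), and the field names are assumptions.

```python
from statistics import mean

# Made-up rows for illustration only; these are not values from the dataset.
rows = [
    {"status": "successful", "funding_goal": 4000, "num_updates": 12},
    {"status": "successful", "funding_goal": 6000, "num_updates": 8},
    {"status": "failed", "funding_goal": 15000, "num_updates": 1},
    {"status": "failed", "funding_goal": 25000, "num_updates": 3},
]


def mean_by_status(rows, field):
    """Average `field` separately for each campaign status."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["status"], []).append(row[field])
    return {status: mean(values) for status, values in grouped.items()}


goal_means = mean_by_status(rows, "funding_goal")
# How many times larger the average failed goal is than the successful one.
goal_ratio = goal_means["failed"] / goal_means["successful"]
```

Statements such as "4.6 times higher" are ratios of group means of this kind. Means are sensitive to a few very large goals, so comparing medians alongside them is a reasonable sanity check.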

Unsurprisingly, most of the top 10 categories by funding raised are related to technology. Two of the categories that fall outside the technology area are Music and Comic Books. Tabletop games raised more than $20 million, followed by product design and animation with $15 million and $12 million, respectively.

More interestingly, the tabletop games category also tops the list by success rate. This category raises more funding than the other categories, and its success rate is also higher (80%). We cannot say the same for product design, animation and video games, which sit much lower in the list when it comes to success rate.

Figure-4: Top-10 U.S. States by Success Ratio

I can hear you asking, "So which is the most successful U.S. state?" Figure-4 shows that Washington, New York and California are doing very well in creating successful projects, each with a success rate above 60%. The map on the right is a Google geo chart. It interactively displays the state name and its success rate when you hover over a state. I will put the map into the Shiny application and share the link once I complete the design of the app.

Detailed Investigation

Let's move on by looking at each of our variables.

Duration

Figure-5: Campaign Duration Proportion by Status

To analyze the relationship between campaign duration and campaign status, we can look at the share of each duration range among successful and among failed campaigns. 25% of failed campaigns ran for 45 to 60 days, while only 6% of successful campaigns did. The share of 15-to-30-day campaigns is higher among successful projects (67%) than among failed projects (48%). Keeping the campaign duration at 30 days or less therefore looks promising for launching a successful campaign.
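
The proportions in Figure-5 are shares within each status, that is, each status group sums to 100% across the duration ranges. A minimal sketch of that calculation, with made-up rows and assumed field names:

```python
from collections import Counter, defaultdict


def shares_within_status(rows, field):
    """For each status, return the share of rows falling in each `field` value."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["status"]][row[field]] += 1
    return {
        status: {
            value: n / sum(counter.values()) for value, n in counter.items()
        }
        for status, counter in counts.items()
    }


# Made-up rows for illustration only.
rows = [
    {"status": "failed", "duration_range": "<=60 Days"},
    {"status": "failed", "duration_range": "<=30 Days"},
    {"status": "failed", "duration_range": "<=30 Days"},
    {"status": "failed", "duration_range": "<=45 Days"},
    {"status": "successful", "duration_range": "<=30 Days"},
    {"status": "successful", "duration_range": "<=15 Days"},
]

duration_shares = shares_within_status(rows, "duration_range")
```

Because the denominator is the number of campaigns with a given status, these figures answer "what do failed campaigns look like?" rather than "how often does a 60-day campaign fail?".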

Funding Goal

Figure-6: Success rate w.r.t. Campaign Funding Goal

The next explanatory variable is the funding goal. My expectation was that the higher the funding goal, the lower the success ratio. To examine this, I created a new range variable (Funding Goal Range) that groups funding goals into ranges. The data supported the hypothesis: 3 out of every 4 projects with a goal below $1,000 reached their goal and became successful, while the success ratio drops below 20% for projects with a funding goal above $100,000. Before starting a project, be wise and think about the minimum funding amount you need to begin with, as there is a negative relationship between the funding goal and the success ratio.
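
Unlike the duration proportions in Figure-5, the success ratio here is computed within each goal range: successful projects in the range divided by all projects in the range. A rough sketch, using made-up rows and assumed field names:

```python
from collections import Counter


def success_rate_by(rows, field):
    """Share of successful projects within each value of `field`."""
    totals = Counter()
    successes = Counter()
    for row in rows:
        totals[row[field]] += 1
        if row["status"] == "successful":
            successes[row[field]] += 1
    return {value: successes[value] / totals[value] for value in totals}


# Made-up rows for illustration only; not taken from the dataset.
rows = (
    [{"funding_goal_range": "<1000", "status": "successful"}] * 3
    + [{"funding_goal_range": "<1000", "status": "failed"}]
    + [{"funding_goal_range": ">=100000", "status": "successful"}]
    + [{"funding_goal_range": ">=100000", "status": "failed"}] * 3
)

goal_success_rates = success_rate_by(rows, "funding_goal_range")
```

The same helper works for any of the categorical variables defined earlier, such as the update range or the with/without video flag.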

Funding Margin

Figure-7: Funding Margin by Project Type

On the left-hand side of Figure-7, you can see that more than a quarter (28%) of successful campaigns were funded at less than 10% above their goal, which means those campaigns did just well enough to be successful. On the other hand, only 8% of successful campaigns can be regarded as over-achievers. In general, the funding margin of successful campaigns is pretty high. Unlike successful campaigns, more than 90% of failed campaigns could not even reach 50% of their funding goal. This indicates that when you fail, you fail big.

Featured or Non-Featured? Does it matter?

Figure-8: Featured Listing Status vs. Funding Percentage Range for Successful Campaigns

The answer is: no. Kickstarter features some campaigns under the title "Campaigns we love" and, not surprisingly, all featured campaigns are successful campaigns. The question is whether campaigns featured by Kickstarter do any better at raising funds than other, non-featured successful campaigns. As Figure-8 depicts, the distribution across funding percentage ranges does not differ much between featured and non-featured campaigns. So, do not worry too much about being listed as a featured campaign, because the funding you raise will not change much.

Pledge Tiers

Figure-9: Number of Pledge Tiers and Success Ratio

How many pledge tiers do you need to set for your campaign? 5 or 100? According to the dataset, the number of pledge tiers has a positive correlation with the success rate of your project, but only up to a certain number. Having more than 25 pledge tiers does not add any value to your project's success, so I suggest keeping it at no more than 25.

Image and Video

Figure-10: Having Video in Project Description vs. Success Rate

I expected that including a video clip in the project description, explaining what your product will look like or what you want to achieve if you succeed, would be crucial in determining the chances of success. However, it was not as effective as I assumed. Still, you can expect a higher rate of success if you add a video clip.


Figure-11: Having Image in Project Description vs. Success Rate

The success ratio of projects with no images in their description is less than 2%. It increases to 63% if you include an image. Take your time and add at least 1 image to your project to maximize your chances.

Number of Updates

Figure-12: Number of Updates vs. Success Rate

Among projects with fewer than 5 updates, the success rate is around 30%. This rate increases to 80% among projects with 5 to 10 updates. The gain then flattens, but the upward trend continues, with a success rate of 88% for projects with more than 10 updates. Providing updates about your product and sharing details about the milestones in your project timeline makes a positive impression on your audience.

Conclusion

There are many factors that make a project successful or unsuccessful. In this web scraping project, I mainly focused on the ones I assumed to have more influence. As I expected, I found most of them to play a significant role in determining the odds of success, except the "featured" boolean variable. To sum up:

  • Choose a duration of 30 days or less instead of the maximum allowed campaign duration of 60 days.
  • The smaller the goal, the higher the success rate.
  • Being listed as a featured project is not important. Do not waste your time on this factor.
  • Adding pledge tiers, up to 25, increases the chances of success.
  • Having a video has a positive effect.
  • Images have a huge impact.
  • Updates about your product make a good impression on your audience.

Future Research

The suggestions made in the conclusion are only the basics. If you plan to launch a Kickstarter campaign and you want it to raise the funding you are aiming for (I cannot imagine otherwise), you need to take them into account.

I want to improve the project mainly in two ways:

  1. Apply multiple linear regression (or logistic regression if categorical variables are found to be significant) on "funding raised" variable.
  2. Predict the probability of success for a project with given parameters.
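
To give a feel for the second item, here is one possible sketch of a success-probability model: a tiny logistic regression fitted by gradient descent in plain Python. It is an assumption about how this could be approached, not work that has been done on the dataset. The two features (a centered log10 of the funding goal and a with/without image flag), the toy rows, and the learning-rate settings are all made up for illustration.

```python
import math


def predict_proba(weights, features):
    """Probability of success; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent on log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = predict_proba(weights, features) - label
            gradient[0] += error
            for j, x in enumerate(features):
                gradient[j + 1] += error * x
        weights = [
            w - learning_rate * g / len(rows)
            for w, g in zip(weights, gradient)
        ]
    return weights


# Toy data: (log10(goal) - 4, has_image). Label 1 means successful.
toy_rows = [
    (-1.0, 1), (-0.5, 1), (0.0, 1), (0.5, 0),
    (1.0, 1), (1.0, 0), (-1.0, 0), (0.0, 0),
]
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]

weights = fit_logistic(toy_rows, toy_labels)
p_small_goal = predict_proba(weights, (-1.0, 1))  # $1,000 goal, with image
p_large_goal = predict_proba(weights, (1.0, 1))   # $100,000 goal, with image
```

In practice a library implementation with proper train/test splitting would be the more sensible choice; the point of the sketch is only that the output is a probability between 0 and 1 for a project with given parameters.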

About Author

Ali Uysal

Ali is a passionate data scientist with an educational background in Engineering and Economics. He has more than 9 years of work experience in various IT roles. He started his career as a Support Engineer at Vodafone Turkey...

Leave a Comment

Anu June 3, 2019
I love this project! It's super simple in terms of it's aim and its effective in its outcome. I think the graphics could be more clear, like the graph for unsuccessful and successful should be more like a bar graph side by side so you can actually compare the data visually a lot easier.
