What Makes a FundRazr Online Fundraising Campaign Successful?

Posted on Jan 7, 2019

Project GitHub | LinkedIn:   Niki   Moritz   Hao-Wei   Matthew   Oren



Created in Canada in 2009, FundRazr is a crowdfunding site that lets individuals, nonprofits, and companies set up a campaign for a cause, and lets people who believe in it contribute online. The website groups campaigns into 18 categories. Each campaign sets a goal for the amount of money the campaigner wants to raise, with the option to set an end date as well.



The objective of this project was to scrape data on the campaigns hosted on FundRazr and to use that data to draw insights into what makes a crowdfunding campaign successful.



The first step was to create a Scrapy script that would iterate through each of the 18 categories on FundRazr.

I created a loop that gathered the URL of the starting page of each category. A second loop took the URLs from the first loop and appended a page number from 1 to 10 for each category. I capped the number of pages because I was working under a time constraint and could not allow unlimited scraping time. The second loop also extracted the URL of every project listed on each page of each category. The last loop scraped the data needed for analysis from each individual campaign page.
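The first two loops can be sketched in plain Python as below. This is an illustration and not the original spider: the base URL, the `category` and `page` query parameters, and the function names are all assumptions, since the post does not show the site's URL scheme. The third loop, which parses each campaign page, is omitted because it depends on page markup that is not described here.

```python
from urllib.parse import urlencode

# Hypothetical values: the real URL scheme is not given in the post.
BASE_URL = "https://fundrazr.com/find"
MAX_PAGES = 10  # pages 1 to 10 per category, as described above


def category_start_urls(categories):
    """First loop: build one starting URL per category."""
    return {c: f"{BASE_URL}?{urlencode({'category': c})}" for c in categories}


def paginated_urls(start_url, max_pages=MAX_PAGES):
    """Second loop: append a page number from 1 to max_pages."""
    return [f"{start_url}&page={page}" for page in range(1, max_pages + 1)]


def listing_urls(categories, max_pages=MAX_PAGES):
    """Collect every category listing page the crawler would request."""
    urls = []
    for start_url in category_start_urls(categories).values():
        urls.extend(paginated_urls(start_url, max_pages))
    return urls


if __name__ == "__main__":
    # Two example category names; the site has 18.
    for url in listing_urls(["Health", "Sports"], max_pages=2):
        print(url)
```

Capping `max_pages` bounds the crawl at 18 x 10 listing pages, which is what keeps the scraping time predictable.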

The scraped fields were the project title, the category the project was listed under, the currency of the donations, the number of contributors, the target amount, the amount raised, the campaign end date, the number of updates on the page, and the number of comments on the page.


The picture below illustrates one campaign with these factors:


Data Cleaning

After extracting the data, I had to clean and modify it to prepare it for analysis. Below is a list of the adjustments that I made:

  1. Dropped NA values that are MCAR (missing completely at random) without imputing any values.
  2. Removed the currency labels ($|£|€|₽|kr|₱|Fr|₪|฿|¥), commas, and periods using regular expressions.
  3. Created a function to convert abbreviated strings in the target column into numeric values (e.g. 2.5k to 2500).
  4. Converted the start date and end date columns to datetime and subtracted one from the other to create a total days column. I then took the absolute value of the days and divided by seven to get the number of weeks the campaign was open.
  5. Divided the amount raised by the number of contributors to get the average contribution.
  6. Divided the amount raised by the target to get percent_complt, the metric used to measure how successful a project is.
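Steps 2 through 6 can be sketched with the standard library as follows. This is a minimal illustration under stated assumptions, not the original cleaning code: the function names, the dictionary keys `raised`, `contributors`, `target` and `avg_contribution`, the `m` suffix, and the date format are hypothetical. The sketch keeps the decimal point so that 2.5k converts correctly, and it expresses percent_complt as a percentage; the post does not say how the original code handled either detail.

```python
import re
from datetime import datetime

# Currency labels listed in step 2.
CURRENCY = re.compile(r"\$|£|€|₽|kr|₱|Fr|₪|฿|¥")
# Assumed abbreviations; only "k" appears in the post.
SUFFIXES = {"k": 1_000, "m": 1_000_000}


def parse_amount(text):
    """Turn a string such as '$2.5k' or '€1,200' into a float (steps 2-3)."""
    cleaned = CURRENCY.sub("", text).replace(",", "").strip().lower()
    multiplier = 1
    if cleaned and cleaned[-1] in SUFFIXES:
        multiplier = SUFFIXES[cleaned[-1]]
        cleaned = cleaned[:-1]
    return float(cleaned) * multiplier


def weeks_open(start, end, fmt="%Y-%m-%d"):
    """Absolute campaign duration in weeks (step 4); fmt is an assumption."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return abs(delta.days) / 7


def add_metrics(row):
    """Add the average contribution and percent_complt (steps 5-6)."""
    out = dict(row)
    contributors = row["contributors"]
    # Avoid dividing by zero for campaigns with no contributors yet.
    out["avg_contribution"] = row["raised"] / contributors if contributors else None
    out["percent_complt"] = 100 * row["raised"] / row["target"]
    return out


if __name__ == "__main__":
    row = {"raised": parse_amount("$500"), "contributors": 10,
           "target": parse_amount("$2.5k")}
    print(add_metrics(row))
    print(weeks_open("2019-01-15", "2019-01-01"))
```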


Below you can see the table of these factors with each campaign:

Data Analysis


Below is a look at the success of campaigns in general:

As you can see above, the campaigns that are funded most often, and funded the most, are those that last between 0 and 25 weeks. There are also a few outliers, so I created a second scatter plot restricted to percentages between 0 and 100:

The majority of projects reach only about 20% of their target. A distribution plot of percent funded shows this more clearly:

The data indicate that campaigns tend to reach their goal when the target is below 5,000 dollars:
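One way to put a number on this observation is to compare the share of fully funded campaigns on either side of the 5,000 threshold. The sketch below is illustrative only: the function name and the `target` key are hypothetical, percent_complt is assumed to be stored as a percentage, and the sample rows are made up rather than taken from the scraped data.

```python
def completion_rate_by_target(campaigns, threshold=5000):
    """Share of campaigns at 100% or more, below vs. at or above threshold."""
    buckets = {"below": [], "at_or_above": []}
    for c in campaigns:
        key = "below" if c["target"] < threshold else "at_or_above"
        buckets[key].append(c["percent_complt"] >= 100)
    # sum() over booleans counts the fully funded campaigns in each bucket.
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample = [
        {"target": 1000, "percent_complt": 120},
        {"target": 3000, "percent_complt": 40},
        {"target": 8000, "percent_complt": 10},
    ]
    print(completion_rate_by_target(sample))
```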

The next analysis looked at the different categories to see whether category affects the success rate:

Unsurprisingly, the non-profit, health, and sports categories were the best funded, with campaigns in those categories receiving 20 to 40 percent of their target on average. On the other hand, the targets set by campaigns in these categories are quite low. The opposite holds for categories with a low percentage funded, whose targets are somewhat too high.

The health category had the highest average contribution, though it also had one of the smallest numbers of contributors. This could be because the target usually set by campaigns in this category is so low. The same is true for the business category. However, because its average target is set too high to achieve funding, its percent funded is low.
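The per-category comparison above amounts to grouping campaigns by category and averaging a few columns. A minimal sketch, assuming each campaign is a dictionary with the hypothetical keys `category`, `target`, and `avg_contribution` alongside percent_complt (the sample rows are made up):

```python
from collections import defaultdict
from statistics import mean


def summarize_by_category(campaigns):
    """Average target, percent funded, and contribution size per category."""
    grouped = defaultdict(list)
    for c in campaigns:
        grouped[c["category"]].append(c)

    summary = {}
    for category, rows in grouped.items():
        summary[category] = {
            "campaigns": len(rows),
            "mean_target": mean(r["target"] for r in rows),
            "mean_percent_complt": mean(r["percent_complt"] for r in rows),
            "mean_avg_contribution": mean(r["avg_contribution"] for r in rows),
        }
    return summary


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample = [
        {"category": "Health", "target": 1000, "percent_complt": 40, "avg_contribution": 80},
        {"category": "Health", "target": 3000, "percent_complt": 20, "avg_contribution": 120},
        {"category": "Business", "target": 50000, "percent_complt": 5, "avg_contribution": 100},
    ]
    for name, stats in summarize_by_category(sample).items():
        print(name, stats)
```

Reading the mean target next to the mean percent funded is what separates a category that is well funded from one that simply sets low goals.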

Lastly, I checked the effect of two features that make a campaign more visible to viewers: comments and updates. The number of times the campaign creator updated the page had no direct impact on the extent to which campaigns were funded. As you can see below, average funding varied without a clear pattern across the number of updates:

The same can be said for comments, as average funding varied without a clear pattern across the number of comments on each campaign page:
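A quick numeric check of "no direct impact" is the Pearson correlation between the number of updates (or comments) and percent funded. The function below is a generic sketch, not part of the original analysis, and the example values are made up. Pearson's coefficient only measures linear association, so a value near zero does not rule out other kinds of relationship.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; None if either input is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant column
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up values for illustration only.
    updates = [0, 1, 2, 3, 4, 5]
    percent_funded = [30, 10, 45, 5, 25, 20]
    print(pearson(updates, percent_funded))
```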

Further work


  1. Gather much more data.
  2. Make a graph to determine whether the number of comments and updates per category makes a difference in percent funded.

