Will My Kiva Loan Get Funded?

Posted on Aug 17, 2016

To view all the code behind this post, please click here | If you wish to donate to Kiva, please visit their website here

Kiva Basics

Microlending has become increasingly popular in recent years. If you've never heard of the concept before, microlending is a method of poverty alleviation implemented in the developing world. Small amounts of capital are loaned to people who would not normally qualify for funds at a bank, which theoretically allows them to lift themselves and their families out of poverty. The model is somewhat controversial, as some think that it only exacerbates poverty; however, that discussion is outside the scope of this analysis.

Kiva.org is the most successful of the microloan sites, boasting over 1.2 MM loans funded. Kiva aggregates a number of partner organizations that are on the ground distributing money but don't have the infrastructure or the public profile to receive funding from the general public. Kiva's users decide which loans they want to fund and are then paid back over a set period of time (generally 8 to 24 months). In this model Kiva's users assume the risk, as they are not guaranteed to receive their money back.

Not all the loans posted to Kiva get fully funded, though. If a loan is not fully funded within thirty days of being posted, the posting is removed from the site. Any users who have already given money are refunded and no more donations are accepted. In this situation, Kiva's partner assumes the risk if the loan defaults, making it less likely to work with Kiva in the future. Thus, it is imperative that Kiva fully fund as many loans as it can. In this project, I'll examine what goes into a successfully funded loan and what Kiva can do to increase the odds of fully funding its loans.

Getting the Data

There are over 1.2 MM unique loans listed on Kiva's site, each with its own unique URL. Each loan page lists a good amount of information; an example can be seen below.

[Screenshot: an example Kiva loan page]

From this page, I scraped a host of variables to be used in my analysis: loan size, loan length, location, industry, and whether the loan was funded or not. I also determined the gender of the borrower from identifying gendered pronouns in the blurb about the loan: "he", "she", "him", and "her".
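
The pronoun check described above can be sketched as follows. This is a minimal illustration, not the code behind this post: the function name `gender_from_blurb` is hypothetical, and breaking ties by majority count is an assumption, since the post only says which pronouns were looked for.

```python
import re

# The four pronouns named in the post, split by gender.
FEMALE_PRONOUNS = {"she", "her"}
MALE_PRONOUNS = {"he", "him"}


def gender_from_blurb(blurb):
    """Guess a borrower's gender from gendered pronouns in the loan blurb.

    Returns "female", "male", or None when no pronoun is found
    (or when the counts are tied; the tie rule is an assumption).
    """
    # Tokenize into whole lowercase words so "the" never matches "he".
    words = re.findall(r"[a-z]+", blurb.lower())
    female = sum(word in FEMALE_PRONOUNS for word in words)
    male = sum(word in MALE_PRONOUNS for word in words)
    if female > male:
        return "female"
    if male > female:
        return "male"
    return None


print(gender_from_blurb("She sells vegetables and her shop is growing."))
print(gender_from_blurb("The shop sells rice."))
```

Matching whole words rather than substrings matters here, because "he" appears inside many common words such as "the" and "when".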


Not every blurb contained one of the aforementioned gendered pronouns, so for those loans I used scikit-learn (sklearn) to predict the gender instead, fitting a logistic regression with loan length and loan amount as the predictor variables.
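
A rough sketch of that kind of fallback is shown below. Everything in it is an assumption made for illustration: the data are synthetic (not scraped from Kiva), the variable names are hypothetical, and the scaling step is a convenience that the post does not mention.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 400

# Synthetic labels: 1 = female, 0 = male.
is_female = rng.integers(0, 2, size=n)

# Synthetic assumption: women request smaller, shorter loans on average.
loan_amount = np.where(is_female == 1,
                       rng.normal(500, 150, n),
                       rng.normal(1200, 300, n))
loan_months = np.where(is_female == 1,
                       rng.normal(10, 3, n),
                       rng.normal(14, 3, n))
X = np.column_stack([loan_months, loan_amount])

# Pretend roughly 80% of blurbs contained a gendered pronoun.
has_pronoun = rng.random(n) < 0.8

# Fit on loans whose gender is known, then predict the rest.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X[has_pronoun], is_female[has_pronoun])
predicted = model.predict(X[~has_pronoun])

print(f"Filled in gender for {predicted.size} loans without pronouns")
```

Because the model only sees loan length and loan amount, its guesses are only as good as the link between those two variables and gender in the labeled data.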

If the name of the person receiving the loan contained the word "group", the loan was classified as part of a group loan and gender was marked as neither male nor female.


As there are over 1.2 MM unique loan pages, it was infeasible for me to scrape all of them with my rapidly decaying MacBook. I ran the scraping program for three days (whenever I was connected to the internet), resulting in 94,862 unique loans. Of those, 3,406 did not get funded, for an expiration rate of 3.59%.

[Figure: histogram of loan sizes]

The average loan size was $827. As you can see in the histogram, though, the most popular loan size was in the $200-$400 range. The distribution had a very long tail, with some loans above $10,000 (generally to US residents).

[Figures: average loan size by gender; average loan size by continent]

Breaking out the average loan size by gender yields very interesting results, as it becomes apparent that men tend to ask for much larger loans than women. While I didn't investigate this trend in this study, a deeper dive into this data could uncover whether this is due to choices in the type of businesses they're hoping to start, an overestimation of the amount they require, or some other difference between the loan requests of men and women.

Kiva loan recipients also have the option to form a group to ask for their loan and subsequently pay the loan back together, thus theoretically reducing the risk of default. Unsurprisingly, group loans were larger on average than loans to either individual men or individual women. Finally, looking at the average loan size by continent (and the US) shows fairly unsurprising results. The US and Europe have the highest average loan size, while Asia and Africa have the lowest.


[Figures: share of expired loans by gender; share of expired loans by continent]

A fascinating pattern emerges when you break down the loans that expired by gender and by continent. Kiva users have a clear preference for women and groups, as the expiration rate for men's loans is nearly three times that of either of the other categories. While it's not completely surprising that Kiva users prefer giving to women over men, the magnitude of this gap is quite large. A similar trend emerges when looking at expired loans by continent (and the US). Africa and Asia have very low rates of expired loans (< 3%), whereas nearly 20% of loans in the US do not get fully funded. Europe and North America (excluding the US) also have a high proportion of expired loans. Again, while this inclination is not surprising, the magnitude of the difference is somewhat shocking.
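
A breakdown like this amounts to a grouped mean of a 0/1 "expired" flag. Here is a minimal sketch with pandas, using a tiny made-up table: the column names `borrower`, `region`, and `expired` are hypothetical, and the rows are not Kiva data.

```python
import pandas as pd

# Ten invented loans; expired = 1 means the loan was not fully funded.
loans = pd.DataFrame({
    "borrower": ["male", "male", "male", "male",
                 "female", "female", "female", "female",
                 "group", "group"],
    "region": ["US", "Africa", "US", "Asia",
               "Africa", "Asia", "US", "US",
               "Asia", "Africa"],
    "expired": [1, 0, 1, 0, 0, 0, 1, 0, 0, 0],
})

# The mean of a 0/1 flag is the share of loans that expired.
rate_by_borrower = loans.groupby("borrower")["expired"].mean().mul(100).round(1)
rate_by_region = loans.groupby("region")["expired"].mean().mul(100).round(1)

print(rate_by_borrower)
print(rate_by_region)
```

With real data, the same two `groupby` calls would produce the percentages plotted in charts of this kind.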


Regression Analysis

I dove a little deeper into the relationship between these variables, as I wanted to determine the biggest drivers of expiring loans. To create this model, I used the statsmodels package in Python to perform a logit regression of the binary expiration variable on the length of the loan, the gender of the borrower, whether they are part of a group or not, and the continent where they live. The indicator for being based in the US was left out of the regression to avoid perfect collinearity among the location variables. Thus, all the continent coefficients should be interpreted relative to the US.

[Figure: logit regression results]

The regression results support what the graphics suggested. Being female leads to a much lower chance of your loan expiring. In other words, women on Kiva have a higher likelihood of their loan getting funded, 64.1% higher to be specific. Being part of a group is also advantageous, with this model suggesting that you're 58% more likely to be funded. Being based in Africa offers a whopping 68% advantage over those based in the US, with Asia also offering a handsome 63.6% advantage. Finally, for each month you add to your loan, you can expect a 3.7% increase in the likelihood of your loan not getting funded. People like getting their money back quickly, which is probably the least surprising finding in this analysis.
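
For readers who want to turn logit coefficients into percentages themselves, the usual conversion is sketched below. If figures like those above were derived this way (an assumption; the post does not say), they describe changes in the odds of expiring rather than in the probability. The coefficient and baseline used here are hypothetical, not values from the regression output.

```python
import math


def pct_change_in_odds(coef):
    """Percent change in the odds for a one-unit increase in a predictor."""
    return (math.exp(coef) - 1) * 100


def shifted_probability(baseline_prob, coef):
    """Probability after applying a logit coefficient to a baseline probability."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * math.exp(coef)
    return new_odds / (1 + new_odds)


# Hypothetical coefficient of -1.0 and a hypothetical 20% baseline expiration rate.
coef = -1.0
print(f"Change in odds: {pct_change_in_odds(coef):.1f}%")
print(f"Expiration probability: 20% -> {shifted_probability(0.20, coef):.1%}")
```

The distinction matters most when the baseline probability is far from zero, because the same change in odds then corresponds to a noticeably different change in probability.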


Kiva's user base has a clear preference: underrepresented groups in poorer nations. If Kiva stumbles upon a funding drought or a shortage of users willing to give money, it would likely be advantageous for it to host more of these types of loans. If Kiva wishes to cast a broader net, it can open up the site to more borrowers from developed nations, where there is less of a focus on extreme poverty alleviation. This may result in a dip in the share of loans getting funded, though, which is a risk Kiva will have to weigh against the benefits.

Finally, the answer to the question posed in the title of this post. Will my Kiva loan get funded? Well, I'm a man living in the United States, and as far as I know, no one wants to form a group with me to apply for a loan. The evidence says I won't get my loan funded, or at least that it's less probable for me than for most of the world. That's probably a good thing.

About Author

Christian Holmes

Christian Holmes is a graduate of Middlebury College with a B.A. in both Economics and Chemistry. Upon graduating, he spent two years as a data analyst at an advertising technology startup, where he became interested in predictive analytics....