Using Data to Analyze Successful Bank Marketing Campaigns

Posted on Aug 29, 2022

The skills the author demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.

Introduction

Financial institutions often run campaigns to sell a product to potential customers. These campaigns cost time and money, and they inconvenience the people contacted when the product offered is a poor match. Inconvenienced customers can develop negative sentiment toward the bank, leading to reputational damage and profit loss from diminished long-term customer value. Moreover, most customers will say no in the end, which makes identifying those who will say yes particularly challenging.

In the parlance of machine learning, this is known as an imbalanced classification problem. In this project, I solve this problem for the case of a Portuguese bank running telephone campaigns to sell long-term deposits. While this problem is challenging, much headway can be made to help the bank reach the right customers, save tens of thousands of dollars in concrete costs, and protect its reputation.

Machine Learning

The data for this project was collected between May 2008 and June 2013 by a Portuguese banking institution and is available through the UCI Machine Learning Repository (see References). There are 45,211 observations with features covering bank client data (age, job type, marital status, education, housing and loan status), information about the last contact (including the variable duration, which leaks the outcome and must be dropped), information on previous campaigns, and social and economic context variables (such as the 3-month interbank borrowing rate).

The goal is to predict whether the person contacted will subscribe to a long-term deposit, a product that pays the depositor a higher interest rate than a traditional savings account while guaranteeing the bank use of the funds for a fixed period (such as 12 months). In the data, only about 11% of the people contacted end up subscribing, making this classification problem highly imbalanced.

I tried several approaches to address the imbalance: using the data as is, rebalancing with the SMOTE algorithm, and simple rebalancing by sampling the minority class at a higher rate. For models, I tried XGBoost and random forest classification in R. Finally, careful attention must be paid to the choice of metric, which I discuss in more detail below.
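The project itself was done in R, but the simple-rebalancing idea translates directly to any language. Here is a minimal sketch in Python, with a toy dataset whose 11% positive rate mirrors the campaign data (all names and numbers here are illustrative, not from the project):

```python
import random

random.seed(42)

# Toy imbalanced labels: roughly 11% "yes", mirroring the campaign data.
data = [("yes" if i % 9 == 0 else "no") for i in range(900)]

majority = [x for x in data if x == "no"]
minority = [x for x in data if x == "yes"]

# Simple rebalancing: sample the minority class with replacement
# until the two classes are the same size.
upsampled_minority = random.choices(minority, k=len(majority))
balanced = majority + upsampled_minority

print(len(minority), len(majority), len(balanced))  # 100 800 1600
```

In practice the same resampling is applied to whole rows of the training set (features plus label), and only to the training fold, never to the validation data.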

Accuracy

Accuracy is not an acceptable metric for this problem: classifying every customer as a 'No' achieves 89% accuracy without providing any insight into our group of interest, the 'Yes' customers. Since the positive class is 'Yes', a false positive means predicting that a person will say yes when they will in fact say no. I would like to keep these to a minimum to reduce the inconvenience to customers, the resulting reputational damage to the bank, and the time lost contacting people who will say no.

However, I would tolerate some false positives to get more yeses, so I would not maximize precision, the ratio of true positives to true positives plus false positives, per se. A false negative means predicting that a customer will say no when they would in fact say yes. The cost of this error is a lost client, something one wishes to avoid in a sales situation. The ratio of true positives to true positives plus false negatives is known as recall, and it is of greater importance for this problem.

Nonetheless, I would like to strike a balance between precision and recall, using the area under the ROC curve (ROC AUC) as the optimization metric. This metric balances the desire for few false positives against the desire for few false negatives and, combined with the other tools used in this project, leads to the highest recall achievable.
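The two definitions above are easy to state as code. This short Python sketch computes precision and recall from confusion-matrix counts; the counts are hypothetical, chosen only so that the results match the precision (3 of 8) and recall (63%) reported later in the post:

```python
# Hypothetical confusion-matrix counts for 1,000 contacted customers,
# chosen to match the post's reported precision (3/8) and recall (63%).
tp, fp, fn, tn = 63, 105, 37, 795

precision = tp / (tp + fp)   # of predicted yeses, how many were right
recall = tp / (tp + fn)      # of actual yeses, how many were found

print(precision, recall)  # 0.375 0.63
```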

Final Selection

The final combination of rebalancing, model, and metric that I selected is simple rebalancing, a random forest model, and the ROC AUC metric. XGBoost was particularly prone to overfitting on this data, SMOTE seemed to introduce too much extra noise, and other metrics (such as maximizing recall directly) did not work as well as maximizing ROC AUC. I addressed the overfitting issue by requiring a reasonable minimum node size (in this case, at least 40 observations in each terminal node).

The final model achieved a ROC AUC of 0.775 and identified a group of clients particularly likely to respond positively (3 of 8 clients identified would say yes). The group reached by following the model's recommendations corresponds to 63% of all the people who would say yes. The concern, of course, is that this may not be enough for the bank. I addressed it by lowering the classification threshold from the default 0.5 to 0.3, allowing 80% of the yes customers to be reached at the cost of somewhat more noes. I discuss the business value of these models below.
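Lowering the threshold is a one-line change: instead of calling a customer a "yes" when the predicted probability exceeds 0.5, the cutoff is moved to 0.3. A minimal Python sketch (the probabilities below are hypothetical, not model output):

```python
# Hypothetical predicted probabilities of "yes" for eight customers.
probs = [0.12, 0.35, 0.48, 0.55, 0.29, 0.71, 0.28, 0.44]

def predict(probs, threshold):
    """Label a customer 'yes' when the predicted probability clears the cutoff."""
    return ["yes" if p >= threshold else "no" for p in probs]

print(predict(probs, 0.5).count("yes"))  # 2 customers flagged (conservative)
print(predict(probs, 0.3).count("yes"))  # 5 customers flagged (more recall, more noes)
```

The same model thus supports both operating points; which one to use is a business decision about the trade-off between reach and wasted calls.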

Business Value

Suppose the bank obtains 100,000 records of potential customers and would like to determine which of these people to contact. Assuming customers likely to say yes are uniformly distributed within this data, the following table summarizes three possible approaches to contacting customers, along with their corresponding costs.

Note that in both the ML and no-ML cases, the goal is to reach 63% of all yeses in the data. Once this target is reached, the bank's agents/telemarketers stop calling. Since the ML strategy tells the bank which customers to call, it saves both money and the intangible costs of reaching noes unnecessarily. The lower-bound calculations assume a $10.00-per-hour rate (converted from euros) for telemarketers' time, and the upper-bound calculations assume $20.00 per hour for bank employees' time.

Hourly Salaries

The actual hourly salaries of these worker groups are a little lower, $8.08 and $16.00 respectively, but I'm assuming workers need some time between the yes/no calls (perhaps for non-responding customers or data lookup/entry), so I base the rates on time spent on call.

A natural concern is that 63% is not good enough to meet the bank's objectives. In that case, lowering the classification threshold for a yes to 0.30 allows 80% of all the yeses to be reached, albeit at a higher cost.
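To make the savings concrete, here is a back-of-the-envelope calculation in Python. The 11% base rate, 3-of-8 precision, 63% recall target, and $10/$20 hourly rates come from the post; the 5-minutes-of-paid-time-per-call figure is my own assumption:

```python
# Assumptions: 100,000 records, 11% base yes rate, ML precision of 3/8,
# a target of reaching 63% of all yeses, and 5 minutes of paid time per call.
records = 100_000
base_rate = 0.11
target_yeses = int(records * base_rate * 0.63)  # 6,930 yeses to reach

calls_no_ml = int(records * 0.63)     # call uniformly until the target is met
calls_ml = int(target_yeses / (3 / 8))  # ML: roughly 8 calls yield 3 yeses

def cost(calls, hourly_rate, minutes_per_call=5):
    """Labor cost of a calling campaign at the given hourly rate."""
    return calls * minutes_per_call / 60 * hourly_rate

savings_low = cost(calls_no_ml, 10) - cost(calls_ml, 10)
savings_high = cost(calls_no_ml, 20) - cost(calls_ml, 20)
print(calls_no_ml, calls_ml)              # 63000 18480
print(round(savings_low), round(savings_high))  # 37100 74200
```

Under these assumptions, the model cuts roughly 44,500 calls and saves on the order of $37,000 to $74,000 in labor alone, consistent with the "tens of thousands of dollars" claimed in the introduction.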

After reaching the likely responders, I would suggest that the bank use the extra time saved by one of these strategies to target a different product to the customers unlikely to respond with a yes. It could also be the case that the bank determines that long-term deposits are the most profitable product it can offer its customers. In that case, the bank could decide to simply call all of its potential customers and accept the higher costs. In the end, this is as far as machine learning can take us: the bank would need to A/B test each of the three strategies before deploying the best one on all of its potential clients.

R Shiny App and Conclusions

I developed an R Shiny app to help bank employees determine whether they should offer a long-term deposit to a customer. The context is that an employee may have an in-bank meeting or a telephone call with a customer regarding a different issue, but they could enter the customer's information into the app to determine whether they should also pitch the long-term deposit. The app reports the probability that the customer will say yes and suggests offering the product if that probability is above 0.5 for a conservative agent, or above 0.3 for an agent willing to take a bigger risk.

I believe that had the data been collected around 2022, there would be more features one could use to build a much stronger predictive model. For example, the bank could apply NLP to agent/client interactions to determine which of the agent's actions lead to higher customer conversion rates. More generally, in these days of expanding data collection, there are almost certainly other features one could obtain to build an even stronger model, and the bank should consult experts in this regard. In addition, the 2008 financial crisis occurred during the data collection period.

Timestamps

However, timestamps are not available in the data and cannot be unambiguously inferred from other variables with a time component (such as the 3-month European interbank borrowing rates). Had timestamps been available, I could have trained a model only on data points collected outside the financial crisis to build a stronger model. Finally, as briefly mentioned above, it is imperative to A/B test these machine learning models before using them in production and to monitor carefully for data/model drift once they are deployed.

References

Image source:
https://commons.wikimedia.org/wiki/File:Zarco_%26_Bank_of_Portugal_(Funchal)_(38044349796).jpg
Data source:
https://archive.ics.uci.edu/ml/datasets/Bank+Marketing#
Portugal bank employees and call center employees salary information:
salaryexplorer.com and https://www.erieri.com/salary/job/call-center-agent/Portugal

About Author

Dmitriy Popov-Velasco

I'm a recent NYC DSA/fastai graduate with background in math, economics, and education, holding graduate degrees in these areas. I'm passionate about helping others and solving practical problems!
