What is driving customer churn at Telco?

Posted on Jan 6, 2022



Telco is a fictitious telecommunications company that provides phone, internet, streaming movie, streaming TV and online security services. Like any company that offers contract or subscription-based services, Telco faces an ongoing problem with customers canceling. It is important to determine what is driving this churn and to identify the customer segments at risk so Telco can take steps to minimize its losses. Acquiring new customers is far more expensive than investing in retaining existing ones.



The goal of this analysis is to identify the company factors driving customer churn at Telco, analyze the impact of demographics, identify the segments most likely to churn, and make recommendations to help Telco reduce attrition.


The data is from Kaggle.  There are 7,043 customers.

In this data, churn is defined as someone who left within the last month.

The data includes customer tenure (in months), payment information, the plans each customer is enrolled in, and demographic information (gender, senior citizen status, partner and dependents).



Numeric Factors

The numeric factors were broken out into deciles for analysis.

A decile spans ten percentiles. For example, a student whose test score is in the 90th percentile scored in the top 10 percent. That is the same as saying the score falls in the tenth decile, which also means the top 10 percent.

As the value of a numeric factor increases, the range of values in each decile increases.

The churn rate within each decile was calculated and indexed to the overall churn rate of 27% using the following calculations:

Overall Churn Rate = (Number Of Churns) / (Total Number Of Customers)

Index = (Churn Rate Within Each Decile) / (Overall Churn Rate) * 100
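The decile indexing described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the function name `decile_churn_index` and the 20-customer data set are hypothetical, and ties in the numeric factor are split by sort order rather than handled specially.

```python
from collections import defaultdict


def decile_churn_index(values, churned):
    """Return {decile: index}, where index = decile churn rate / overall churn rate * 100.

    values  -- numeric factor per customer (for example, tenure in months)
    churned -- matching list of booleans, True if the customer churned
    """
    n = len(values)
    overall_rate = sum(churned) / n

    # Rank customers by the numeric factor, then cut the ranking into ten
    # equal-sized groups: decile 1 holds the lowest values, decile 10 the highest.
    order = sorted(range(n), key=lambda i: values[i])
    totals = defaultdict(int)
    churns = defaultdict(int)
    for rank, i in enumerate(order):
        decile = rank * 10 // n + 1
        totals[decile] += 1
        churns[decile] += churned[i]

    return {d: round(churns[d] / totals[d] / overall_rate * 100) for d in sorted(totals)}


# Hypothetical data: 20 customers with tenure 1..20 months; the six newest churned.
tenure = list(range(1, 21))
churned = [t <= 6 for t in tenure]
index_by_decile = decile_churn_index(tenure, churned)
print(index_by_decile)
```

In this made-up example the overall churn rate is 30%, so the three lowest-tenure deciles (where every customer churned) index at 333 and the remaining deciles index at 0.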

Categorical Factors

For the categorical factors, each category was indexed to the overall churn rate of 27% using the following calculation:

Index = (Churn Rate Per Category) / (Overall Churn Rate) * 100

Note: when the Churn Rate Per Category equals the Overall Churn Rate, the index = 100. An index above 100 means the category churns more than average, and an index below 100 means it churns less.
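The same indexing applies to a categorical factor. The sketch below assumes one category label and one churn flag per customer. The function name `category_churn_index` and the ten-customer data set are hypothetical and are not taken from the Kaggle data.

```python
from collections import Counter


def category_churn_index(categories, churned):
    """Return {category: (churn_rate, index)} with index = churn_rate / overall_rate * 100."""
    overall_rate = sum(churned) / len(churned)
    totals = Counter(categories)
    churns = Counter(c for c, left in zip(categories, churned) if left)
    return {
        c: (churns[c] / totals[c], round(churns[c] / totals[c] / overall_rate * 100))
        for c in totals
    }


# Hypothetical data: 10 customers, 4 of whom churned (overall churn rate 40%).
contract = ["Month-to-month"] * 5 + ["One year"] * 5
churned = [True, True, True, False, False, True, False, False, False, False]
result = category_churn_index(contract, churned)
print(result)
```

Here the month-to-month group churns at 60% against an overall 40%, giving an index of 150, while the one-year group churns at 20%, giving an index of 50.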


Numeric Factors

Customer Tenure has a strong negative relationship with churn.  Newer customers are much more likely to churn and need to be marketed to with extra care.

Monthly Charges have a positive relationship with churn.

Total Charges have a negative relationship with churn, most likely because customers with longer tenure have paid more in total than newer ones. The correlation between Customer Tenure and Total Charges is 0.83.
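For readers who want to check a correlation like the one between tenure and total charges, here is a minimal sketch of the Pearson correlation coefficient, assuming that is the measure reported. The six customers below are hypothetical, with total charges built as tenure times monthly charge, so the result will not match the 0.83 computed from the real data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical customers: total charges approximated as tenure x monthly charge.
tenure = [1, 5, 12, 24, 48, 72]
monthly = [70, 30, 90, 50, 100, 20]
total = [t * m for t, m in zip(tenure, monthly)]
r = pearson(tenure, total)
print(round(r, 2))
```

Because total charges accumulate over a customer's tenure, the two variables tend to move together, which is why the negative relationship between total charges and churn largely mirrors the one for tenure.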


Categorical Factors

Customers with fiber optic internet service are much more likely to churn, with an index of 158 and a churn rate of 42%. 44% of customers fall into this category. These customers need to be incentivized through special promotions to reduce churn.

Customers who pay by electronic check are more likely to churn, with an index of 171 and a churn rate of 45%. 34% of customers pay this way. These customers need to be encouraged to set up automatic payment by credit card or bank transfer. In my professional experience analyzing churn data in another industry, customers who auto pay are much less likely to churn.

Customers with month-to-month contracts are more likely to churn, with an index of 161 and a churn rate of 43%. 55% of customers fall into this category. These customers need to be offered better deals, not only because they make up such a large percentage of customers but also because higher monthly charges are associated with higher churn rates.

Customers with fiber optic internet service have much higher monthly charges than those with DSL. Both the median and the spread are much higher than for DSL or no internet service.

Customers who pay by electronic check have higher monthly charges. The median is higher, and the spread is narrower than for other forms of payment.

Customers with month-to-month contracts have higher monthly charges. The median is higher and the spread is narrower than for other types of contracts.

Multiple Lines does not impact churn: all three categories have an index of about 100, which means the churn rate in each category is about the same as the overall churn rate of 27%.

Phone Service does not impact churn much, with index values very close to 100.

Demographic Factors

Drop-down menus for gender and senior citizen status are available in the app so their impact can be explored.


Gender does not impact churn.

Senior Citizens are more likely to churn.


The newest customers are the most likely to churn, given the strong negative relationship between customer tenure and churn. They need to be treated with extra care.

Monthly charges have a positive relationship with churn.

Customers with any of the following have higher churn rates and higher monthly charges than other customers. All of these customers need to be incentivized with special promotions to improve retention:

  • fiber optic internet service (44% of customers)
  • electronic check payments (34% of customers)
  • month-to-month contracts (55% of customers)

However, if a discount offer is sent to every customer who falls into any of those categories, the majority of customers would be offered a discount and the company would lose money. A better way to improve retention is to build a model that predicts each customer's probability of churning, identify a cutoff probability through analysis, and offer a discount only to customers whose predicted probability is above that cutoff.

A model would take all the factors in the app, add others from the data, and use all of that information together to predict each customer's probability of churning.

Appending data from a third-party vendor such as Acxiom would result in a stronger model, as it would add more demographic data, customer interests, etc.
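The targeting step described above (score every customer, then offer a discount only above a cutoff) could look roughly like the sketch below. No model has been built yet, so the probabilities, the customer IDs, the 0.5 cutoff, and the function name `select_for_offer` are all hypothetical placeholders. In practice the cutoff would come from analysis of the model's results.

```python
def select_for_offer(churn_probability, cutoff):
    """Return the IDs of customers whose predicted churn probability exceeds the cutoff."""
    return sorted(cid for cid, p in churn_probability.items() if p > cutoff)


# Hypothetical model output: customer ID -> predicted probability of churning.
scores = {"A": 0.82, "B": 0.15, "C": 0.47, "D": 0.66, "E": 0.05}

# Hypothetical cutoff; only customers above it receive the discount offer.
targeted = select_for_offer(scores, cutoff=0.5)
print(targeted)  # ['A', 'D']
```

Raising the cutoff shrinks the group that receives the discount, which lowers the cost of the promotion but risks missing some customers who would have churned.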

Next Steps

Build a churn model



About Author

Denise Garbato

I am a Statistician and Business Analyst who supports strategic decision making in digital and traditional marketing channels by discovering insights, applying statistical and programming skills with a results-focused approach. I am skilled in data analysis and predictive...