2015 Health Insurance Marketplace Data Analysis

Posted on May 4, 2016

Contributed by Ruonan Ding. She is currently in the NYC Data Science Academy 12-week full-time Data Science Bootcamp program, taking place from April 11th to July 1st, 2016. This post is based on her first class project, R visualization (due in the 2nd week of the program).

The Affordable Care Act, aka ObamaCare, is a federal statute that was signed by President Obama on 3/23/2010. The healthcare industry has gone through various changes ever since. The health insurance marketplace is a virtual marketplace where private insurance carriers offer their plans. If someone is not eligible for a government health program (Medicare or Medicaid) and is not covered by an employer's plan, the health insurance marketplace is the all-in-one place to shop.

The dataset is hosted by the Centers for Medicare & Medicaid Services (CMS). The Center for Consumer Information and Insurance Oversight (CCIIO) within CMS is committed to increasing transparency in the Health Insurance Marketplace. The Health Insurance Marketplace Public Use Files (Marketplace PUF) are available for plan years 2014 and 2015 to support timely benefit and rate analysis. The Marketplace PUF includes data from states participating in the Federally Facilitated Marketplaces (FFM). It does not contain any data on plans offered in states that established and operate their own Marketplace (State-based Marketplace). For the purpose of this analysis, we used the Rate and BenefitsAttributes files and focused on plan year 2015 Individual plans only.

[Figure: Median monthly premium by state]

The distribution of median monthly premiums gives a brief overview of the premiums offered in each state. The median premium varies widely from state to state, which inspires a series of research questions.
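As a rough illustration of how a per-state median premium could be computed, here is a minimal Python sketch. The original project was done in R, so this is not the author's code. The records, the field names `state` and `premium`, and the helper `median_premium_by_state` are hypothetical stand-ins for rows of the Rate file, and every number is invented for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows; states and premiums are invented, not taken from the PUF.
rows = [
    {"state": "State A", "premium": 520.0},
    {"state": "State A", "premium": 610.0},
    {"state": "State A", "premium": 480.0},
    {"state": "State B", "premium": 250.0},
    {"state": "State B", "premium": 310.0},
    {"state": "State C", "premium": 190.0},
    {"state": "State C", "premium": 230.0},
    {"state": "State C", "premium": 205.0},
    {"state": "State C", "premium": 400.0},
]

def median_premium_by_state(records):
    """Group premiums by state and return {state: median premium}."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["state"]].append(rec["premium"])
    return {state: median(values) for state, values in grouped.items()}

medians = median_premium_by_state(rows)

# List states from the highest to the lowest median premium.
for state, value in sorted(medians.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{state}: {value:.2f}")
```

The median is used rather than the mean because a handful of very expensive plans would otherwise dominate a state's summary.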

I.  Plan Coverage Type

Plans in the Health Insurance Marketplace are presented in 4 "metal" categories: Bronze, Silver, Gold, and Platinum. Catastrophic plans are also available to some people. Metal categories are based on how you and your plan split the costs of your health care. They have nothing to do with quality of care, but they standardize the various plans on the market onto one platform.

[Figure: Boxplot of monthly premium by metal coverage category]

The boxplot of premium distribution by metal coverage category shows how premium levels differ across plans. Note that High and Low are for Dental insurance only. In this graph we can see that Platinum plans have the widest interquartile range (25th to 75th percentile), with the highest median premium at over $500/month. Catastrophic plans have the lowest premiums and the narrowest interquartile range. The other interesting fact is that the range of outliers in every category is quite large, which means that a wide variety of premium points are being offered. The red dot in each box is the mean for that metal category. The premium distributions are all skewed to the right: the median is less than the mean, which means that most of the plans are offered in the lower price range. We can conclude that the metal coverage level affects the premium.
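To make the boxplot reading concrete, the sketch below computes the quantities discussed above (quartiles, interquartile range, mean) for one category and applies the mean-above-median rule of thumb for right skew. It is a Python illustration, not the R code behind the figure. The helper name `box_summary` and the premium values are hypothetical, and the "inclusive" quartile method is an assumption that may differ from whatever the plotting library used.

```python
from statistics import mean, quantiles

def box_summary(premiums):
    """Return the quartiles, IQR, and mean that a boxplot with a mean marker displays."""
    # n=4 yields the three cut points: 25th, 50th, and 75th percentiles.
    q1, q2, q3 = quantiles(premiums, n=4, method="inclusive")
    avg = mean(premiums)
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": q3 - q1,
        "mean": avg,
        # Rule of thumb only: a mean above the median suggests a long right tail.
        "right_skewed": avg > q2,
    }

# Invented monthly premiums for a single metal category, with one expensive outlier.
sample_premiums = [200, 220, 240, 260, 300, 350, 900]
summary = box_summary(sample_premiums)
print(summary)
```

With one high outlier in the sample, the mean is pulled above the median while the quartiles barely move, which is the pattern the red dots show in the figure.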

II. Plan Premium By Age

The next question we want to assess is how the premium varies as age increases. Intuitively, the older you are, the more risk you potentially carry for health-related issues. Therefore, an upward trend is expected in this case.

[Figure: Mean premium by age for each metal category]

There are a couple of interesting facts that show up in this graph. First, we noticed that ages 42-45 are where the rate of increase in price starts to pick up. In other words, once you are older than 42-45, each additional year of age adds more to the premium than it did before. This pattern is consistent across all metal categories. The other interesting fact is that prior to age 42-45, the differences in mean premium between plans are roughly fixed; that is, the curves are parallel on the graph. After the turning point, the more comprehensive the plan is, the more you need to pay as age grows. The curves are no longer parallel: they start to fan out after age 43.
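The two observations above (a growing year-over-year increase, and a widening gap between plan tiers) can each be checked with a simple difference. The Python sketch below is illustrative only: the helpers `yearly_increase` and `gap_by_age` are hypothetical, and the premium-by-age values are invented to mimic the described shape rather than taken from the dataset.

```python
def yearly_increase(premium_by_age):
    """Return {age: premium[age] - premium[age - 1]} for consecutive ages."""
    ages = sorted(premium_by_age)
    return {
        age: premium_by_age[age] - premium_by_age[prev]
        for prev, age in zip(ages, ages[1:])
        if age - prev == 1
    }

def gap_by_age(higher, lower):
    """Return the premium difference between two plan tiers at each shared age."""
    shared_ages = sorted(higher.keys() & lower.keys())
    return {age: higher[age] - lower[age] for age in shared_ages}

# Invented mean premiums by age for two tiers (not real marketplace figures).
silver = {40: 300, 41: 305, 42: 310, 43: 316, 44: 325, 45: 337, 46: 352}
gold = {40: 360, 41: 365, 42: 370, 43: 377, 44: 389, 45: 405, 46: 425}

silver_steps = yearly_increase(silver)
tier_gap = gap_by_age(gold, silver)

for age in sorted(tier_gap):
    step = silver_steps.get(age)  # None for the first age, which has no predecessor
    print(age, step, tier_gap[age])
```

A constant gap corresponds to parallel curves; a gap that grows with age corresponds to the curves fanning out.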

III. State of Residency

The next thing we want to inspect is whether the state of residency makes a significant difference in the premium level too. In order to assess this more effectively, we make two assumptions. First, we assume that people in certain states simply have fewer plans to choose from, so they need to pay higher premiums. Second, since the insurance industry is subject to statutory regulation, it is possible that certain states have a higher barrier to entry. In that case, every type of plan will carry a higher minimum premium.

The next two graphs assess assumption 1: whether the number of plans available affects the premium level.

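[Figure: Number of plans available by state, colored by number of participating carriers]

[Figure: Premium distribution by number of participating carriers]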

In the first graph, states are ordered from the most plans available to the fewest. The coloring indicates the number of participating insurance carriers in that state. It shows that the more participating carriers there are in a state, the more distinct plans are offered. However, the next graph shows that the number of participating carriers in a state does not affect the premium level very much. The three boxes actually have very similar distributions regardless of the number of carriers.
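Counting plans and carriers per state amounts to counting distinct identifiers within each state. The Python sketch below shows one way to do that. The field names `state`, `plan_id`, and `carrier_id`, the helper `plans_and_carriers_by_state`, and all rows are hypothetical; a real rate file would likely repeat each plan across many rows (for example one per age), which is why sets are used.

```python
from collections import defaultdict

def plans_and_carriers_by_state(records):
    """Count distinct plans and distinct carriers per state."""
    plans = defaultdict(set)
    carriers = defaultdict(set)
    for rec in records:
        plans[rec["state"]].add(rec["plan_id"])
        carriers[rec["state"]].add(rec["carrier_id"])
    return {
        state: {"plans": len(plans[state]), "carriers": len(carriers[state])}
        for state in plans
    }

# Invented rows; the repeated plan P1 stands in for one plan listed at several ages.
rate_rows = [
    {"state": "State A", "carrier_id": "C1", "plan_id": "P1"},
    {"state": "State A", "carrier_id": "C1", "plan_id": "P1"},
    {"state": "State A", "carrier_id": "C1", "plan_id": "P2"},
    {"state": "State A", "carrier_id": "C2", "plan_id": "P3"},
    {"state": "State B", "carrier_id": "C3", "plan_id": "P4"},
]

counts = plans_and_carriers_by_state(rate_rows)

# Order states from the most plans available to the fewest.
ranking = sorted(counts, key=lambda s: counts[s]["plans"], reverse=True)
print(ranking, counts)
```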

We now examine the second assumption: whether some states simply have a higher barrier to entry. In order to visualize this, we look at the average minimum premium by state.

[Figure: Average minimum premium by state]

[Figure: Minimum premium by state and metal level]

The first graph ranks the states by average minimum premium. The second graph checks whether the premium trend holds for different metal level plans while maintaining the same state ranking as in the previous graph. In this case, we confirm that the premium trend holds regardless of the metal level. Therefore, it supports our assumption that some states are more expensive to enter.
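The post does not spell out how the "average minimum premium" was calculated, so the Python sketch below makes an explicit assumption: take the lowest premium within each state and metal level, then average those minimums across metal levels for each state. The helper names, field names, and all values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def min_premium_by_state_and_metal(records):
    """Return {(state, metal): lowest premium seen for that pair}."""
    lowest = {}
    for rec in records:
        key = (rec["state"], rec["metal"])
        if key not in lowest or rec["premium"] < lowest[key]:
            lowest[key] = rec["premium"]
    return lowest

def average_minimum_by_state(lowest):
    """Average the per-metal minimum premiums within each state (assumed definition)."""
    per_state = defaultdict(list)
    for (state, _metal), premium in lowest.items():
        per_state[state].append(premium)
    return {state: mean(values) for state, values in per_state.items()}

# Invented premiums, not real marketplace figures.
plan_rows = [
    {"state": "State A", "metal": "Bronze", "premium": 200},
    {"state": "State A", "metal": "Bronze", "premium": 250},
    {"state": "State A", "metal": "Silver", "premium": 300},
    {"state": "State A", "metal": "Silver", "premium": 280},
    {"state": "State B", "metal": "Bronze", "premium": 150},
    {"state": "State B", "metal": "Bronze", "premium": 170},
    {"state": "State B", "metal": "Silver", "premium": 220},
    {"state": "State B", "metal": "Silver", "premium": 260},
]

minimums = min_premium_by_state_and_metal(plan_rows)
avg_min = average_minimum_by_state(minimums)

# Rank states from the most expensive to the least expensive to enter.
state_rank = sorted(avg_min, key=avg_min.get, reverse=True)
print(state_rank, avg_min)
```

Keeping the per-metal minimums in `minimums` makes it possible to check whether the state ranking holds within each metal level, as the second graph does.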

In conclusion, my analysis confirms that there are at least three variables that affect premium levels: benefit type (metal level), age, and state of residency. As a follow-up to this research, we could also fit distributions to the premium prices so that both the insured and insurers can know where they stand in terms of price in the overall marketplace for a specific type of plan.

About Author

Ruonan Ding

Ruonan Ding has more than five years of experience in the actuarial science and financial field across asset management and insurance sectors. She was a pricing actuary for a property and casualty company, a lead analyst in capital...