Getting higher quality & lower cost with Medicare

Posted on May 5, 2017



This is an interactive hospital recommendation app implemented in R using Shiny. The app is intended to help patients find hospitals where they will receive high-quality care at a reasonable co-payment. Users can see the direct out-of-pocket costs of certain procedures alongside each hospital's quality index to identify the best choice for their personal health conditions and budgets.

The code is available on GitHub.

Background & Data introduction:

1) What is Medicare? Medicare is the federal health insurance program for people who are 65 or older, certain younger people with disabilities, and people with End-Stage Renal Disease.

2) Medicare hospital quality data: The overall rating summarizes up to 57 quality measures for more than 3,000 U.S. hospitals. The overall rating ranges from one to five stars. The more stars, the better the hospital performed on the available quality measures.

3) Medicare inpatient charge data: The data include hospital-specific charges for the more than 3,000 U.S. hospitals that receive Medicare Inpatient Prospective Payment System payments for discharges paid under Medicare based on a rate per discharge using the Medicare Severity Diagnosis Related Group (MS-DRG) for Fiscal Year (FY) 2011, 2012, 2013, and 2014.

Motivation: Why I made this Medicare recommendation application


The best-quality hospitals aren’t necessarily high-priced, but hospitals with the worst quality of care sometimes are.

1) Hospital billing varies widely, and the pricing does not reflect quality of care.

The top-performing hospitals aren’t necessarily high-priced, though the worst-performing ones sometimes are.

The report 'Medicare Provider Utilization and Payment Data', released by CMS, shows huge inequities in hospital charges. Take total joint replacement as an example: prices for this procedure range from $5,304 in Ada, Oklahoma to $223,373 in Monterey, California. [1] There is also no correlation between price and quality. Analyzing the CMS quality and cost data, one finds that the best-quality hospitals aren’t necessarily high-priced, while hospitals with the worst quality of care sometimes are.

2) Information is highly non-transparent and isolated, making it difficult for patients to make informed decisions:

Patients are a vulnerable group, worried about the risks involved in their treatments as well as breathtaking six-figure hospital bills. The whole healthcare market is like a 'black box': price information is almost entirely hidden from customers. Although quality information is available online, it is hard to verify its authenticity. And because cost and quality information are never presented together, patients don’t get to see the data they need to make wise decisions.

3) Why we need a Medicare hospital recommendation application:

Given the problems above, an application that lets patients compare cost and quality information across hospitals seems to be an effective solution. This Medicare hospital recommendation system, which brings together cost information for 100+ DRGs and quality ratings for over 3,000 hospitals, can bridge that knowledge gap and let patients know what costs and quality of care to expect.

Data visualization: Gaining insight into hospital quality and cost with EDA.

Exploratory data analysis (EDA) is crucial in the process of application development. To gain deeper insight into the data, we need an analysis of hospital type, cost-quality comparison, and DRG cost comparison.

I. Hospital quick facts:

  1. TX (407 hospitals) and CA (341 hospitals) are the states with the most hospitals: There are 4,807 hospitals in the U.S. It is unsurprising that TX and CA are the top two states, as they are the largest of the well-developed states. Third-place FL has only 186 hospitals, far fewer than CA. [figure]
  2. The central part of the U.S. has the highest number of hospital beds per 1,000 people: District of Columbia, South Dakota and North Dakota are the top three, with the most hospital beds per 1,000 people (on average 4.7 beds per 1,000). However, it is interesting that TX has only 2.3 beds per 1,000 (ranked No. 23 of 50) and CA has 1.8 beds per 1,000 (ranked No. 48 of 50). Even though CA and TX are able to provide advanced healthcare services, the healthcare resources available per person are limited. If you live in CA or TX, you have greater odds of having to wait longer for healthcare services. [figure]

II. Hospital quality analysis:

  1. Hospital quality ratings are approximately normally distributed: 37% (1,774) are 3-star hospitals and only 2% (82) are 5-star hospitals: [figure] The new overall star ratings take the existing measures reported on Hospital Compare and summarize them into a single star-based rating for each hospital. Although 1,223 hospitals are unrated, the ratings we do have follow an approximately normal distribution that is slightly left-skewed. Although the American Hospital Association has argued that this system oversimplifies the complexity of delivering high-quality health care [3], I still use this dataset because it is the first hospital quality dataset released by an independent authority. Even though it may unfairly penalize some teaching hospitals and those serving the poor, it still provides a broad picture of quality across U.S. hospitals.
  2. The central part of the U.S. has a higher average quality rating: The map shows that the states with higher average quality ratings are concentrated in the middle of the U.S. This map aligns to some degree with the map of hospital beds per 1,000 people, which may indicate a positive correlation between healthcare resources per 1,000 people and quality of care. [figure]

III. Hospital cost analysis:

Before diving into the cost analysis, some terms and questions need to be clarified:


  • Average hospital charges: Providers determine what they will charge for items, services, and procedures provided to patients and these charges are the amount that providers bill for an item, service, or procedure.
  • Average Medicare payments: The amount the Medicare program reimburses the hospital.
  • Out-of-pocket payment: Patients still incur a co-payment when receiving services covered by Medicare; that’s their out-of-pocket cost.
  • Average total payments: Average total payment = Medicare payment + out-of-pocket payment
  • Medicare reimbursement rate: Medicare reimbursement rate = Medicare payment / total payment
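
The definitions above can be made concrete with a small sketch. The app itself is written in R; this is only a Python illustration. It assumes the reimbursement rate is the share of the total payment covered by Medicare, which is consistent with the rates of roughly 83-87% quoted later in this post. The class name, field names and dollar amounts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DrgPayment:
    """Average payment figures for one DRG at one hospital (illustrative only)."""

    average_charges: float           # what the provider bills
    average_total_payment: float     # Medicare payment + out-of-pocket payment
    average_medicare_payment: float  # what Medicare reimburses the hospital

    @property
    def out_of_pocket(self) -> float:
        # Rearranged from: total payment = Medicare payment + out-of-pocket payment
        return self.average_total_payment - self.average_medicare_payment

    @property
    def reimbursement_rate(self) -> float:
        # Assumption: share of the total payment that Medicare covers
        return self.average_medicare_payment / self.average_total_payment

    @property
    def forfeited_by_provider(self) -> float:
        # Gap between the billed charges and what is actually paid
        return self.average_charges - self.average_total_payment


# Made-up numbers, not taken from the CMS data.
example = DrgPayment(
    average_charges=40000.0,
    average_total_payment=10000.0,
    average_medicare_payment=8500.0,
)
print(example.out_of_pocket, example.reimbursement_rate, example.forfeited_by_provider)
```

With these made-up figures the patient pays the part of the total payment that Medicare does not cover, and the provider absorbs the much larger gap between charges and total payment.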

Worth noting:

  • Who pays the difference between what the provider charges and Medicare pays? The provider has an agreement with Medicare to accept Medicare’s payment and so forfeits the difference between the charged amount and the payment amount.
  • What is the difference between "Average Charges" and "Average Total Payments?" “Average Charges” refers to what the provider bills to Medicare. “Average Total Payments” refers to what Medicare actually pays to the provider as well as co-payment and deductible amounts that the beneficiary is responsible for and payments by third parties for coordination of benefits.

The relationship among these 4 variables:

[figure]

How much should a Medicare patient expect to pay for a specific DRG procedure?

  1. Grouping 100+ MS-DRG codes into 10 categories: The CMS inpatient charge dataset contains only the 100 most frequently occurring DRGs. It is unclear why CMS disclosed payment information for only 100 DRGs, but the data should be enough for most application users. To facilitate visualization, I grouped the 100 DRGs into 10 buckets based on body system, namely 'Circulate', 'Digestive', 'Infection', 'Kidney', 'Metabolism', 'Nerve', 'Ortho', 'Respiratory', 'Toxic' and 'Others'.
  2. What is MS-DRG: A Medicare Severity Diagnosis Related Group (MS-DRG) is a system for classifying a Medicare patient’s hospital stay into groups in order to facilitate payment for services. The general DRG system separates all potential human disease diagnoses into 20+ body systems, and then subdivides those systems into 450+ groups with 750 DRGs. Fees are assessed by factoring in the body system and groups affected, along with the amount of hospital resources required to treat the condition. MS-DRG offers more precise classification by dividing DRGs into a three-tiered system:
    • (MCC): major complication/co-morbidity
    • (CC): complication/co-morbidity
    • (non-CC): no complication/co-morbidity
  3. Infection and Ortho procedures have the highest average total payment: In the scatter plot, the average total payments by DRG cluster into 4 groups. At $14,240 and $14,825 average total payment, Infection and Ortho are the most expensive procedures. Circulate, Digestive and Respiratory procedures have a lower average total payment of around $9,500. Kidney, Nerve and Toxic expenses are much lower at around $7,700, followed by Metabolism and other procedures at around $6,400. Infection and Ortho procedures are clear outliers in terms of total payment. Possible reasons are as follows:
    • Infection can be as serious as sepsis, which may lead to multi-organ failure or even death. Infections, especially hospital-acquired infections, often require a much longer hospital stay and therefore a higher total payment.
    • The total payment for orthopedic procedures often includes implant devices, which can be a big chunk of the total. Take Total Knee Replacement (TKR) as an example: the implant, as the largest associated cost, can account for up to 87% of the total procedure payment by DRG.
  4. Infection has a much higher Medicare reimbursement rate: Infection-related procedures have the highest average Medicare reimbursement rate at 86.9%. This phenomenon is a consequence of the Hospital-Acquired Condition (HAC) Reduction Program. When a patient gets a hospital-acquired infection, which is largely avoidable, Medicare takes over the total payment, so there are no out-of-pocket costs for beneficiaries. [figure: Medicare reimbursement rate by DRG]
  5. Ortho is the most expensive procedure for Medicare beneficiaries in terms of co-payment: Ortho patients have to pay $2,367 directly out of pocket, while other procedures require no more than $1,500. As mentioned above, implant devices are the biggest chunk of the total cost. [figure: out-of-pocket payment by DRG]
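
The bucket-level comparison described in this list can be sketched roughly as follows. This is a Python illustration, not the R code behind the app. The bucket names come from the post, but the DRG-to-bucket assignments, the `summarize_by_bucket` helper and every dollar amount are assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from MS-DRG code to body-system bucket.
# The assignments are assumptions for illustration, not the author's mapping.
DRG_BUCKET = {
    871: "Infection",
    470: "Ortho",
    292: "Circulate",
}


def summarize_by_bucket(rows):
    """Average total payment, out-of-pocket payment and reimbursement rate per bucket.

    Each row is (drg_code, average_total_payment, average_medicare_payment).
    DRG codes without a mapping fall into the 'Others' bucket.
    """
    grouped = defaultdict(list)
    for drg_code, total_payment, medicare_payment in rows:
        bucket = DRG_BUCKET.get(drg_code, "Others")
        grouped[bucket].append((total_payment, medicare_payment))

    summary = {}
    for bucket, payments in grouped.items():
        avg_total = mean(t for t, _ in payments)
        avg_medicare = mean(m for _, m in payments)
        summary[bucket] = {
            "avg_total_payment": avg_total,
            "avg_out_of_pocket": avg_total - avg_medicare,
            # Ratio of averages; averaging per-row ratios would give a different number
            "reimbursement_rate": avg_medicare / avg_total,
        }
    return summary


# Made-up rows, not CMS data.
rows = [
    (871, 14000.0, 12000.0),
    (871, 15000.0, 13000.0),
    (470, 15000.0, 12500.0),
    (292, 9500.0, 8000.0),
    (999, 6000.0, 5000.0),
]
summary = summarize_by_bucket(rows)
for bucket, stats in sorted(summary.items()):
    print(bucket, stats)
```

The sketch reports the reimbursement rate as a ratio of bucket averages. Whether the original analysis did that or averaged per-hospital rates is not stated in the post.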

Exploring the relationship between Medicare payment and service quality:

  1. A 1-star hospital should definitely be your LAST CHOICE: Hospitals with the lowest quality rating charge the most on average ($12,554), and they have the highest Medicare reimbursement rate (86%). Failed services that correlate with higher infection rates likely account for the highest Medicare coverage rate. Despite the high Medicare reimbursement rate, beneficiaries still pay the same out of pocket as they would for better-quality service. Thus, there is no quality or cost benefit in selecting a 1-star hospital. A 2-star hospital may be considered in an emergency if there are no better hospitals around you; otherwise it is not recommended due to quality of service.
  2. 3- and 4-star hospitals might be the best choice for Medicare patients with a limited budget: 3- and 4-star hospitals can provide Medicare patients with average or above-average quality of service at an affordable out-of-pocket payment. The average total payment for those hospitals is $9,800 - $9,900. They also get a good reimbursement rate (83%).
  3. Premium option - 5-star hospital: 5-star hospitals provide patients with the best service but also charge patients the most. The average out-of-pocket payment is much higher than at any other type of hospital, but in this case the higher cost does correlate with better service. Top-ranking hospitals, such as Hospital for Special Surgery and Mayo Clinic, have continuously invested in quality management and acquired the most sophisticated equipment and the best providers to safeguard patient safety and maximize outcomes. However, since there are only a handful (82) of 5-star hospitals in the U.S., patients may need to wait and travel across states to receive the best services. Thus, 5-star hospitals are an option for Medicare patients who have the means to pay for them and whose health issues don’t require immediate attention.

  [figure: reimbursement rate by hospital rating]

Project Scope and Deliverable:

  • Key UI Features and Design Philosophy: [figure: user interface explanation]
  1. Find the best hospital: This is the brain of the application. Medicare patients select inputs such as DRG, location, distance and quality preference, and the 'Brain' pins down qualifying hospitals and recommends those with better quality and relatively lower cost.
    1. Choose your DRG: The first and most important input is 'What type of procedure do you want to get?' The drop-down list provides the 100 DRGs for users to choose from. However, the design has two limitations:
      • Patients are not familiar with DRGs: The MS-DRG system was originally designed for reimbursement purposes. Users without medical knowledge would likely get confused. Plain-language questionnaires are needed to help users find the DRG that best describes their case.
      • Only some DRGs are listed: Only the 100 most frequently used DRGs are listed, so patients might not be able to find a fitting DRG. As CMS updates the dataset, more DRGs will be added.
    2. Your location & search hospital within X miles: This feature is designed to meet the needs of patients in emergency situations. For instance, if a patient suffers a bone fracture in lower Manhattan and needs emergency care, they can readily narrow the options down to only 3 hospitals by searching within 3 miles, filtering out all other distractions. To further develop this feature, I will add an Emergency / Non-emergency button to switch search styles.
    3. Select hospitals with X stars of quality care: According to the previous analysis, 1- and 2-star hospitals are not recommended. Users can filter out the lower-quality hospitals to remove them from consideration altogether. Users with a limited budget can also filter out 5-star hospitals to avoid unaffordable out-of-pocket payments.
    4. Cost and Quality comparison: I call this application a simple recommendation system because the recommendation mechanism is based on a simple norm: patients always want better service and lower, or at least affordable, cost. The methodology is simple: divide the two-dimensional plane (out-of-pocket payment vs. hospital quality index) into 4 quadrants using the cost median and the quality median. To further differentiate hospital quality, I added another 6 sub-indexes on top of the overall quality rating to generate a new quality score with a range up to 100. Here is the equation: [figure: equation] [figure: quality_cost]
    5. Markers & hover popup: The recommended hospitals are shown on a leaflet map using green markers. A dark green marker represents a hospital with better quality service and lower price. A light green marker represents better service but higher cost. Hospitals that are not recommended are shown with orange or red markers. The popup window presents detailed information, including 'Total inpatient discharge from 2011 to 2014,' 'Average total payment,' and 'Average medicare payment.' With this information, patients can gain a comprehensive picture of a hospital for a specific DRG.
  2. Explore Medicare: The playground for users who want to explore Medicare data by themselves.
    1. Map: The map offers users a lot of flexibility, with options to change the color and size variables. The inspiration came from the 'SuperZip' example by Joe Cheng.
    2. Medicare payment analytics: This is a two-dimensional scatter plot with points stratified by hospital type. Users can change the X and Y axes to different cost variables. A regression analysis is also attached to help patients understand cost differences among hospital types.
  • The Delivered Product:

[figure: delivered product]
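
The 'Find the best hospital' flow described under Key UI Features (filter by distance and star rating, then split the remaining hospitals into quadrants by median cost and median quality) can be sketched as below. This is a Python illustration under several assumptions, not the app's R code. The quality score is taken as a given number because the scoring equation is not reproduced here. The medians are computed over the filtered hospitals, ties count as 'better' or 'lower', and orange is taken to mean lower quality at lower cost while red means lower quality at higher cost. The function names, field names, colour strings and hospital records are all hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def recommend(hospitals, user_lat, user_lon, max_miles, allowed_stars):
    """Filter by distance and star rating, then assign a quadrant colour."""
    nearby = [
        h for h in hospitals
        if h["stars"] in allowed_stars
        and haversine_miles(user_lat, user_lon, h["lat"], h["lon"]) <= max_miles
    ]
    if not nearby:
        return []

    # Assumption: medians are taken over the filtered hospitals only.
    cost_median = median(h["out_of_pocket"] for h in nearby)
    quality_median = median(h["quality_score"] for h in nearby)

    results = []
    for h in nearby:
        better_quality = h["quality_score"] >= quality_median
        lower_cost = h["out_of_pocket"] <= cost_median
        if better_quality and lower_cost:
            colour = "darkgreen"   # better quality, lower cost
        elif better_quality:
            colour = "lightgreen"  # better quality, higher cost
        elif lower_cost:
            colour = "orange"      # assumed: lower quality, lower cost
        else:
            colour = "red"         # assumed: lower quality, higher cost
        results.append((h["name"], colour))
    return results


# Hypothetical hospitals around lower Manhattan; all values are made up.
hospitals = [
    {"name": "A", "lat": 40.71, "lon": -74.00, "stars": 4, "out_of_pocket": 1000, "quality_score": 80},
    {"name": "B", "lat": 40.72, "lon": -74.01, "stars": 3, "out_of_pocket": 1500, "quality_score": 85},
    {"name": "C", "lat": 40.73, "lon": -73.99, "stars": 3, "out_of_pocket": 900, "quality_score": 60},
    {"name": "D", "lat": 40.70, "lon": -74.01, "stars": 4, "out_of_pocket": 1600, "quality_score": 55},
    {"name": "E", "lat": 41.50, "lon": -74.00, "stars": 5, "out_of_pocket": 2500, "quality_score": 95},
    {"name": "F", "lat": 40.71, "lon": -74.00, "stars": 1, "out_of_pocket": 1200, "quality_score": 30},
]
results = recommend(hospitals, 40.71, -74.00, max_miles=3, allowed_stars={3, 4, 5})
print(results)
```

In this toy run, hospital E is dropped for being too far away and F for its 1-star rating; the remaining four land in one quadrant each.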


  1. NYC Data Science: Shu Yan
  2. Inspiration from the ‘SuperZip’ example by Joe Cheng


[1]: Hospital Billing Varies Widely -- But Quality Has Nothing To Do With It

[2]: Why Nonprofits are the Most Profitable Hospitals in the US

[3]: Only about 2 percent of the nation's hospitals get 5-star quality rating

Data Source:

  1. Medicare Provider Utilization and Payment Data: Inpatient
  2. Hospital Compare Datasets

About Author

William Zhou

William Zhou is a quantitative thinker and deep learning enthusiast with a strong background in healthcare. After graduating from Soochow University in Pharmaceutical Science, he obtained an MHA from Columbia University in Healthcare Management. In the following 2 years,...