Data Helps to Find the Best German Health Insurance Company
Data Science Background
Germany has 110 different statutory health insurance companies with different premium rates and service features. Finding the best health insurance available in the federal state you live in can take a lot of research and data collection. My goal for this web scraping project was to develop an app that does the job for you.
The German news magazine Focus provides a list of 75 health insurance companies that are open to everyone living in a federal state where they provide service, as well as a tool to compare them. Each health insurance entry links to a separate page with information about where the insurance is available, its service features, its bonuses, which dental services are included, and a couple of other features. The main website was scraped using Scrapy to get the name of each insurance company, its premium rate, and a list of links to the extra information.
Given the time limitation, the main features used to differentiate the providers were bonuses and naturopathy. Bonuses include:
- cancer screening
- skin cancer screening
- yearly dentist visit
- no smoking
- gym membership
Naturopathy comprises five options, among them osteopathy and Ayurveda.
After scraping the websites, a column with the number of insured persons was added, and the scraped data was cleaned and simplified. Each bonus and naturopathy column was converted to TRUE or FALSE.
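The conversion of feature columns to TRUE or FALSE could look roughly like the following. This is a minimal sketch, not the project's actual cleaning code: the column names, the set of "truthy" strings, and the example insurer are all assumptions made for illustration.

```python
# Hypothetical spellings that count as "covered" in the scraped text.
TRUTHY = {"ja", "yes", "true", "x", "1"}


def to_bool(raw):
    """Map a scraped cell to True/False; missing or unknown text counts as False."""
    if raw is None:
        return False
    return raw.strip().lower() in TRUTHY


def clean_row(row, feature_columns):
    """Return a copy of row with every feature column converted to a boolean."""
    cleaned = dict(row)
    for column in feature_columns:
        cleaned[column] = to_bool(row.get(column))
    return cleaned


# Hypothetical column names and a made-up insurer, for illustration only.
FEATURES = ["skin_cancer_screening", "gym_membership", "osteopathy"]
raw_row = {
    "name": "Example Krankenkasse",
    "skin_cancer_screening": "Ja",
    "gym_membership": "",
    "osteopathy": " ja ",
}

cleaned_row = clean_row(raw_row, FEATURES)
print(cleaned_row)
```

Treating empty or unrecognized cells as False keeps the table simple, at the cost of hiding the difference between "not covered" and "not listed".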
The smallest insurance company is BKK MTU with 4,288 members, while the Techniker Krankenkasse covers 9,937,314 members. The so-called "Beitragssatz" (premium rate) is set at 14.6 percent: half of that (7.3 percent) is covered by the employer, and the other 7.3 percent plus any additional premium rate charged by the insurer is deducted from the gross income. VIACTIV Krankenkasse and SECURVITA Krankenkassen have the highest rate at 16.3 percent. The Metzinger Krankenkasse currently has the lowest premium rate at 14.6 percent and covers none of the selected bonuses or naturopathy options. SECURVITA Krankenkassen, IKK Südwest and IKK Brandenburg & Berlin cover all of the selected features.
73 out of 75 insurance companies give out a bonus for skin cancer screening, while only 40 provide a bonus for check-ups. In naturopathy, osteopathy is the most frequently covered feature (63 out of 75), while Ayurveda is covered by only six.
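To make the rate split concrete, here is a small sketch of the arithmetic as described above: the employee pays half of the 14.6 percent base rate plus whatever the insurer charges on top. The function names are hypothetical, and the calculation is a simplification that ignores any caps or special rules. With the highest rate of 16.3 percent, the employee share comes to 7.3 + 1.7 = 9.0 percent of gross income.

```python
BASE_RATE = 14.6  # general premium rate in percent, as stated in the text


def employee_rate(total_rate):
    """Employee share in percent: half the base rate plus the insurer's surcharge."""
    additional = total_rate - BASE_RATE
    if additional < 0:
        raise ValueError("total rate cannot be below the base rate")
    return BASE_RATE / 2 + additional


def monthly_contribution(gross_income, total_rate):
    """Amount deducted from the monthly gross income, rounded to cents."""
    return round(gross_income * employee_rate(total_rate) / 100, 2)


# Example with a made-up gross income of 3000 per month.
lowest = monthly_contribution(3000, 14.6)
highest = monthly_contribution(3000, 16.3)
print(lowest, highest)
```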
The app lets users select their place of residence from the 16 federal states in Germany and enter their income to calculate the premium they would pay. Underneath, users can select from seven bonuses and five naturopathy options, depending on what they want to be part of their insurance plan. The app can be found here.
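The selection logic behind such an app can be sketched in a few lines. The app itself is a Shiny app, so the Python below is not its code; it only illustrates the idea of keeping insurers that serve the chosen region and cover every selected feature, then sorting by premium rate. The `Insurer` class, the feature names, and the sample insurers are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Insurer:
    name: str
    rate: float                                   # total premium rate in percent
    regions: set = field(default_factory=set)     # where the insurer is available
    features: set = field(default_factory=set)    # bonuses and naturopathy covered


def matching_insurers(insurers, region, wanted):
    """Insurers available in region that cover all wanted features, cheapest first."""
    wanted = set(wanted)
    hits = [i for i in insurers if region in i.regions and wanted <= i.features]
    return sorted(hits, key=lambda i: (i.rate, i.name))


# Made-up sample data, not taken from the scraped table.
sample = [
    Insurer("Kasse A", 15.5, {"Bayern", "Berlin"}, {"osteopathy", "gym_membership"}),
    Insurer("Kasse B", 14.9, {"Bayern"}, {"osteopathy"}),
    Insurer("Kasse C", 15.1, {"Berlin"}, {"osteopathy", "gym_membership"}),
]

result = matching_insurers(sample, "Bayern", ["osteopathy"])
print([i.name for i in result])
```

Using a subset check (`wanted <= i.features`) means an empty selection matches every insurer in the region, which mirrors a user who ticks no boxes.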
Data Science Conclusion and Next Steps
You might expect the premium rate to increase as an insured person selects more service features, but this is not the case, because German law requires everyone to be insured and does not allow health insurers to refuse anyone. Instead, the rate mirrors the collective health of the members as well as the economic situation of the insurance company. Unfortunately, data on the insured is rarely made public; otherwise it would be possible to relate the features to the rate and see which feature is the most expensive or least profitable. Another problem was that the member counts aren't fully accurate: some insurance companies had an updated number, while others had numbers from 2016.
The next step would be to include more features to select from and to be more precise about each feature. To simplify the project, the features were reduced to True or False, but some insurance companies may cover only 80 percent instead of the full cost for certain non-essential features.
The code for the scraping and the Shiny app can be found here.
The skills I demonstrated here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.