Top 200 Common Passwords of 2020

Posted on Feb 3, 2022

Introduction

  • An analysis of the top 200 most common passwords across 49 countries in 2020

Questions

  1. How many users in each country were found to have used passwords on the list?
  2. What is the average password length by country?
  3. What is the average time to crack the password by country?
  4. Of the unique passwords, what were the top 5 most used in 2020?
  5. How complex were the passwords?
  6. Was there a big difference in time between cracking a complex password and a non-complex password?
  7. Do passwords get easier to crack over time?

Part 1. Comparing Passwords by Countries

Finding the best, worst, and everyone in between

How many users in each country were found to have used passwords on the list in 2020?

On average, 34,685 people used the passwords on the list. Russia had the highest count at 146,837,497 users, while Estonia had the lowest at 169,656 users.

For a business owner or IT professional, knowing how many users in each country rely on a commonly known password can inform password security decisions or provide a starting point for further investigation.

For example, if you had one company located in Russia and another in Switzerland, you could tell which location is more vulnerable based on the number of people using a known password, and allocate funds to that location for employee security training. You could also research whether there is a particular reason why more users in that location were found to be using known passwords than anywhere else. Perhaps the most used website in that country was breached, or people have a habit of keeping a default password.

 

 

What is the average password length by country in 2020?

Vietnam had the highest average length at 7.87 characters, while Korea had the lowest at 6.54. Overall, the average across all countries was 7.11.

Tracking the average length of known passwords each year could reveal a trend. For example, if the average length of leaked passwords has been increasing over the past few years, you might consider raising the minimum password length for your company, since the trend may indicate that today's technology has become much better at cracking longer passwords.
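The per-country average can be computed with a simple group-and-mean step. The sketch below is only an illustration: the rows and the function name `average_length_by_country` are hypothetical and do not come from the dataset used in this post.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows of (country, password); NOT the real dataset.
rows = [
    ("Country A", "123456"),
    ("Country A", "password"),
    ("Country B", "qwerty"),
    ("Country B", "abc123"),
]

def average_length_by_country(rows):
    """Return {country: mean password length} for (country, password) pairs."""
    lengths = defaultdict(list)
    for country, password in rows:
        lengths[country].append(len(password))
    return {country: mean(values) for country, values in lengths.items()}

averages = average_length_by_country(rows)
print(averages)
```

Running the same function on each year's list would give the year-over-year trend described above.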


 

What is the average time to crack the password (in seconds) by country?

As a disclaimer, the time to crack a password is an estimate based on the brute force method, not the actual time it took to crack the password. The actual time can vary with each attacker's skill set, resources, and many other factors.

It's very important to understand that this method gives the maximum time it would take to crack the password; an attacker could potentially crack it much faster. So you will see very high numbers in this report, but that doesn't mean they are something we can ignore.

To explain the brute force method a bit further: the attacker tries to find the password by trying every possible combination. It's like guessing the combination of a numerical lock by starting from the lowest number and working your way up to the highest.
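A rough way to picture a worst-case brute-force estimate is to divide the number of possible combinations by the attacker's guessing speed. The sketch below is not the formula used to produce the dataset's figures; the function name and the default rate of one billion guesses per second are assumptions chosen purely for illustration.

```python
import string

def worst_case_crack_seconds(password, guesses_per_second=1e9):
    """Estimate the maximum brute-force time for a password, in seconds.

    The search space is (alphabet size) ** (password length), where the
    alphabet is the union of the ASCII character classes the password uses.
    guesses_per_second is an assumed attacker speed, not a value from the post.
    """
    alphabet = 0
    if any(c in string.ascii_lowercase for c in password):
        alphabet += 26
    if any(c in string.ascii_uppercase for c in password):
        alphabet += 26
    if any(c in string.digits for c in password):
        alphabet += 10
    if any(c in string.punctuation for c in password):
        alphabet += len(string.punctuation)  # 32 ASCII symbols
    return alphabet ** len(password) / guesses_per_second

# A 4-digit lock at one guess per second: 10 ** 4 combinations.
print(worst_case_crack_seconds("1234", guesses_per_second=1))
```

Like the numerical lock example, the result is an upper bound: an attacker who tries common passwords first would usually finish much sooner.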

That being said, the country with the highest average time to crack a password in 2020 was India at 4,193 millennia, while the Slovak Republic had the lowest at 1.9 days.

Part 2. Top 5 Unique Passwords

Because the dataset covers the common passwords of multiple countries, it contained duplicate passwords, so I decided to proceed with the research using only unique passwords.

Of the unique password list, these are the top five most used passwords in 2020. I was actually surprised that the super well known passwords like 12345 or asdfg didn't make it on this list.

  1. target123 - 713,609 users
  2. 1q2w3e4r - 713,360 users
  3. qwerty123 - 713,241 users
  4. zaq12wsx - 710,667 users
  5. tinkle - 69,940 users

Part 3. Password Complexity

After I filtered the list to find only the unique passwords, I created a category to determine how complex the password is in order to see what kind of passwords were leaked in 2020

Breakdown: What makes a strong password?

In order to determine the complexity of a password, I dissected it to see whether it contained an uppercase letter, a lowercase letter, a number, or a special character, and whether its length was over 12.

For a while, I considered using length of 8 as that seems to be the recommended length online. However, I've worked for a few companies that required the minimum length to be 12 so ultimately I decided that a strong password should be over 12 characters in length.
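The scoring described above can be sketched as a count of criteria met. This is an approximation rather than the exact code behind the analysis: the name `complexity_score` is hypothetical, and it assumes one point per criterion, with digits counted as their own character type (the post mentions all-number passwords and a maximum complexity of 5).

```python
import string

def complexity_score(password, min_length=12):
    """Count how many of five criteria a password meets (0 to 5)."""
    checks = [
        any(c.islower() for c in password),              # has a lowercase letter
        any(c.isupper() for c in password),              # has an uppercase letter
        any(c.isdigit() for c in password),              # has a number
        any(c in string.punctuation for c in password),  # has an ASCII special character
        len(password) > min_length,                      # longer than 12 characters
    ]
    return sum(checks)

for candidate in ["password", "123456", "Passw0rd!"]:
    print(candidate, complexity_score(candidate))
```

Changing `min_length` to 8 would reproduce the more lenient length rule that was considered and rejected.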

 

Of the unique passwords on the list in 2020, how complex were they?

I found that 84.3% of the passwords had a complexity of 1, which means that they contained only one character type, such as all lowercase or all uppercase.

Of the unique passwords, the majority were all lowercase. The second most common complexity 1 passwords were all numbers.

If a company currently allows users to create passwords that are all lowercase or use just one type of character, it should definitely reconsider its password requirements.

As a side note, I found 2 passwords that used the Russian alphabet. One of them was just the Russian word for password.

Of the unique passwords with a complexity of 1, which character type took the longest to crack?

Passwords that were solely numbers took the longest time to crack in 2020.

The average time to crack a lowercase password was 181 days, compared with 8,371 days for uppercase and 52,920 days for numbers.

Businesses really should consider moving away from allowing passwords that contain only 1 character type. I understand that they may purposefully choose not to set strict password requirements for a specific reason, such as being more user friendly to their target audience.

If this is the case, businesses should take note of this year's hierarchy and compare it to previous years before making a decision.

  1. Lowercase: 15,655,985.54 seconds = 181 days
  2. Uppercase: 723,298,368.33 seconds = 8,371 days
  3. Numbers: 4,572,306,369.11 seconds = 52,920 days
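The day figures in the list follow from dividing the averages, which are in seconds, by the 86,400 seconds in a day. The short check below assumes the fractional part was dropped rather than rounded, since that is what reproduces the uppercase figure of 8,371.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Average crack times in seconds, copied from the list above.
averages_in_seconds = {
    "lowercase": 15_655_985.54,
    "uppercase": 723_298_368.33,
    "numbers": 4_572_306_369.11,
}

# Whole days only; the fractional day is discarded (an assumption).
averages_in_days = {
    kind: int(seconds // SECONDS_PER_DAY)
    for kind, seconds in averages_in_seconds.items()
}

for kind, days in averages_in_days.items():
    print(f"{kind}: {days:,} days")
```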

Relationship between complexity and mean time to crack (PC1 vs PC4)

I also wanted to find out whether there is any relationship between how complex a password is and how long it would take to crack.

Since no password with a complexity of 5 was found in the dataset, I compared the average time to crack a password with a complexity of 1 (PC1) against the average time to crack a password with a complexity of 4 (PC4).

Cracking a PC1 password would take 3,453.9 days on average, while cracking a PC4 password would take 209,596 millennia.

 

Part 4. Passwords Over Time

Of the unique passwords in the list, has the crack time changed from the past years?

Lastly, I thought it would be important to see if there is a significant change in the password crack time over the years.

In order to see whether the average time to crack changes over time, I analyzed how long it would have taken to crack the same unique passwords in the dataset over the past 5 years.

As we can see, passwords are cracked faster over time as technology evolves.

While it took an average of 3,642.84 centuries to crack this password set in 2020, it would have taken an average of 5,472.56 centuries to crack the same passwords in 2015, which is about 50.23% longer.
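The 50.23% figure matches the extra time needed in 2015 measured relative to the 2020 average. A quick check of that arithmetic, with variable names chosen here only for illustration:

```python
centuries_2015 = 5472.56
centuries_2020 = 3642.84

# Extra time needed in 2015, expressed as a percentage of the 2020 time.
percent_longer_2015 = (centuries_2015 - centuries_2020) / centuries_2020 * 100

print(f"{percent_longer_2015:.2f}%")
```

Using the 2015 value as the baseline instead would give a smaller percentage, so it is worth stating which year the comparison is relative to.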

Part 5. Future Work

If I had more time and data, I would love to see how the average password length changes over time. I couldn't analyze the change in length over the years in this project because I was unable to find data from the same source for years before 2019.

I would also like to see whether there is any correlation between top trending words and hashtags and common passwords, and whether I could predict which passwords will show up in next year's list.
