Investing in Colorado's 14ers

Posted on Jul 28, 2019

Hiking Colorado's 14ers is one of the most popular summer pastimes in the state. Every year thousands of people, hoping to get away from civilization and take in a beautiful mountain sunrise, head to one of Colorado's 53 14ers (58, depending on who you ask). Some of these 14ers are as easy as a walk in the park, and some should only be attempted by experienced hikers.

In this analysis I looked at many different aspects of the 14ers. All of the data was scraped from 14ers.com, a popular site used by most hikers. Route data, mountain data, and weather data were the primary sources of information for this project. I directed my analysis toward the outdoor industry. There are many opportunities for outdoor companies to get involved in trail restoration. This data could also be used to glean insights into which items hikers will need and where (based on weather data, route length, etc.).

Working exclusively with BeautifulSoup, I began by scraping high-level mountain data: the class of the hike, the mountain name, and the elevation above sea level. I then cleaned the data returned from the page, which involved building a regex to determine whether each mountain was, in fact, a 14er. 13ers are also widely popular in Colorado but are outside the scope of this project. Finally, I segmented the data so that each mountain could be parsed independently.
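
The original code for this step is not reproduced here, so the following is only a minimal sketch of what a regex-based 14er check might look like. It assumes elevations arrive as strings such as "14,433'" or "13988 ft"; the function names, the pattern, and the sample rows are all hypothetical and not taken from the project.

```python
import re

# Assumption: elevations are strings like "14,433'" or "13988 ft".
# The pattern captures the thousands and hundreds parts around an optional comma.
ELEVATION_RE = re.compile(r"(\d{1,2}),?(\d{3})")


def parse_elevation(text):
    """Return the elevation in feet as an int, or None if no number is found."""
    match = ELEVATION_RE.search(text)
    if match is None:
        return None
    return int(match.group(1) + match.group(2))


def is_14er(text):
    """True when the parsed elevation is at least 14,000 and below 15,000 feet."""
    elevation = parse_elevation(text)
    return elevation is not None and 14000 <= elevation < 15000


# Hypothetical scraped rows: (mountain name, elevation string, hike class).
rows = [
    ("Example Peak A", "14,433'", "Class 1"),
    ("Example Peak B", "13,988'", "Class 2"),
    ("Example Peak C", "14005 ft", "Class 3"),
]

# Keep only the 14ers; 13ers are out of scope.
fourteeners = [row for row in rows if is_14er(row[1])]
```

Filtering on the parsed number rather than on the raw string keeps the check independent of punctuation such as commas or a trailing foot mark.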

 

This process was followed for scraping route data and weather data as well. Each 14er may have multiple routes to climb in order to reach the summit. This was taken into account later in the analysis. Each mountain forecast provided five days and nights of weather data. The format provided by the site was not in a traditional datetime format and required manipulation in order to determine the exact date for the weather on any given run of the scraping code. 

In order to pull the raw data I looped through the route URL for each mountain and returned a list of the forecast for each mountain for the next five days.
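
The loop itself might be sketched roughly as follows. This is not the project's code: to stay dependency-free it uses the standard library's `html.parser` instead of BeautifulSoup, and the `td class="forecast"` markup, the `collect_forecasts` function, the URL, and the sample page are all assumptions made for illustration. The page fetcher is passed in as a function so the loop can be tried without touching the network.

```python
from html.parser import HTMLParser


class ForecastParser(HTMLParser):
    """Collect the text of every <td class="forecast"> cell (assumed markup)."""

    def __init__(self):
        super().__init__()
        self._in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and ("class", "forecast") in attrs:
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def collect_forecasts(urls, fetch):
    """Map each mountain name to the list of forecast strings on its page.

    urls:  dict of mountain name -> page URL
    fetch: callable taking a URL and returning the page HTML as a string
    """
    forecasts = {}
    for mountain, url in urls.items():
        parser = ForecastParser()
        parser.feed(fetch(url))
        forecasts[mountain] = [cell.strip() for cell in parser.cells]
    return forecasts


# Hypothetical page and fetcher standing in for a real HTTP request.
FAKE_PAGES = {
    "https://example.com/peak-a": (
        '<table><tr><td class="forecast">Today: sunny</td>'
        '<td class="forecast">Tonight: clear</td></tr></table>'
    ),
}


def fake_fetch(url):
    return FAKE_PAGES[url]


forecasts = collect_forecasts(
    {"Example Peak A": "https://example.com/peak-a"}, fake_fetch
)
```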

 

The data returned needed significant cleaning and had to be assigned a real date, since each forecast came back only as a string, as seen below:

In order to clean the dates, I built a conversion dictionary which provided a key and value to add n-hours to the index of the returned weather. For example, the weather always begins with Today’s High and continues in 12-hour intervals. Therefore each index is approximately 12 hours later than the previous index. You can see here how it’s implemented:
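
Since the original snippet is not shown, here is a minimal sketch of the idea under stated assumptions: ten forecast periods per mountain (five days and five nights), each 12 hours after the previous one, anchored to the time the scraper ran. The names `HOURS_AFTER_SCRAPE` and `date_forecasts` and the sample forecast strings are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed conversion dictionary: forecast index -> hours to add to the scrape time.
# Index 0 is "Today's High"; every following period is 12 hours later.
HOURS_AFTER_SCRAPE = {index: 12 * index for index in range(10)}


def date_forecasts(raw_forecasts, scraped_at):
    """Pair each forecast string with an approximate datetime.

    raw_forecasts: dict of mountain name -> list of forecast strings, in page order
    scraped_at:    datetime at which the scrape ran
    Returns a dict of mountain name -> list of (datetime, forecast) tuples.
    """
    dated = {}
    for mountain, periods in raw_forecasts.items():
        dated[mountain] = [
            (scraped_at + timedelta(hours=HOURS_AFTER_SCRAPE[index]), text)
            for index, text in enumerate(periods)
        ]
    return dated


# Hypothetical input: three forecast periods for one mountain.
raw = {"Example Peak A": ["High: 55", "Low: 38", "High: 57"]}
dated = date_forecasts(raw, scraped_at=datetime(2019, 7, 28, 6, 0))
```

Because the offsets are relative to the scrape time, the same code assigns correct dates no matter which day it runs.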

The result is a dictionary with a mountain name as the key and tuples with the weather as the values:

This result not only allows me to later use actual dates in any analysis I might do but also to run this on a consistent schedule and build a history of weather on actual dates. 

From here I moved from scraping to my Jupyter notebook for data exploration and initial analysis. Since my main focus was scraping and cleaning the data, my exploration only scratches the surface of what is possible.

In order to call my data from the previous file I called the function and assigned the result to each dataset respectively:

 

I started with some basic questions:

 

I also used Seaborn to visualize some of the data.

 

I used some of the data to look at the weather patterns:

 

Overall, the project allowed me to begin scratching the surface of what's possible. This data can be used to help understand where the best opportunities are for trail improvement and which trails experience harsher weather. This, in turn, can help companies decide not only what to stock but where to stock it.

   

Further research:

  • Detailed data on which routes and mountains are most frequented by hikers of different skills during different times of the year. Possible data here.
  • Data on how much each route is in disrepair. Possible data here.
  • Where most accidents are happening on different mountains. Possible data here.
  • Historical and present weather data in order to better understand long-term inventory. Possible data here.

 

About Author

Katherine Treadwell

Analytics professional passionate about empowering teams to make well-informed, data-driven decisions. Proven leader and strategic partner in developing long-term plans for success. Strong foundation in technology and business applications. Teacher, developer, and mentor of team members and the...