Fish Communities In Florida, Surveying the Coral Reef

Posted on Apr 28, 2019


As a student who is very passionate about wildlife conservation, I wanted to visualize how fish populations have changed over recent years. I found that the National Oceanic and Atmospheric Administration (NOAA) conducts a Coral Reef Monitoring Program, which records data from coral reef sites every two years. The dataset I chose to work with surveys the coral reef tract of the Florida Keys.


This dataset contains observations taken in 2014 and 2016 of coral reef fish communities located in the Florida Keys Reef Tract. Each observation records many attributes of the fish seen and the habitat where it was found. Measurement stations were divided into 3 main regions: Dry Tortugas, Key West to Miami, and Miami to North Martin County. Within these regions, 20 different coral habitat types and 406 different fish species were observed.

I chose a few attributes of this dataset to explore: region, habitat type, depth of the observation, and number of fish seen during the observation. My goal was to find changes in these attributes from 2014 to 2016. Because of climate change, I expected to see fewer fish overall in 2016, and a larger percentage of the total found at greater depths.



Measurement stations located in the Dry Tortugas region.


Measurement stations located in the Key West to Miami region.


Measurement stations located in the Miami to North Martin region.



When I initially saw the decrease in the number of fish seen from 2014 to 2016, I was extremely concerned. The number of fish seen in 2016 was only 62% of the number seen in 2014. However, further investigation revealed that far fewer observations were made in 2016 than in 2014. The ratios of fish seen per observation are much closer between the two years, though they still show a decrease from 2014 to 2016 (0.666 and 0.590, respectively).
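The normalization described above can be sketched in a few lines of Python. This is only an illustration: the records below are made-up toy values, not the NOAA data, and the function name `fish_per_observation` is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records: (survey_year, number_of_fish_seen).
# These values are invented for illustration and are NOT the NOAA data.
records = [
    (2014, 4), (2014, 0), (2014, 2), (2014, 6),
    (2016, 3), (2016, 1), (2016, 2),
]

def fish_per_observation(rows):
    """Return {year: total fish seen / number of observations} for each year."""
    totals = defaultdict(float)
    n_obs = defaultdict(int)
    for year, count in rows:
        totals[year] += count
        n_obs[year] += 1
    return {year: totals[year] / n_obs[year] for year in totals}

rates = fish_per_observation(records)

# Raw totals ignore how many observations were made in each year.
total_2014 = sum(count for year, count in records if year == 2014)
total_2016 = sum(count for year, count in records if year == 2016)
raw_ratio = total_2016 / total_2014

# Per-observation rates correct for the difference in survey effort.
rate_ratio = rates[2016] / rates[2014]
```

In this toy example the raw total halves between the two years, while the per-observation rate falls by only a third, which is the same kind of gap between raw and normalized figures described above.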

Since there were 406 species in this dataset, I decided to focus only on the 12 most common species in each region and see if any of them contributed to the overall decrease in fish seen per observation.
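One way to pick out the most common species in each region is sketched below. The rows are invented for illustration (only the region names and the masked goby come from this post), and `top_species_by_region` is a hypothetical helper, not the code behind the graphs.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (region, species, number_seen). Counts are invented,
# and the species other than masked goby are placeholders for illustration.
rows = [
    ("Dry Tortugas", "masked goby", 50),
    ("Dry Tortugas", "bluehead", 20),
    ("Dry Tortugas", "masked goby", 30),
    ("Dry Tortugas", "yellowtail snapper", 5),
    ("Key West to Miami", "bluehead", 40),
    ("Key West to Miami", "bicolor damselfish", 25),
    ("Key West to Miami", "masked goby", 3),
]

def top_species_by_region(rows, n=12):
    """Return {region: [(species, total_seen), ...]} with the n largest totals."""
    counts = defaultdict(Counter)
    for region, species, seen in rows:
        counts[region][species] += seen
    return {region: c.most_common(n) for region, c in counts.items()}

# n=2 here only because the toy data is tiny; the post uses the top 12.
top2 = top_species_by_region(rows, n=2)
```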

These graphs show the number seen of the 12 most common fish in each region, first overall and then by year. A few interesting things stand out. In the Dry Tortugas region, the number of masked goby seen is 150,000 more than any of the other most common fish. The same pattern does not occur in the other two regions, and we also see a very drastic drop in the number of masked goby seen from 2014 to 2016.

I assumed at first that this must have been an input error in the data, and I would perhaps find one or two observations with an extremely high number seen in comparison to the other masked goby observations. However, this was not the case. The number of masked goby seen per observation follows a steady downward trend in 2014 and even the highest number seen for one observation seemed to be recorded accurately.
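A quick way to test for the kind of input error described above is to ask how much of the total comes from the one or two largest observations. The sketch below uses invented counts and a hypothetical helper, `top_k_share`; it is one possible check, not necessarily the one I ran.

```python
def top_k_share(counts, k=2):
    """Fraction of the total contributed by the k largest observations."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(sorted(counts, reverse=True)[:k]) / total

# Invented per-observation counts, for illustration only.
steady = [900, 850, 800, 700, 650, 600]   # high but gradually declining
spiky = [10000, 12, 9, 8, 7, 5]           # one suspicious entry dominates

steady_share = top_k_share(steady)
spiky_share = top_k_share(spiky)
```

A share close to 1 would point to one or two suspicious entries, while a moderate share, as in the `steady` list, is consistent with consistently high counts across many observations.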

In further work I would like to research this more, since the decrease in masked goby alone accounts for a very large portion of the total decrease in fish seen from 2014 to 2016. If possible, I'd like to find out what the masked goby population in this region looked like before 2014. That would show whether the high numbers seen per observation were the norm or an initial spike in this species and, if it was a spike, what may have caused it.


Last, I wanted to look for a change in the depths at which each species was observed. My initial guess was that, with rising water temperatures, each species might be found more commonly at greater depths in 2016 than in 2014, so that it could remain at a constant water temperature. However, I ran into an issue with the data when exploring this question.



When looking just at the number of fish seen by habitat per year, we see a massive decrease for some of the habitats from 2014 to 2016. At first glance, this could appear to support my initial guess that fish would be found more commonly at greater depths in 2016 and would therefore leave behind certain habitat types found at lesser depths. However, I found a significant change in the pattern of observations taken from 2014 to 2016.


Here we see that from 2014 to 2016 there was an increase in observations taken at lesser depths and a decrease in observations taken at mid-range depths. Neither the data nor the information provided with it explains why the depths and habitat types at which observations were taken changed so drastically. At first I wondered whether the researchers had simply gotten lazy in 2016 and stopped going to greater depths, but the ratio of observations at greater depths barely changes. In further work, I would like to research what drove this unexplained change in observation patterns.
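The comparison above amounts to computing, for each year, the share of observations that fall into each depth band. Here is a minimal sketch; the depth cutoffs, the assumption that depth is in meters, and the observation values are all invented for illustration and are not taken from the dataset.

```python
from collections import Counter, defaultdict

BANDS = ("shallow", "mid", "deep")

def depth_band(depth_m, shallow_max=10.0, mid_max=20.0):
    """Assign a depth to a band. The cutoffs are hypothetical, not from the survey."""
    if depth_m < shallow_max:
        return "shallow"
    if depth_m < mid_max:
        return "mid"
    return "deep"

def band_shares(observations):
    """Return {year: {band: share of that year's observations in the band}}."""
    counts = defaultdict(Counter)
    for year, depth_m in observations:
        counts[year][depth_band(depth_m)] += 1
    return {
        year: {band: c[band] / sum(c.values()) for band in BANDS}
        for year, c in counts.items()
    }

# Invented (year, depth) pairs, for illustration only.
observations = [
    (2014, 5), (2014, 12), (2014, 15), (2014, 18), (2014, 25),
    (2016, 4), (2016, 6), (2016, 8), (2016, 14), (2016, 26),
]

shares = band_shares(observations)
```

In this toy data the shallow share rises and the mid-range share falls between the two years while the deep share stays the same. When the sampling pattern shifts like this, a change in fish counts by depth or habitat cannot be read directly as a change in where the fish are.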

Thank You!

Thank you for taking the time to read about my analysis of the National Coral Reef Monitoring Program: Assessment of coral reef fish communities in the Florida Reef Tract! If you would like to see my corresponding interactive web app, please visit the link below!

About Author

Kat Kennovin

Data scientist with a quantitative science and analytical background. Strong communication skills driven by multi-team collaboration work experience, a team-player mindset, and the ability to simplify complex problems.