How to Build a Data Science Portfolio That Will Get You Hired

Pranjali Galgali
Posted on Jul 8, 2019

When it comes to breaking into data science, your skills are your foundation and you must be able to demonstrate them.

That’s where a strong portfolio comes in.

With demand for data science skills on the rise, there’s never been a better time for skilled bootcamp students to land roles in data science, data analytics, and machine learning. And a strong data science portfolio can be your best tool to stand out and catch the attention of hiring managers.

In this post, we’re sharing how to create a data science portfolio that will help get you hired. We’ll outline important skills and competencies you should demonstrate, share our tips for selecting a compelling project, and even list some real-life portfolios that check all the boxes.

What employers want to see + skills you should demonstrate

Before we dive into what makes a successful portfolio, we’ll start by detailing the languages, skills, and experience that employers are seeking. It’s important to understand which competencies will transfer to your career in data science.

These are some of the areas you should seek to demonstrate in your portfolio:

Popular Technical Tools

  1. Python
  2. R
  3. Hadoop
  4. SQL
  5. Git
  6. Linux
  7. SAS
  8. MATLAB
  9. Hive
  10. Pig
  11. Spark
  12. Ruby
  13. C++
  14. Perl

Must-have Foundational Skills

  1. Spreadsheet tools (Excel)
  2. Database systems (SQL and NoSQL)
  3. Distributed computing
  4. Data visualization
  5. Predictive modeling
  6. Math
  7. Statistics
  8. Machine learning

Experience

While employers are looking for your portfolio to evaluate the skills you've acquired in a data science bootcamp, many will also want applicants to have earned higher education credentials.

Either way, a portfolio is a great way to showcase what you’ve learned in a bootcamp or demonstrate how you’ve used a bootcamp to build upon a previously earned advanced degree.

Creating your portfolio

Our biggest tip for creating your portfolio is to select a practical project that can be applied in everyday life. In addition to being relatable, your project should illustrate a well-rounded set of technical skills. So, while it can be tempting to focus specifically on modeling, make sure you’re demonstrating good coding practices (include a README and documentation), presentation skills, and good business sense. You should also consider using version control, such as Git, to show familiarity with the concept.

What’s a relatable topic? Here’s an example.

Consider Google, Amazon, and Facebook. Viewed as portfolio concepts, Google is simply a tool that helps people find what they're looking for, Amazon helps users find goods conveniently, and Facebook lets users chat with others.

These companies predominantly offer a practical service that is widely usable. Everyone uses search engines, shops for goods, and wants to stay connected to friends and family. Because of their universal appeal, Google, Amazon, and Facebook have become incredibly popular and experienced impressive longevity.

Your data science portfolio projects should take the same approach.

Rather than designing a self-driving car, where fancy models end up demonstrating a narrow application, consider something with wider appeal! How about an app that tracks a user’s time spent in traffic? You could answer questions like “Which days have the heaviest traffic?”, “Which routes have the lightest traffic?”, or “What is the best time for me to begin my commute?” Now that would be easy for users to connect with!
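To make the traffic idea concrete, here is a minimal sketch of how two of those questions could be answered in Python. Everything in it is an assumption for illustration: the commute log is made-up data, and the names `COMMUTE_LOG` and `average_by` are hypothetical. A real project would pull its records from an app, a GPS export, or a public traffic dataset.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical commute log: (departure time, minutes spent in traffic).
# The values are invented purely for illustration.
COMMUTE_LOG = [
    ("2019-07-01 08:00", 42),
    ("2019-07-01 17:30", 55),
    ("2019-07-02 08:00", 35),
    ("2019-07-02 09:00", 25),
    ("2019-07-03 08:00", 38),
    ("2019-07-03 07:00", 20),
    ("2019-07-05 17:30", 60),
    ("2019-07-05 07:00", 22),
]


def average_by(log, key_func):
    """Group traffic minutes by key_func(departure) and average each group."""
    groups = defaultdict(list)
    for timestamp, minutes in log:
        departure = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        groups[key_func(departure)].append(minutes)
    return {key: mean(values) for key, values in groups.items()}


# "Which days have the heaviest traffic?"
by_weekday = average_by(COMMUTE_LOG, lambda d: WEEKDAYS[d.weekday()])
heaviest_day = max(by_weekday, key=by_weekday.get)

# "What is the best time for me to begin my commute?"
by_hour = average_by(COMMUTE_LOG, lambda d: d.hour)
best_hour = min(by_hour, key=by_hour.get)

print("Heaviest traffic day:", heaviest_day)
print("Best departure hour:", best_hour)
```

Even a small analysis like this gives you something to document, visualize, and present, which is exactly the well-rounded skill set a portfolio should show.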

Here’s a recap

  1. Select a practical project
  2. Illustrate a well-rounded set of skills, including:
    • Coding Practices
    • Presentation skills
    • Business sense
    • Version control (worth considering)

Example Portfolios

Now that we’ve touched on the skills your project should demonstrate and our tips for selecting your project, we are excited to share some successful NYC Data Science Academy portfolios.

The following projects demonstrate a strong understanding of concepts, as well as clearly defined audiences, goals, and results accomplished.

More than that, they are engaging and tell a good story that can be readily understood. It is important that a portfolio be understandable to a less technical reader, while also including technical explanations for more academic-minded readers.

Check out these impressive projects:

    • Mikhail Stukalo - Shiny App for asset allocation backtesting

      For his project, Mikhail created a prototype tool for backtesting portfolio allocations. His goal was to help asset managers evaluate the performance of a client’s portfolio and explain the key principles of modern portfolio theory. This project also gave investors a tool to evaluate the historical performance of different combinations of asset classes and to compare their portfolio to an optimal portfolio.

    • Fangye Shi - Forecasting Cryptocurrency Price Trends

      Fangye's project is useful in evaluating the price trends of Bitcoin, as well as the frequency of keywords in the news. Notice that this project illustrates the period from late 2017 to early 2018 when the price of Bitcoin rose steeply, and then suddenly collapsed.

    • Henry Crosby - Improving a Music Website's User Experience

      Henry's project was a collaboration with a local startup, which he advised on its data science needs. What made this project great was that it had a demonstrated, real-world application. It required multiple team members to complete, and Henry did a great job mapping out the end-to-end process. He included some of the details that often go unnoticed in the data science scene, such as databases, business goals, and the team collaboration process.

    • Josh Vichare, Jake Ralston, Olga Yangol and Igor Gitlevich - Using Machine Learning to Build a Predictive Model for House Prices

      This project is a great example of how to tackle a machine learning task in a streamlined manner. It touches upon a common topic, illustrates the mathematical principles at a high level, and uses industry-standard models.

      Check out this user-friendly plot our students created to illustrate home sale price per neighborhood. This is what we mean when we say content should be understandable by a non-technical reader!

[Figure: home sale price per neighborhood]

    • Kelly Ho, Silvia Lu & Zhenggang Xu - Fashion Rec.

      Kelly, Silvia, and Zhenggang's project is another great example of a portfolio that uses an understandable model. Their project was essentially a "clothing recommender system" that evaluated a user's Instagram subscriptions and a specified price range to suggest clothing. They used web scraping to obtain product details from Macy’s, Bloomingdale’s, H&M, ASOS, and Fashion Nova, as well as the likes, comments, and posts from 13 Instagram influencers. They clustered the data using a natural language processing technique called Latent Dirichlet Allocation (LDA) and served the results through a Flask app on AWS.

    • Bruce Chyi - Starbucks Around The World

      Bruce comes from a CS background, and while his blog writeup is somewhat sparse, the code for his project is concise and well written. He demonstrated good habits, including commenting, well-defined variable names, and a good walkthrough of how to handle messy data from multiple sources in an organized manner.

More about NYC Data Science Academy

Here at NYC Data Science Academy, we provide technical and strategic training on full- and part-time schedules. Our educational, training, and career development services draw on a wealth of experience in data science, which gives us great insight into what a successful portfolio looks like. Read more about us, and check out our reviews on SwitchUp!

By the end of the program, NYCDSA students complete at least four real-world data science projects like the ones above. These projects showcase students’ knowledge to prospective employers.

In addition to projects, students also participate in presentations and job interview training to ensure they are prepared for top data science positions in prestigious organizations.

93% of NYC Data Science Academy students are hired within six months of graduation, and alumni are currently working at companies like the ones seen below.

It’s impressive, we know!

[Image: companies where NYC Data Science Academy alumni work]


There you have it! Now that you understand how to create a successful portfolio, and you’ve checked out some strong examples, you’re ready to get started.

So, what’s your portfolio idea?

Because the demand for data scientists is on the rise, and the supply of talent can’t keep up, ‘data scientist’ is one of the top tech jobs of 2019! If you’re ready to dive into data science and get help navigating the portfolio creation process, NYC Data Science Academy can help you get there. Don’t believe us? Check out our awesome reviews on SwitchUp - see for yourself!

About Author

Pranjali Galgali, Marketing and Communications Associate, NYC Data Science Academy