Six Tips on Building a Data Science Team at a Small Company

Posted on Dec 18, 2020

When a company decides to start leveraging its data for the first time, the task can be daunting. Many businesses aren’t fully aware of all that goes into building a data science department. If you're the data scientist hired to make this happen, we have some tips to help you face the task head-on.

Tip #1: Break down the most important deliverables in the company

Being the only data scientist at a company is tricky. You may be expected to be the expert on everything data- or code-related. A good starting point is to break down the most important deliverables in the company. Deconstruct each deliverable until you can lay out the data sources and processing steps it depends on; that tells you exactly what needs to get done for the company.

Tip #2: Utilize project planning practices

Staying organized is one of the most important aspects of building a successful team, but you don’t have to reinvent the wheel. There are many project planning practices that can help provide structure to your data processes. For example, The Data Science Hierarchy of Needs is a great resource for staying on track and organized during this planning process. 

It would be nice to deliver AI solutions for your company right away, but there are many foundations that need to be set in place before that can realistically happen. The Data Science Hierarchy of Needs, and other project planning tools like it, can help you structure a sound, sustainable path to your company’s data science goals.

Tip #3: Report wins along the way

As the first data scientist, you can realistically expect that your non-technical colleagues will not understand your work and all the effort that goes into it. It will therefore be on you to report wins along the way to deploying your first data model. This keeps your company up to date with your progress and builds trust in your ability to build and deliver.

For example, a reliable data flow will be the cornerstone of the productivity of any data team. It’s a foundational layer of the Data Science Hierarchy of Needs pyramid, and it will empower you to swiftly tackle a variety of problems. The non-technical decision makers at your company will mostly be concerned with the analytical results that you eventually derive from that data flow, but setting it up is no small feat and a huge step on the path to getting those results. Take the time to report that step to your team and help them understand its importance in the larger process.

In doing so, you’ll prove to your team that you can consistently make progress towards your goals.

Tip #4: Utilize data visualization methods

Data visualization is often overlooked, but it will prove to be one of the most important tools in your data science toolkit. Good data visualization is all about practice.

A useful exercise before a meeting with stakeholders is to plot something and ask yourself which questions your audience is likely to raise. Then adjust the plot and ask yourself those questions again to see how well the graph answers them.

This seems simple and straightforward, but it is often a forgotten step. It is an important part of preparation, and your bosses will be impressed when you have solid answers to all their questions.

Think of data visualization as a tool to communicate the value of your work. It makes a huge difference in helping non-technical people understand what you’re trying to convey. Ultimately, it is key for communicating and selling your work outside of your team.

Tip #5: Start your machine learning with a stupid model

When it comes to machine learning, which may not be the priority at the beginning, always start with a naive model. By “naive model” we mean a simple model, just to get something that works end-to-end.

From there, you can work on tuning and improving, which will be a much easier process once you already have something that works.
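As a rough illustration of what "something that works end-to-end" can look like, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data, the `MeanBaseline` class, and the helper names are hypothetical, not part of any particular library or of the workflow described in the webinar. The naive model simply predicts the average of the training targets, and the score it produces becomes the number that any later, more sophisticated model has to beat.

```python
from statistics import mean


def train_test_split(rows, test_fraction=0.25):
    """Hold out the last rows for testing (no shuffling, so the result is deterministic)."""
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]


class MeanBaseline:
    """Naive model: always predicts the mean target seen during training."""

    def fit(self, targets):
        self.prediction_ = mean(targets)
        return self

    def predict(self, n):
        # The features are ignored entirely; that is what makes the model naive.
        return [self.prediction_] * n


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical toy data: (feature, target) pairs.
rows = [(1, 10.0), (2, 12.0), (3, 14.0), (4, 16.0),
        (5, 18.0), (6, 20.0), (7, 22.0), (8, 24.0)]

# End-to-end: split, fit, predict, evaluate.
train, test = train_test_split(rows)
baseline = MeanBaseline().fit([y for _, y in train])
predictions = baseline.predict(len(test))
baseline_mae = mean_absolute_error([y for _, y in test], predictions)

print(f"Baseline MAE: {baseline_mae:.2f}")
```

Once this pipeline runs, swapping `MeanBaseline` for a real model is a small change, and the baseline error gives you a concrete reference point for reporting improvement.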

You will find that most solutions can be deployed once 80-90% of the problem has been solved. Whether to spend time and resources chasing that last 10% is less a data science problem than a management decision.

Tip #6: Manage expectations like a magician

Many people think data science is like magic and you are the magician. It’s important for you to manage these high expectations so that you can deliver on time, and avoid drowning in your workload and dragging out deadlines.

You can manage expectations by planning ahead, staying on task, and always having the end goal in mind. Doing this and staying organized will make sure that your higher-ups are always impressed with the work you’re doing.  

Starting a data science department is a big task, but as large as it is, it is also rewarding and fulfilling. 

In a recent webinar, NYCDSA Bootcamp alumnus Raul Vallejo expands on how he built a data science department at a small company. He shares advice from first-hand experience and answers audience questions.

About Authors

Raul Vallejo

Actuary, statistician and certified Data Scientist. Experienced in building risk models and integrating them into a company-wide modelling strategy. Leader of new multi-department initiatives to create data-driven culture. Raúl Vallejo completed his BA in Actuarial Science at Instituto...

