Breaking into Data Science in 2019
Lukas Frei works as a Data Science Consultant at PwC Germany, helping companies implement data science solutions in tax, legal, and finance. He has a Bachelor’s in Business Administration from the University of Mannheim, a well-known business school in Germany, and he is a graduate of NYC Data Science Academy. His passion for data science began when he was studying abroad in Shanghai, China. As part of NYC Data Science Academy's 15th cohort, he worked on four major data science projects, including a data science consulting engagement in New York City.
Here’s a window into Lukas Frei’s journey from business to data science.
I remember thinking about breaking into data science as if it were yesterday. I had just started my semester abroad in Shanghai and attended several talks and guest lectures about data science and machine learning. However, I had never coded before that point (except for some basic SQL) and I did not really know where to start. Initial web searches resulted in more confusion than insights because there were so many different recommendations for paths to take to become a data scientist. Some sources even suggested that you should not attempt to become a data scientist unless you already have an analytical background.
This article takes a different approach. I am not going to provide a one-size-fits-all path into data science. Instead, I am going to elaborate on my experience of trying to transition to a career in data science. I hope this will help encourage aspiring data scientists, regardless of their backgrounds.
Part 1: Learn How to Code
My very first step was learning how to code. To get a deeper understanding of the data science industry, I decided to enroll in NYC Data Science Academy’s 12-week Data Science Bootcamp. The program includes an online pre-work course with learning materials, coding challenges, and assessments in Python, R, and SQL, among other languages. The pre-work course has to be completed before starting the bootcamp, and it gave me a solid foundation that enabled me to succeed in the bootcamp.
Code, Code, Code
Programming is a skill that is acquired by consistent, repeated practice. There are a lot of great books that can help you get started. The ones I used were Learn Python The Hard Way and R for Data Science. Since I had already used SQL, I only had to review the main commands. If you would like a more comprehensive guide for SQL, I would recommend Mode Analytics’ SQL course.
While learning to code, you will run into many challenges. Keep going. Pushing through and examining your mistakes will be very valuable later on. When practicing Python in Jupyter Notebooks, I documented all of my mistakes so that I would be able to review them. This resulted in a personalized library of code snippets and interesting discoveries that I utilize to this day.
Python or R?
There are many factors to consider when deciding which programming language to start with. If you want easy access to a wide variety of tools for statistical analysis, then R is probably the way to go. If you want to learn a more widely applicable language, Python should probably be your first choice. In my experience, picking a language and just starting to code is what matters most. The best way to find out which language you prefer is to play around with code instead of relying on third-party recommendations.
If you want to be extremely versatile, I would recommend learning both Python and R. Luckily, NYC Data Science Academy teaches its entire curriculum in both languages. I personally prefer using Python for machine learning, but I appreciate how easy it is to do data analysis in R using the tidyverse, a collection of R packages designed for data science.
Part 2: Brush Up On Statistics
As a business major, I had taken an elementary statistics course in college, as well as some economics and finance courses. In my opinion, especially in 2019, knowing only how to use machine learning packages such as scikit-learn is enough neither to practice data science effectively nor to land a job in the field.
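To make this concrete, here is a minimal sketch in plain Python of the kind of calculation a library performs when it fits a simple linear regression. The function name fit_line and the toy data are my own illustrative assumptions and do not come from any course material; the formulas are the standard least-squares estimates for a single predictor.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Hypothetical helper for illustration; assumes xs and ys have equal length
    and that xs contains at least two distinct values.
    """
    x_mean = fmean(xs)
    y_mean = fmean(ys)
    # Sum of cross-deviations (proportional to the covariance of x and y).
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Sum of squared deviations of x (proportional to the variance of x).
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return slope, intercept


if __name__ == "__main__":
    # Toy data that lies exactly on the line y = 2x + 1.
    slope, intercept = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
    print(slope, intercept)
```

Working through the slope as covariance divided by variance, rather than only calling a packaged model, is the sort of understanding that statistics practice builds.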
Document Your Progress
To organize everything I would need to know, I started creating Word documents with summaries for each topic. So-called “cheat sheets” are readily available online; however, I usually find them lacking in depth. As I emphasized earlier, there is no one-size-fits-all solution, and you may prefer to build a customized data science look-up library. I took notes during the lectures at NYCDSA’s bootcamp and refined and reviewed them at night. While this took a lot of effort, it facilitated my understanding of more complex algorithms as the lectures progressed.
Master the Fundamentals
As a final note on this topic: do not, under any circumstances, skip the basics. While trying to jump to fancy algorithms might seem tempting at first, spending the majority of your time understanding the fundamentals is a better choice. In addition to the lectures, I read several books about statistics and statistical learning. In my opinion, the best book on statistical learning is “An Introduction to Statistical Learning: With Applications in R” by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. Different books take different approaches. Thus, combining books that focus on a verbal explanation of algorithms with books that dive into the technical details proved to be a good investment of my time.
Ask Many Questions
If you choose to attend a bootcamp as I did, make use of your instructors. Ask as many questions as you can. Do not wait until you run into serious problems to start asking questions. NYC Data Science Academy has great instructors and they are available for you at all times. The Slack channel, TA support, and peer support helped me a lot. Even showing more experienced people your code and asking for ways to improve its efficiency can prove to be extremely useful.
If you are not able to get professional help, do not despair. There are plenty of online communities and resources that will help you answer your questions. Chances are, you are not the first person who has encountered your problem.
Other Things to Brush Up On
Depending on your background, it might be a good idea to review basic linear algebra and calculus. I would recommend either going through your old linear algebra and calculus notes or taking an online course with NYC Data Science Academy. This is particularly important if you are interested in reading academic papers and technical literature.
Part 3: Build a Project Portfolio
This third step is of utmost importance if you want to land a job in data science. Apply what you have learned by trying to complete at least four major projects. This will help convince potential employers that they should hire you.
If you attend the NYC Data Science Academy bootcamp, you will complete three projects and a capstone project. These projects will cover everything from data acquisition to data visualization to machine learning. The capstone project enables you to choose a topic of interest to you. You should use this opportunity to position yourself in the job market and target your dream employers. For instance, if your goal is to apply data science to healthcare, try to find a project that tackles an important issue in the medical field.
Do Not Stop There
If you really want to break into a specific industry, you should not stop at four projects. Search for data that might be relevant to your dream employer and experiment with it. Build something interesting and write an article or blog post about your project. The more you showcase your abilities and interest in a specific field, the more likely it is that people in that industry are going to be impressed by you.
Do Not Try to Be Too Fancy
When choosing projects, it is tempting to go for things that sound fancy. Do not do that, at least not right away. Make sure your projects are solid from start to finish and contain as few errors as possible. Have someone check your projects and review them for you. During my time at bootcamp, I presented all of my projects to my fellow classmates, as well as my instructors. Getting different opinions on your work will help you improve for future projects.
Part 4: Trying to Find a Job
If you want to succeed in the data science hiring process, prepare yourself as much as possible. Check out coding challenges on HackerRank, familiarize yourself with the types of questions being asked, and, perhaps most importantly, document your interview process. Create a document in which you describe and evaluate your experiences from interviews. Then, before each interview, review that document along with your machine learning theory document(s) to help avoid repeating mistakes. Also, take advantage of the mock interviews, coding challenges, and 1-on-1 resume review sessions that NYC Data Science Academy offers.
Learn How to Pitch Yourself
If you want a job in data science, you are going to have to compete with many other applicants. Set yourself apart by creating a personalized narrative. Why are you the perfect fit? Why did you choose your specific projects? Why did you choose data science in the first place? Since you are going to have to introduce yourself in almost every interview, make sure that you craft a strong elevator pitch that can be adapted depending on what company you are targeting.
While you are at it, prepare pitches for your projects, too. Not every potential employer will want to hear you describe all of your projects. Maybe one specific project caught the attention of the people who are going to interview you. Make sure that you are able to describe each project in depth, but also have a backup pitch in case you only have time to provide a short description.
Practice these elevator pitches in front of other people. Thinking of what you might say at home is not comparable to explaining your projects to people you do not know. If you go to school or attend a bootcamp, practice your pitches with your classmates and give each other constructive feedback. By doing so, you might be able to avoid several mistakes during your first interview.
Graduates of NYC Data Science Academy’s bootcamp are encouraged to attend a hiring partner event to interact with potential future employers. Before attending these events, it is absolutely crucial to have learned how to pitch yourself and your projects. Be aggressive when attending such events. Research the hiring managers and recruiters that are going to attend. During the event, try to figure out whether there might be a fit between you and the company as quickly as possible. Hand over your resume and ask for business cards. Another very important piece of advice: do not talk to only one or two people. Even if there is potential for a great fit, do not limit yourself in the number of potential job offers. Thank recruiters after getting to know them and exchanging contact information.
Networking is a skill that takes practice. Luckily, NYCDSA’s bootcamp provides students with extensive advice and tips on how to navigate networking events. Make sure to know the rules of conduct in networking (e.g. writing a good follow-up email to every hiring manager that was in attendance).
However, as with the projects, do not stop there. Network with the people around you. Data science is a fascinating field with many fascinating people. Connect with your classmates if you are in school or a bootcamp. Find interesting people to follow on LinkedIn. Attend data science meetups in your city. There are many opportunities for networking and the more you do it, the better you will get at it.
Finding a Job Can Be Hard: Don’t Give Up!
You might go to many interviews just to have people tell you they will not be able to hire you. Unless you are lucky and find a job right away, getting hired might turn out to be a very frustrating process. Do not despair. If you keep persevering and improving yourself and your resume, someone will notice eventually. Keep believing that you will get the job offer you want. Talk to people that have gone through the data science hiring process and you will see that many of them have had extremely frustrating interviews.
In Conclusion, Remember:
Becoming a data scientist in 2019 is not easy. There will be many hurdles to clear and many challenges to overcome. Nevertheless, the rewards of making it through the process are huge. Not only will you get to do what you love, but you will also get to engage with very bright people from all kinds of different backgrounds. Just take the first step and the rest will follow. Start pursuing your passion. Making sure that you steadily improve day by day will ultimately get you where you want to go.