From Materials Science to Machine Learning Engineer at Capital One

Posted on Oct 4, 2017

David Steinmetz went from a Materials Science PhD to a Machine Learning Engineer at Capital One. A self-described strategic-minded analyst, David dabbled in a few online data science courses before deciding to commit to a full bootcamp.

We sat down with David to learn why he chose to attend NYC Data Science Academy, and how he's able to utilize that experience in his current job.

Tell us about your background. Did it naturally lead you to data science? If not, what gave you the push to enter the field?

I studied materials science, which is a blend of math, physics, and chemistry. I noticed that programming was integral to solving many of the materials science problems, which also tended to be steeped in sophisticated math. I used a genetic algorithm during my undergrad and a particle swarm algorithm during my PhD. After I graduated, I joined a management consultancy and learned about business and data analysis at companies. That background coupled with a genuine interest in computers naturally led to data science.

What were things that made you consider a bootcamp? Were there other things you were considering at the time?

I took online data science courses to get my feet wet. Upon realizing that I really enjoyed the work, I looked at how to learn as much material as possible as quickly as possible. A friend mentioned the bootcamps, and I quickly decided it was the right move for me. I had been considering jobs in materials science, but data science drew me in.

What skills were most useful in helping you land your position at Capital One?

The skills that were most useful were practical experience with a number of machine learning algorithms, project work shown on GitHub, and a working knowledge of data structures and algorithms. The question an interviewer is really asking is “can this person do the job?” The more project work you have on your public profile, the less of a risk it is to hire you, because the hiring manager can already see your abilities.

What are the tools you find most relevant to your position? Which skills did you think would be most important?

Python, AWS, GitHub, Scala, and Spark are the tools most relevant to my current position and project. I use Pandas and Spark Datasets often, and GitHub always. I thought R would be used more, but it’s not, because it’s harder to use R in production. I also thought I would rely more on the standard machine learning libraries, but we don’t hesitate to implement an algorithm that doesn’t exist in scikit-learn or MLlib if it suits our purposes.

Can you describe your day-to-day job as a Data Scientist?

I often spend time reading original research papers and books in an attempt to find state-of-the-art approaches to the problem I am trying to solve. The rest of the time is spent coding, visualizing data, bouncing ideas off colleagues, and creating new products to meet our clients’ needs. I use cloud services and open source software extensively, which allows me to iterate quickly and try new approaches.

What do you find most enjoyable about your job?

It’s varied, mentally challenging, and at the cutting edge of implemented machine learning. The people I work with are amazingly fascinating, and it’s motivating and an honor to be able to work with them.

What are skills your team looks for in a Data Scientist?

We look for someone who is curious, passionate, and well-rounded in the sense that they have experience both with data engineering and distributed systems as well as data science and machine learning. Since we work so much in the cloud, knowledge of cloud services is a plus. A lot of work is done in Scala and Java, so knowledge of one of those two also helps.

What advice do you have for people looking to enter the field?

There is so much to learn in the field, so pick one thing and learn it well before moving on to another. Learning many things superficially will backfire once you get into the interview or onto the job. There, deep understanding and the capacity for further learning are necessary. A bootcamp is a great way to both gain that deep understanding and cover the breadth of material necessary to get you started in the field. Whatever you do, get advice on what to learn; otherwise what you are learning might not be best suited to your situation.

This interview was originally posted on

Leave a Comment

Shruti Agrawal October 7, 2017
Thank you, David, for the insightful interview about beginning a career in data science. In the last answer it is mentioned that it is important to know what to learn. Can you please tell me how we can know what to learn? I am struggling with precisely that. Does it depend on the job that one is targeting? Then it could be hard, because one does not know which job at which company one is going to get. Also, it could be too late: by the time one has learned the material, the vacancy might already have been filled. So, how can we work out what to focus on amongst the plethora of knowledge in data science?
