Roger Ren: Research Scientist (Industry Expert Series)
The skills I demoed here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Career Accelerator Series: Roger Ren - Research Scientist at Amazon: Alexa
An alumnus of NYC Data Science Academy offers career advice for entry-level data science professionals seeking to make a big impact in this industry.
For more events like this: https://www.eventbrite.com/o/nyc-data-science-academy-7825751464
================================
Topic: How to keep a clear mind while job hunting during the pandemic https://www.eventbrite.com/e/amazon-alexa-roger-ren-research-scientist-tickets-101755419198?aff=efbneb#
Video Recording: link
Roger attended NYC Data Science Academy in 2018 and currently works as a research scientist at Amazon Alexa. His work focuses on automatic speech recognition and natural language processing.
Roger actively participates in recruiting and interviewing at Amazon and is a long-time alumnus of NYC Data Science Academy who mentors students and helps them prepare for interviews.
================================
Transcript/Summary:
Thank you, NYC Data Science Academy, for inviting me to give a talk. I graduated about 2 years ago, then worked as a TA, and have stayed connected with the bootcamp to assist students with mock interviews. I understand how difficult it is to learn so much in such a short amount of time, and I want to share how best to carry on from there. I'm not going to cover targeted interview preparation, but rather give a high-level overview for recent graduates.
About me:
A little about me: I am a Research Scientist at Amazon. My current work focuses on automatic speech recognition and natural language processing. I am involved in the interviewing and recruitment processes at Amazon and, as mentioned before, I help with career and interview coaching, primarily with NYC Data Science Academy students as well as at my alma mater, the University of Rochester.
More importantly, like most of you, I am not 100% a computer science expert. I got my bachelor's and master's in Materials Science and then decided to transition to data science. I had a long interview process with many companies and struggled with submitting resumes and never hearing back. It's a very slow grind. I had relatively broad exposure, from start-ups to large, well-established companies. I lost hope several times.
Today, I'm trying to share something I wish I had been told when I started. These are not golden rules, but I hope to ease some of your concerns and let you know what I think is best to focus on.
Machine Learning in Job Hunting:
To keep this focused on data science, I recommend treating the job hunt like a machine learning problem.
- Define your problem
- Understand your goal
- Gather training data
- Build models
- Train and test
- Repeat
For many of us the goal is clear: we want a good job at a good company, to be proud of it, and to do something that matters.
But the goal doesn't exactly equate to the problems we're facing.
In the beginning, I didn't know what the recruiters were looking for. I had to guess, which made for a very unnecessary learning curve.
There are some things you should focus on and some that you should expand to a broad range. I believe it's best to commit to something you are passionate about. If you have no interest in it, you are not going to like the job. Once you identify the field that interests you, it's easier to find a job you'll like. Don't limit yourself to "Data Analyst, Data Engineer". There is a wide range of job titles you qualify for, but you should focus on what you will be doing. The common titles are "Data Scientist, Data Analyst, Data Engineer, Machine Learning Engineer, Machine Learning Scientist, Research Scientist, Applied Scientist, Business Intelligence Engineer, Business Intelligence Analyst", but each of them is tied to a specific function. Your energy will be drained very, very fast if you apply to every search result on LinkedIn or other platforms. It is best to choose what aligns with your long-term interests and career goals and, within that, expand your search across all of these different titles.
If you are already very strong in coding, I would consider Software Engineer, which is a common track, but it really depends on the field, not necessarily the title.
Company Scale Matters:
When I first graduated I was very ambitious. In my mind I didn't want to consider any company that wasn't a big name like Facebook, Google, etc. I went through a really slow grind.
What I didn't know was that the recruiting workflow is very different for a small company versus a large company. Small companies are usually straightforward: they typically give you a 30-minute phone call and then potentially 2 on-site visits. It depends more on what your schedule allows, and if all goes well, you will likely finish the entire process within 3 or 4 weeks.
In a mid-sized company, by which I mean any company that has not gone public yet, the process becomes a little more complex. A recruiter will reach out to you (usually a 15-20 minute phone call) to touch base and gauge interest. Then one of their HR/company representatives will contact you again (a 30-minute phone call). If they like you, they'll move you to an online assessment (coding challenge) and review the results. If you pass, they will invite you for an on-site visit, discuss the results, and get back to you. As you can see, this process likely takes more than a month, probably 4-6 weeks.
For a large company, it's even more intensive. Similarly, a recruiter will contact you, then HR, and then comes the online assessment, which is usually much longer and harder and can take place over a few weeks. Sometimes they may even give you two phone screens. If they give you one, that's usually a good sign that you are a strong candidate. A second phone screen usually means there may have been concerns in the first call. You will usually hear back within 2-5 days. The better you perform, the faster the response. Finally, if all goes well, you may move to the on-site stage. Here you are put through an exhausting full day of five to six rounds of interviews, and then you go home and wait. Your case then goes to a hiring committee, and once you pass the hiring committee you'll have to do the team match. One team will have to pick your profile, and that could be a three to four month process.
You see what I'm saying here: the process and overall timeline take much longer. Referring back to what we said before, treat this like a machine learning problem. Nobody gets the best hyperparameters in the first iteration. If you actually do get it on the first iteration, it means one thing: you put your test data into your training data. So if you focus only on big names and do not have a healthy portfolio of companies, as I used to naively do, you are limiting yourself to very few iterations, and that is a very long battle. After trying so many times, I later took a more deliberate approach and intentionally built an interview portfolio. I mixed companies of different sizes so that I could get more practice and more exposure to thinking about what to say and how to sell myself in a pressured environment.
Work-flow and Timing:
One hard lesson I learned was to be ambitious but keep a healthy portfolio in your job search. What's interesting is that, for example, let's say you are applying to your big-name dream company. There is a chance you'll submit a resume, directly get a phone interview and an on-site, and easily reach the final stages, but would you really want to go there unprepared? You don't want the first shot you take to be at the one thing you really want. Learn from other interviews to strengthen your chances at the jobs you take very seriously.
I should point out that R stands for recruiter and HC stands for hiring committee. And again, the online assessment can take the form of an online coding challenge or a data set challenge. I will clarify the terms used in the slides again later on, because we will come back to how exactly the interview procedure goes.
Another thing you'll need to pay attention to, besides the scale of the company, is the timing of the hiring process. Basically there is a high season and a low season, especially for big companies. The high seasons are basically the spring and the fall, usually February to May and July to October. I used to think this was a myth as a job seeker, but now I'm on the other side of the table, where I can see the release of headcount for job openings, and this is absolutely true.
For instance, come May you'll see a whole bunch of resumes, then everything slides down in the summertime, and then from July to October there's another huge burst of openings. So you should try to ride that wave and be careful to schedule your self-improvement/study time as well as your application time. Let's say, for example, it is now April; that would be a good time to apply. But by the end of May you should expect to see the postings gradually go down. Don't feel discouraged; this is just how it works.
The reason behind this is actually very straightforward. At the beginning of the year, around January, everybody at the company has to sit down to discuss what we're going to do for the whole year. We need to see what our goals are, what we ultimately need to achieve, what we'll need to prioritize, where funding should go, etc. That's generally a really busy month for companies, and full-scale recruiting during that time is unlikely. Starting from February, the new grads who graduate in the winter semester start to look for jobs. By May, hiring for summer internships is done and the positions are filled, and most people go on vacation during the summer, which is why summer processes can sometimes be a bit slow. And again, for the fall cycle, the spring graduates will want to apply before the middle of October, because during the holiday season everything dies down. Not only is the company beginning its end-of-year assessment, but the entire company is being re-evaluated, so it is difficult to move forward with recruiting.
I know that a lot of students at NYC Data Science Academy are not new grads. I was a new grad, so I gravitate towards this more since it's part of my story. But I know that there is a lot of experienced talent at NYC Data Science Academy, people who already have a ton of experience in their field. So really, if they are the very top talent in the field, there is no limit, and they could apply as if recruiting happens 365 days a year. However, the bar is set even higher for a company to look at you outside of its usual hiring cycle.
Key Idea: Understand the process + Voting Mechanisms
To address how COVID-19 has been impacting the recruitment cycles: I think that is a very good and very difficult question. I think the impact, at least for my company, is relatively limited, at least for now, but maybe Vivian can jump into it more if she has broader visibility across a lot of different jobs in different industries. I only really know what's happening here, so we can come back to the impact of the pandemic later.
So far we've talked about the workflow and the timing. Again, my main takeaway message is to identify your problem. We know our goal, but to know your problem really means understanding the company's recruiting procedures and taking into account the low and high recruiting seasons. This way you know what you should prioritize and can act accordingly.
Next, I would like to be really candid with you guys about something I learned when I started giving interviews. While I was being trained to give interviews, we had to think: "What exactly are we looking at?"
I think this is fairly universal across the top-tier tech and finance companies, because their recruiting procedures are very, very similar and people are trained under the same methodology. In short, the conclusion is that it's quite okay for companies to make false negative decisions, but they are absolutely not willing to make any false positive decisions. This means you may be very well qualified for the position and still get rejected, because every single day they receive thousands of resumes and you are simply replaceable. Everybody, anybody, is replaceable. But if they make the wrong hiring decision, it is a disaster. Why? Because hiring the wrong person is going to cost the company a lot. Reviewing an assessment alone costs at least four to five paid hours of people's time, and the phone interview and behavioral interviews can take another three or four hours.
They are paying a very highly trained scientist or engineer for an hour of their time to spend with an applicant. And every single time we have to write down the interview summary and create an assessment report, which takes a lot of time. Not to mention that when I give a phone interview I will generally use 30 minutes to prepare and read over the resume. I will prepare questions and use a full hour for the phone interview. I will spend at least two to three hours writing, so overall I'm spending close to half a day's work, or sometimes even a whole day's work, to conduct one of these phone interviews.
For the on-sites it is even more expensive, because they may have to fly an applicant in, maybe get you a nice hotel, and arrange for five to six people like me to spend at least one hour with you and another hour or two on their assessment reports. Then they'll have to gather the hiring committee and have their even more experienced staff raise the bar and discuss each of these summary reports. As you can see, it is a very expensive process. It can usually cost 30-45k to make a hiring decision. And if you think hiring is already expensive, laying people off is even more expensive, because the company has paid all of those recruiting costs and then gets no return on its investment. In machine learning terms, they are not optimizing for recall; they only care about precision, because a false positive is the one mistake they absolutely want to avoid.
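To make the precision/recall framing concrete, here is a minimal sketch. The counts are made-up illustrative numbers, not figures from any company, and `precision_recall` is a hypothetical helper written for this example. It compares a strict hiring bar (no bad hires, many good candidates rejected) with a lenient one.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts.

    tp: qualified candidates who were hired
    fp: unqualified candidates who were hired (the costly mistake)
    fn: qualified candidates who were rejected
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical pool with 100 qualified applicants.
# Strict bar: only 10 of them get offers, but nobody unqualified does.
strict = precision_recall(tp=10, fp=0, fn=90)

# Lenient bar: 60 qualified hires, but 20 unqualified hires slip through.
lenient = precision_recall(tp=60, fp=20, fn=40)

print("strict  (precision, recall):", strict)
print("lenient (precision, recall):", lenient)
```

Under these assumed numbers the strict bar has perfect precision and very low recall, which matches the point above: rejecting a good candidate costs the company little, while a wrong hire is expensive.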
As I said in the beginning, I was actually pretty good, but I got rejected, and it took a big hit on my confidence. It was a really hard reality check, but I realized it wasn't because I was not proficient enough in machine learning. It was the very fact that I didn't understand the game they were playing. That is why I have this third point about understanding the interviewing process. I don't like that answer, but I must admit it is absolutely true. What you need to do is prepare for the interview, and what you really need to do is practice.
I practiced a lot of LeetCode questions, and I think less than 5% of that shows up in my job. I've been working close to two years now, and all of those fancy algorithms, I have probably used them once or twice at most. That's it. In production code, maintainability and repeatability, making sure everything is designed well, are far more important than coming up with a super clever method. Whatever super clever method you can think of, someone who is smarter than you or who joined the company earlier than you has probably already implemented it, and that is a fact. People tend to hire people similar to themselves, and you just have to go along and be able to reproduce what they have already done. Unfortunately it is a game, especially in big tech and finance companies. You just have to learn it. Put another way, if you focus on learning all of the hard problems, you may see a big gap between your interview and your daily work.
The final thing is the voting mechanism. After the on-site you have five to six interviewers, and it is very rare to get a unanimous yes. There will always be someone who is not very happy with your performance. But that is okay, as long as it's only one "no". Again, don't feel discouraged. If you butt heads with one of your interviewers, that doesn't mean you should give up on the entire process. You don't necessarily need to satisfy every interviewer. If two people say "no", though, that's typically a case-closed kind of situation.
I want to keep this specific talk focused on a higher-level overview and hopefully touch on more technical things in another talk later on.
Resumes: Iteration!
Next, I want to touch on your resume. I know NYC Data Science Academy offers great support in polishing your resume to this level, and you can have a strong case in having a job advocate, but I want to reiterate one thing: it's like playing a video game. You go through this stage first and then go on to the next stage. If you are still submitting your resume and not receiving any callbacks or interview requests, this means that your resume hasn't reached the bar yet. The only purpose of a resume is to get you that phone interview; after that no one really looks at it in detail. Let me give you a solid example. When we are given a request to conduct a phone interview for a candidate, we see the candidate's entire portfolio, and the resume is at the very, very bottom of the file. Trust me, no one scrolls down that far to read it. What people will read is the notes that the recruiter and HR have already made on your profile, and after someone like me conducts an interview, the next person will do the same, looking first at my comments, and so on. So you start to see what I'm talking about: if you are still not getting any response, or if your phone interview ratio is low, then spend more time on your resume. If you consistently get an 80% response rate, then your resume is good enough and there's no reason to worry about it; move on to the next stage.
Resume screening nowadays is essentially machine screening, an NLP task. You essentially want to find the most similar documents, so that is where you can use some data science tricks. Going back to where I first started: you have the field you are interested in and you have your companies. Now, again, go crazy. Expand the job titles, grab all the keywords from the postings, and paraphrase those key phrases. You will be surprised to come up with your own bag of words, or model, suited to those kinds of positions, and that will pretty much guarantee you pass the machine screening part. Don't use exactly the same words and don't copy and paste, that's bad, but do use nearby key terms. Keep iterating, and track your metrics as you apply, so you can see clearly what works, what doesn't, and whether your resume passes the bar.
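As a rough illustration of the "most similar documents" idea, here is a minimal bag-of-words sketch using only the Python standard library. It is a toy under stated assumptions: how any real applicant tracking system scores resumes is not described in the talk, and the function names and sample texts below are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def missing_keywords(resume, posting, top_n=5):
    """Most frequent posting terms that never appear in the resume."""
    resume_terms = bag_of_words(resume)
    posting_terms = bag_of_words(posting)
    missing = [t for t, _ in posting_terms.most_common() if t not in resume_terms]
    return missing[:top_n]


# Hypothetical sample texts, for illustration only.
posting = "Build NLP models in Python. Deploy NLP pipelines with Spark."
resume = "Built classification models in Python and SQL."

score = cosine_similarity(bag_of_words(resume), bag_of_words(posting))
print("similarity:", round(score, 3))
print("terms to consider covering:", missing_keywords(resume, posting))
```

A practical version would at least drop stop words and weight rare terms more heavily (TF-IDF, for example). The point here is only that a resume can be compared with a set of postings and the score tracked across iterations.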
More on the pandemic/ Becoming a Hassle-Free Package:
To further address the COVID-19 situation, I recently stumbled upon a very interesting website (candor.co/hiring-freezes/). It is a list of data indicating hiring freezes and layoffs. These are the industries still actively hiring, and you can see that in most cases hiring is still moving forward. I haven't seen any cancellation of headcount, but it seems the entire procedure is delayed. A very important factor here is face-to-face interaction, the lack of which seems to be the reason for the delay in some hiring processes. I've seen some colleagues get offers from companies like Airbnb, and since those companies are getting hit really hard right now, the offers have been rescinded. In the medical field the numbers seem to be growing. I feel that our company is taking more of a hit, but it's relatively manageable at this rate. If this continues until July, since it is uncertain when this will end, there will definitely be some knock-on effects that come back to bite us. On your end, we're still at the beginning of April, and until the beginning of May it is still a good application season. If you see job openings dwindling, there can be two factors: either the low hiring season is starting, or COVID-19 is affecting the balance of supply and demand.
To address one question in the chat: is it important to clean up your GitHub? I don't personally look at your GitHub, but there are some cases where interviewers will want to look at it and at your project portfolio. It is definitely a plus point. It will not hurt you to have a very clean and well-maintained GitHub and a nice-looking portfolio. But remember that people are lazy. I'm already spending so much time on my conversation with you that I'll feel as though I already have enough data points to make a decision about you. Typically, I'm just too lazy to go over your GitHub or personal website to learn more about you. Small companies, on the other hand, pay a lot of attention to this. I was once contacted by a small start-up because of a small project that I had put on my GitHub page. In the beginning, I thought it was just an HR strategy, but you could tell he had really read through it. Small companies can afford to really spend time getting to know you before making any judgments. Unfortunately, that doesn't happen at big companies. You need to be a package that's "hassle-free". For example, when you buy a new iPhone, once you open the box the experience is very pleasant. It is nicely wrapped and clean, good to go, voila. But if you buy more of a strange product, there's glue everywhere and there's foam everywhere. This doesn't mean that the product is not good, but the experience is more of a hassle. Big companies can't afford to spend so much time, so you really have to present yourself as a hassle-free package.
Getting an offer:
I always like to use this formula: P(offer) = P(Luck) * P(A) * P(B) * P(C) * ...
As you can see, there are a number of different probabilities. The number one thing is that you need a certain amount of luck. I don't think anyone can guarantee that they'll get their dream job anytime they want. I categorize the other factors into non-negotiable items and items with some wiggle room. The non-negotiables are that you need to show science breadth and depth, and good coding skills. If you don't show those two areas, that may lead to a very easy "no". There is some wiggle room for other factors, especially for new grads who don't have much working experience or haven't worked on real production code or a production project. That is fine as long as you can demonstrate that you know how to implement simple concepts. There will always be behavioral questions. Again, similar to the hassle-free experience, if you answer really well it's like extra credit. If you really, really mess up it can hurt you: I have seen people get rejected based on their answers to these questions. But if you do well, it's a plus point, extra credit that adds to the areas where you may need some wiggle room.
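The formula can be sketched numerically. This is only an illustration under loud assumptions: the factor names and values below are invented, the factors are treated as independent (which is what multiplying them implies), and the "at least one offer" calculation additionally assumes independent applications, a step the talk does not state.

```python
import math


def offer_probability(factors):
    """Multiply the per-factor probabilities, as in P(offer) = P(Luck) * P(A) * ..."""
    return math.prod(factors.values())


def prob_at_least_one_offer(p_offer, n_applications):
    """Chance of one or more offers across n independent, identical applications."""
    return 1.0 - (1.0 - p_offer) ** n_applications


# Hypothetical values, for illustration only.
factors = {
    "luck": 0.5,
    "science_breadth_depth": 0.6,
    "coding": 0.6,
    "behavioral": 0.8,
}

p = offer_probability(factors)
print("P(offer) for one application:", round(p, 4))
print("P(at least one offer in 10 tries):", round(prob_at_least_one_offer(p, 10), 4))
```

Because the terms multiply, a single factor near zero dominates the result no matter how strong the others are, which is one way to read the "non-negotiable" items above.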
You should deliberately try to show two qualities: that you are likable and that you are coachable.
Likability comes from putting yourself in the interviewer's shoes. Employers want first to evaluate your technical ability to do the work, but the second most important thing is determining "Do I want to work with this person every single day?". I'm not saying you should suddenly become a bubbly person to make everybody like you, but change your mentality so that when you're going through an interview you remind yourself that you need to win the person over. You're not there to show them all the fancy stuff that you know. Instead of proving yourself, prove that you can work with other people. I learned this the hard way. I really tried to focus on displaying my technical capability and neglected this part. The more I interviewed, the more I realized how crucial it is to make yourself relatable.
The second characteristic is to be coachable. I learned this from one of the "bar-raisers" I mentioned earlier. In a post-on-site interview meeting, everybody said okay, but one person mentioned that the candidate seemed to lack the quality of being coachable. Towards the end of that discussion we ended up terminating the candidacy. The reason is really simple, to be honest. In daily work, there is almost always some ambiguity. You'll make an effort to identify the problem, and there are zero cases where you have a clear, solid path to a solution. When dealing with ambiguity you have to be open to trying this and that. We were discussing a particular machine learning project design question. The candidate was very strongly opinionated and spent a long time trying to convince us that he was right, which gave us the feeling that he was refusing to listen or to take part in a discussion of multiple solutions. He may have gotten the problem right, but he failed to show that he was coachable. I often use the analogy that it is similar to dating. It is probably an improper comparison, but it has a bit of truth to it. The more you are wanted or in demand, the more likely you are to be in a relationship. If you've spent time on so many dating apps and you still don't have a girlfriend or a boyfriend, again, sorry for using this example, it's typically harder for you to get one, because maybe you're not looking for the right things and you're exhausting all of your resources trying to prove the same point over and over again. You have to be open and willing to grow. It doesn't matter whether you win the argument; the point of the discussion is to show that you are capable of having a constructive and intelligent discussion with your peers.
It is good to keep these things in mind. Try to present these two characteristics throughout your interviews to increase the likelihood of getting hired. Lastly, if you're an international student like me, you need to have your paperwork right, otherwise you will get shut down.
A few examples:
To answer some questions in the chat, here is an example of a very common question asked to assess science breadth and depth: How do you identify/solve overfitting?
Let's say you answer: regularization. If your answer is very focused on regularization alone, I would get the impression that your knowledge may be limited. But say your answer starts with: okay, that depends on what kind of model you're looking at. If you have this model then do this, if you have that model then do that, and so on. Then I can easily see that you have a broad range of knowledge; that's breadth. However, it is important to note that showing a broad range of knowledge is not sufficient. Let's say you stick to the first answer: regularization, you can use L1 or L2 to reduce overfitting. Then an easy follow-up question would be: okay, what exactly are L1 and L2? Why do L1 and L2 behave differently? And you can go to a deeper level of explanation using the famous picture of the square and the circle. That is okay; if you prepare it, you can go more in depth.
Then you can truly get down to the fundamental reasons. I think fundamentally it is a constrained optimization problem: with the first-order (L1) and second-order (L2) penalties you are essentially assuming two different distributions, and one of them tends to concentrate at zero while the other doesn't. That is the fundamental scientific reason, and with it I can show how I know regularization addresses overfitting and display the depth of my knowledge. If I ask you the overfitting question and you give me several options, I will put a check mark next to breadth, and then I will ask follow-up questions to further test the depth of your knowledge. If you exhaust all of my questions then you'll definitely get a yes on depth, but that doesn't happen very often. Most candidates stop at the second or third layer of follow-ups, and that is already sufficient.
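One way to see why the L1 penalty concentrates weights at zero while L2 does not is the one-dimensional case, where both penalized problems have closed-form solutions. The sketch below is an illustration added here, not something shown in the talk. It assumes a single weight `w`, an unregularized optimum `a`, and a penalty strength `lam`, all hypothetical names.

```python
def l1_shrink(a, lam):
    """Minimizer over w of 0.5 * (w - a)**2 + lam * abs(w).

    This is soft-thresholding: any |a| <= lam is mapped exactly to zero.
    """
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0


def l2_shrink(a, lam):
    """Minimizer over w of 0.5 * (w - a)**2 + 0.5 * lam * w**2.

    The weight is scaled toward zero but never reaches it for a != 0.
    """
    return a / (1.0 + lam)


# Hypothetical unregularized weights, from small to large.
for a in [0.1, 0.3, 1.0, 3.0]:
    print(f"a={a}: L1 -> {l1_shrink(a, 0.5):.3f}, L2 -> {l2_shrink(a, 0.5):.3f}")
```

In the usual probabilistic reading, the L1 penalty corresponds to a Laplace prior on the weights and the L2 penalty to a Gaussian prior, which is one interpretation of the "two different distributions" mentioned above.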
How do I evaluate a smooth interview?
I'll ask you a question and you answer. I follow up. You answer. I follow up. You answer. I follow up. You answer. I'm satisfied, so I move on to the next question and repeat. For example, when you first answer my question, that is at the surface; my follow-up question will dive deeper, and if you can hit some of those points, that's great! I'm not looking for a magical unicorn candidate who knows everything. But if you can check both boxes, knowing a wide range and knowing it relatively deeply, that's what I'm looking for.
A very smart question that I like to prepare: let's say you are using a random forest to do a binary classification. If you swap the x and the y axes, what's going to happen? That's a very interesting question. If you understand how it works fundamentally, you'll be able to answer this very easily. If not, people just start making up answers, and it's very obvious to the interviewer. Again, be prepared and able to explain the fundamental concepts.
Key takeaway: Stop focusing on you. Focus on the people around you in an interview.
Before I move on to my final slide, I want to say a little more about that luck factor. I do believe that some people are luckier than others; that just happens. But I think every single person gets the chance to take a shot. If you prepare every single day, then one day or another luck is going to hit you. But if you don't have consistent output, you give yourself a rather low probability.
Take care of yourself: Keep Calm & Carry On
This brings us to my favorite part. Since most of you have just graduated, you probably haven't felt too much of this pressure yet. In my time, I spent quite a lot of time doing all those different interviews and learning exactly the things I just discussed. At that time, I was working really hard and focusing on how to improve myself: doing more coding questions, reading more books, doing more projects. I forced myself to learn as quickly as possible. Once I got through the tunnel, I realized that's not what a healthy approach should be. Looking for a job is absolutely a full-time job. Think about it: you have to search for jobs, edit resumes, submit applications, and you find yourself with 30-40 loose ends that you now have to keep track of. If there's a phone interview or an on-site, you have to schedule it, prepare, and research each company. Your time, your resources, and your energy are very, very limited.
You need to work smart. Be economical. Know what your top priorities are. Keep track of what is worth a high investment by paying attention to what you're getting in return. Don't fool yourself by saying that you're too busy every single day. If you are not seeing results, those efforts mean very little. Plan accordingly.
Try your best to have a study buddy. Work with friends to practice coding questions with each other. Make it less of a quiz or test and more of an open discussion of what each of you did. You'll naturally establish the habit: instead of trying to prove that you're a strong candidate, you will show it by being able to explain things to others.
The last is definitely the hardest: unlearn quickly. Once you apply to a job, forget about it. Don't go back to the job portal, and don't sit there refreshing your email. Once you press send, it's totally beyond your control. Trust me, you won't miss an email that says: here's a phone interview. Limit your loose ends, minimize, and focus on maintaining a clear mind to make the best use of your time. Either work hard at improving yourself and your skills OR work hard at applying. Do yourself a favor and unlearn what you have already put out there. Focus on adapting and learn how to play to your strengths.
================================
We appreciate your feedback. If you enjoyed this article and informational video series, please fill out this short survey for us to create additional resources and host future sessions with more relevant content for you.