Data Interview with a Weekend Student: Yiqun Wang

Posted on Apr 15, 2015



reported by Karl Smith

So tell us a little about your background.

I did my undergrad in statistics back in China, but I was one of the worst students there. Then I came here to New York for a business degree in graduate school.

I had been working at a real estate technology firm for three to four years before joining the weekend class.

When you were first considering taking the class, what did you hope to get out of it?

Hmm. I wanted to learn R. It’s powerful, cool and open source. Python is cool as well; actually, you can learn either one of the two. At the same time, I wanted someone to walk me through machine learning and data science.

My expectation was that I would learn the language and do some real coding work: build a model and apply it in real life.

So you took the beginner and the intermediate classes in R?

Yeah, five weekend days for each level. Five Saturdays for the intro class and five Sundays for the intermediate class. I took them at the same time, which is unusual. The first few intermediate classes were really hard because I was pretty new to R. The intermediate class assumed that you had already finished the intro class. It was a little bit hard but I made it.

So what about Vivian’s courses attracted you? Did you look at any other competitors?

Yes I did. I am aware of one called Metis, but they don’t offer part time classes. I do remember General Assembly just started to offer a data science class as well. I’m actually also taking a Ruby class through General Assembly.

I think NYC Data Science Academy stands out because it has a different approach. They try to overwhelm you a little bit, and I love that approach. If I choose to attend a boot camp or a part time class, I don’t want to be the smartest person in the room. I want to be in a class where I am the dumbest student in the room.

My Ruby class at General Assembly, on the other hand, is different. The teachers are great, they explain things very well, but it’s so slow. I'm taking the Ruby class and the R class at the same time, so I can really tell the difference in pace. I do appreciate both approaches but for data science, I would rather be overwhelmed.

For potential applicants, that’s something you need to keep in mind: you have to study very hard yourself and you will walk away with a lot of new skills and knowledge.

So what was it like, taking the class like that? Challenging, fast paced?

Yeah. I joke about this all the time. Vivian basically taught some of my full-year undergrad classes in an hour. You know, giving you the knowledge and the tools to make the thing work, without too much of the theory side. I wish my undergrad professors had also taught more applications. If they had, I would have been a much better student.

You need to learn how to write a script, how to leverage the best packages, instead of learning all of the nitty-gritty theories behind it. Well, that’s kind of my biggest takeaway from the part-time classes.

Can you describe your final project for us? Give us the elevator pitch.

Okay. My final project is to leverage all the statistical and machine learning algorithms we learned in the class and apply them to a public data set. I picked Jersey City’s condo prices, which is a topic that interests me. My goal is to create a model that will be able to predict the price at any point in time. And it’s a very fun project. Some of the same techniques can be applied to my work, so I’m very motivated to complete it.

How has it affected your career after taking this class? What’s the result of it?

Before the class, I was more like scratching the surface of data science, with some memory of the theory side of statistics but without knowing how to leverage a tool (such as R) to put together a sophisticated model. I think I am very into data science right now. After taking the two weekend classes, the introduction and the intermediate machine learning classes in R, I can comfortably say that I understand what is going on. I can write some simple models and some scripts. If someday our company is really committed to doing more machine learning, I can manage a team that will do exactly that. I think that helped my career tremendously.

I believe most of the applicants for the boot camp are either looking for a transition or are students. I speak for those who already work, not planning to “jump ship”, who want to learn stuff from the class and take it away and apply it to their own corporate career. And I think, if you are a technical manager, you know that your team is doing some of the activities such as machine learning. So, you know, sparing several weekends doing that is definitely very helpful.

The class is very hands-on and gives you a lot of insights. It makes you think about the company’s future and your research and development pipelines; a lot of new thoughts came out of the class. The class is very technical, for sure, but as a business manager you keep thinking about how these techniques can be applied to your own company’s business model and what else you can offer, given your data and other expertise.

You came in with only some exposure to statistical knowledge and programming, but you hadn’t really had that much in-depth knowledge about it. Did you feel like you were able to learn and keep up with the pace with your knowledge?

As I said, I think that Vivian will give you some theory stuff to scare you off. For the theory part I don’t really get it and I pretty much give up. I know that I’m not someone who is going to excel at statistical modeling or theories or any of those things. I just want to know how to apply it. I’m good at just making something quick and dirty and making it work.

So if you ask me whether I picked up a lot of theory, no. But it fulfills my needs. My needs are to walk away with ideas, hopefully about how the process should work, and to be able to quickly put together some code as a proof of concept, which is important.

In the corporate world, you can’t invest all your time and energy in one project and have it end up not working well. You want to build a prototype first. Bet small, make mistakes and move on to the next project. And my level of R skills after finishing those two classes is enough for that: grabbing some data, putting together a model, testing it, looking at some of the diagnostic statistics, and looking at the mean squared errors of various machine learning models. Totally fulfills my needs.
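The quick prototype loop described above (grab some data, fit a model, compare mean squared errors) might look roughly like the sketch below. It is only an illustration and not the interviewee's project: it is written in Python rather than the R used in the class, the data is synthetic, and the names `sqft`, `price`, `fit_line` and `mse` are hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


# Synthetic stand-in for a public data set (assumption: price depends
# linearly on square footage, plus noise). Not real condo data.
random.seed(0)
sqft = [random.uniform(500, 2000) for _ in range(200)]
price = [150_000 + 400 * s + random.gauss(0, 30_000) for s in sqft]

# Hold out the last 50 rows so the models are compared on unseen data.
train_x, test_x = sqft[:150], sqft[150:]
train_y, test_y = price[:150], price[150:]

# Model 1: always predict the average training price (a baseline).
baseline = mean(train_y)
mse_baseline = mse(test_y, [baseline] * len(test_y))

# Model 2: a simple linear regression on square footage.
slope, intercept = fit_line(train_x, train_y)
mse_linear = mse(test_y, [slope * x + intercept for x in test_x])

print(f"baseline MSE: {mse_baseline:,.0f}")
print(f"linear MSE:   {mse_linear:,.0f}")
```

Comparing every candidate model against a trivial baseline on held-out data is a cheap way to tell whether a prototype is worth pursuing before investing more time in it.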

What was something you were worried about when you were first researching data science courses and considering taking one? Were you worried about not having the time? Not having the money?

Yeah, both. Time commitment is very big. My wife is actually pregnant, so I know it is going to be worse when the baby comes out. I think it was a good time for me to stock up more skills before the baby comes out. Otherwise I won’t have the time to do all those weekend classes.

What worried me before the class was that I thought it would be too difficult for me to understand. It turned out that the intro class was pretty easy for me. I could grasp maybe 90% of the content of every class. And the homework I could easily finish by the end of the class day, actually.

The intermediate class is much more challenging. As I said during the final presentation [of his final project], I pretty much skipped all of the theory parts, so I only grasped how to learn the models and how to do some of the diagnostics, and that’s pretty much it. I don’t really understand the underlying theories Vivian taught us. But I still think it fulfills my needs.

I think at the end of the day, you still get a large amount of what you want, depending on how much commitment you are willing to put into it. I didn’t have that much time after class, so that is what I could take away.

What would you say to someone considering taking a class like this?

For those who are managers, business analysts, or anyone working in a somewhat techy industry, I highly recommend that they keep taking classes and make sure they are up to date on things. There is a lot going on. If your industry is a little bit old, like commercial real estate, where I am, there will be a lot of disruptors coming in very soon.

So, make sure that you take action. Take a class if you are young, when you can still learn coding. If you are too old and you can no longer do coding, at least watch some YouTube videos about what machine learning is, what data science is. It’s going to disrupt every industry. Sometimes it’s scary to think about. If you don’t embrace it, you will be punched very hard a few years from now. I do mean that.

Could I give a more specific example?


I was recently made aware of a company called Kira Systems. They use machine learning algorithms to solve a challenging pain point for real estate attorneys. So think about it: every day, a big office tower, shopping mall, or multi-family building changes hands.

The potential buyers who are bidding on the building must understand the building very well, inside and out. Part of the due diligence is to read all the existing leases in the building, one by one. Residential leases are pretty standardized, they’re fine. Commercial leases are much more complex. Every lease can be a hundred pages long. As an attorney you will have to find the lease start date, lease expiration date, rent escalation, tenant improvement clauses, tons of different clauses. The traditional way of doing it is to hire a lot of JD students or paralegals to go over the contracts one by one. Good luck! Boring and costly.

So this company was founded by lawyers and machine learning scientists joining forces to use some of the public databases of leases to extract structured information out of these unstructured contracts. I think they will be a very successful company.

So imagine you are a law firm. If your cost structure is based on running things manually, this company is going to disrupt your business model.

It’s just an example, I just learned about that company last week. There are hundreds of such companies disrupting every industry. What I just said, understanding commercial leases in real estate, is a very very small niche. But that company tackles it. And that company is also trying to vertically expand to other industries.

If you are working in an older industry, applying data science may sound a little bit futuristic, but it may not be as far off as you think!
