Data Science with R: Machine Learning

This 35-hour Machine Learning with R course introduces both the theoretical foundations of machine learning algorithms and their practical applications in R. It covers data mining, performance measures, dimension reduction, linear and generalized linear regression models, KNN and Naïve Bayes models, tree models, and SVMs, as well as association rules for analysis. After successfully completing this course, you will be able to break down the mathematics behind major machine learning algorithms, explain their principles, and implement these methods to solve real-world problems.

* Tuition paid for part-time courses can be applied to the Data Science Bootcamp if admitted within 9 months.

Date and Time

January Session Early-bird Pricing!

Jan 18 - Feb 15, 2020, 10:00am-5:00pm
Day 1: January 18, 2020
Day 2: January 25, 2020
Day 3: February 1, 2020
Day 4: February 8, 2020
Day 5: February 15, 2020
Price: $2990.00; early-bird price: $2840.50

April Session Early-bird Pricing!

Apr 18 - May 16, 2020, 10:00am-5:00pm
Day 1: April 18, 2020
Day 2: April 25, 2020
Day 3: May 2, 2020
Day 4: May 9, 2020
Day 5: May 16, 2020
Price: $2990.00; early-bird price: $2840.50

Instructors

Vivian Zhang
Vivian is the founder of NYC Data Science Academy and co-founder of SupStat. She is an adjunct professor at Stony Brook University and founded the NYC Open Data Meetup, which is 4,000 strong. She has many years of practical experience in data technologies and analytics, and has expertise in multiple programming languages and platforms, including R, Python, Hadoop, and Spark. Vivian was ranked in "9 Women Leading The Pack In Data Analytics" by Forbes in August 2016. She enjoys meeting people and sharing her experiences with young professionals and students.
Kathy Liu
Kathy holds a PhD in Mathematics from New York University and a master's degree from Georgetown University. She specializes in information theory and probability. Kathy is passionate about teaching, and her mathematics and statistics classes at NYU are so popular that seats fill very quickly. After serving as a Data Science Consultant at a reinsurance company, Kathy realized the power of data analytics and the fun of storytelling, and she began using statistical models and data visualization tools to conduct collaborative research at the Stern School of Business and the Courant Institute of Mathematical Sciences at NYU. When not working, Kathy can be found watching Broadway shows in the Theater District, practicing golf at Chelsea Piers, and hiking in upstate New York.
David Romoff
David Romoff is a risk management consultant with 10 years of experience modeling market and credit risk using the latest methods and technologies. David's recent work includes serving as Manager of Risk Management at On Deck Capital, a business lending company in the FinTech space that uses machine learning models to underwrite loans. David was responsible for estimating and reporting losses on the book of loans. Previously, David worked in Enterprise Risk Management at AIG for five years where he designed and supported models on insurance risk, credit risk, and capital allocation. Before AIG, he worked at Bear Stearns in counterparty credit risk. David has an MBA from the Zicklin School of Business in New York City and a Master of Science in Actuarial Science from Columbia University. His undergraduate degree is from the State University of New York at Albany, where he studied psychology and philosophy.

Product Description


Details

 


Prerequisites

 

  • Knowledge of R programming
  • Able to munge, analyze, and visualize data in R

Certificate

Certificates are awarded at the end of the program upon satisfactory completion of the course.

Students are evaluated on a pass/fail basis for their performance on the required homework and final project (where applicable). Students who complete 80% of the homework and attend a minimum of 85% of all classes are eligible for the certificate of completion.


Syllabus

Unit 1: Foundations of Statistics and Simple Linear Regression

  • Understand your data
  • Statistical inference
  • Introduction to machine learning
  • Simple linear regression
  • Diagnostics and transformations
  • The coefficient of determination

Unit 2: Multiple Linear Regression and Generalized Linear Models

  • Multiple linear regression
  • Assumptions and diagnostics
  • Extending model flexibility
  • Generalized linear models
  • Logistic regression
  • Maximum likelihood estimation
  • Model interpretation
  • Assessing model fit

Unit 3: kNN and Naive Bayes, the Curse of Dimensionality

  • The K-Nearest Neighbors Algorithm
  • The choice of K and distance measure
  • Conditional probability: Bayes’ Theorem
  • The Naive Bayes Algorithm
  • The Laplace estimator
  • Dimension reduction
  • The PCA procedure
  • Ridge and Lasso regression
  • Cross-validation

Unit 4: Tree Models and SVMs

  • Decision trees
  • Bagging
  • Random forests
  • Boosting
  • Variable Importance
  • Hyperplanes and maximal margin classifier
  • Soft margin and support vector classifier
  • Kernels and support vector machines

Unit 5: Cluster Analysis and Neural Networks

  • Cluster analysis
  • K-means clustering
  • Hierarchical clustering
  • Neural networks and perceptrons
  • Sigmoid neurons
  • Network topology and hidden features
  • Backpropagation learning with gradient descent

Final Project

After 35 hours of structured lectures, students are encouraged to work on an exploratory data analysis project based on their own interests. A project presentation demo will be arranged afterwards.


Recommended Readings

 

  • An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
  • Applied Predictive Modeling, by Max Kuhn and Kjell Johnson
  • Data Mining with R, by Luis Torgo
  • Machine Learning with R, by Brett Lantz

Testimonials

Rahul Bhat
Took the weekend course for Machine Learning with R. The course was very helpful in helping me understand the basics of machine learning and the different models. My instructor was Luke. He was very helpful and would spend enough time covering each topic. He even held an additional class because he didn't want to rush through the material. Overall I am quite satisfied with the results. Would recommend Luke to anyone else who is interested in venturing into the machine learning field.
Lukasz
I studied mechanical engineering and physics for my undergrad at a top university and work in product management with a focus on search. I took this class to satisfy a personal interest in the subject matter and familiarize myself enough with the fundamentals of machine learning to be able to explore the field more deeply on my own. I was also motivated by a career interest: the subject matter is highly relevant to my domain, and I feel that developing an understanding of the concepts and how to deploy them myself will make me better at my job long-term. Prior to enrolling in the class, I spent roughly 8-10 hours learning R and felt sufficiently prepared (I had some previous programming experience). In the end I was extremely happy with this class (Machine Learning in R on Saturdays, 8 hrs at a time). The curriculum and content were excellent, the instructor, Luke, was fantastic and the assignments were challenging and informative. I felt the course did a really great job of driving home the core fundamentals of each subject with a focus on statistics, mathematical theory, derivations and best practices. We covered a LOT of material, yet the material had a lot of depth. I thought the sequencing of the subject matter was very well thought out as well. The class was demanding and had the caliber of a graduate-level course. The course also struck a very nice balance between theory and implementation. After learning about a new model, we would immediately implement it in class using R on our own machines. Luke did a particularly great job at relating the implementation back to the concepts and teaching us how to interpret outcomes of our analyses (I can’t stress enough how important this latter point was for me). He has a really strong grasp of the subject matter, he’s very patient and responsive to questions, offers a lot of insightful commentary on the theory, implementations, and best practices, and he cares about his students a lot. 
The homework assignments complement the class nicely as well, helping to drive home the methods taught in class and how to interpret your work. If you’re interested in developing a strong understanding of the fundamentals of machine learning in a rigorous format, this class is for you. I also couldn’t recommend Luke as an instructor more. He’s awesome! I was also very pleased with my choice of the R class. R reduces a lot of the friction in model implementation, which allowed me to focus on developing an understanding of the concepts and interpreting results.
Tingyan Zheng
Big Data Analyst at GroupM
This course covers major R machine learning topics; it is intense, and you will learn a lot if you keep up with the pace. Instructor Shu Yan is great at explaining complicated statistical concepts and formulas and translating them into R coding techniques. Course materials, in-class practice, and homework assignments are helpful for learning and for future reference. I would recommend this course to anyone who is interested in data science or machine learning but doesn't know much about the field. It will be a good start for you if you plan to work in this area. It certainly helped me understand a lot about data science and improved my R coding skills. What I learned from this course is worth the money I paid and the effort I put in.
Mark Li
Quantitative Researcher at Twitter

I took the Machine Learning with R and Hadoop data engineering classes in 2015. They are both well-structured classes with extensive coverage and a concrete learning process. All the techniques taught in the classes are very practical and can be applied to work very quickly. In addition, it is a great opportunity to build your "data science" network, because all your classmates are "Pros" in this domain with a lot of wonderful industry experience to share. I would definitely recommend NYC Data Science Academy to my friends!

Margaret Hung
SVP, Intelligence Solutions & Strategy at Millward Brown Digital

As the business world becomes increasingly data-driven, the Data Sciences classes at NYC Data Sciences Academy are invaluable to driving career success, not only for actual data science practitioners, but those who collaborate with them day-to-day to execute on insights to be gleaned from data sciences. I just completed the Intermediate level Data Sciences with R class and have immediately benefited from the ability to understand the different type of advanced analytic techniques that are available to help my clients with their business issues, to better communicate and collaborate with our Data Sciences team on a tactical level and then to take their output and accurately translate it into our clients’ business language. The course was comprehensive and Vivian brings a lot of passion and dedication to the class and ensuring her students’ success.

Date and Time

January Session Early-bird Pricing!

Jan 18 - Feb 15, 2020, 10:00am-5:00pm
Price: $2990.00; early-bird price: $2840.50
Register before Dec 19th to take advantage of this price!

April Session Early-bird Pricing!

Apr 18 - May 16, 2020, 10:00am-5:00pm
Price: $2990.00; early-bird price: $2840.50
Register before Mar 19th to take advantage of this price!

Online Session

Start Right Away!