Data Science Courses
This is a “short course” of four weeks, with five hours of class per week (split into two 2½-hour evening classes). Classes will be given in a lab setting, with student exercises mixed with lectures. Students should bring a laptop to class. There will be a modest amount of homework after each class. Due to the focused nature of this course, there will be no individual class projects, but the instructors will be available to help students who are applying Python to their own work outside of class.
This is a 6-week evening program providing a hands-on introduction to the Hadoop and Spark ecosystem of Big Data technologies. The course will cover these key components of the ecosystem: HDFS, MapReduce with streaming, Hive, and Spark. Programming will be done in Python, and the course will begin with a review of the Python concepts needed for our examples. The course format is interactive, so students will need to bring laptops to class. We will do our work on AWS (Amazon Web Services); instructions on how to obtain an account and connect to AWS will be provided ahead of time.
This 20-hour course covers the basic machine learning methods and the Python modules (especially Scikit-Learn) for implementing them. The five sessions cover: simple and multiple linear regression; classification methods, including logistic regression, discriminant analysis, naive Bayes, support vector machines (SVMs), and tree-based methods; cross-validation and feature selection; regularization; and principal component analysis (PCA) and clustering algorithms. After successfully completing this course, you will be able to explain the principles of machine learning algorithms and implement these methods to analyze complex datasets and make predictions.
This 35-hour course introduces both the theoretical foundations of machine learning algorithms and their practical application in R. It will introduce you to data mining, performance measures and dimension reduction, regression models (both linear and generalized), KNN and Naïve Bayes models, tree models, SVMs, and association rule analysis. After successfully completing this course, you will be able to break down the mathematics behind major machine learning algorithms, explain the principles of these algorithms, and implement them to solve real-world problems.
This class is a comprehensive introduction to data analysis with the Python programming language. It targets people who have some basic knowledge of programming and want to take it to the next level. It introduces how to work with different data structures in Python and covers the most popular data analytics and visualization modules, including numpy, scipy, pandas, matplotlib, and seaborn. We use the IPython notebook to demonstrate the results of code and to modify code interactively throughout the class.
This course is a 35-hour program designed to provide a comprehensive introduction to R. You’ll learn how to load, save, and transform data as well as how to write functions, generate graphs, and fit basic statistical models to data. In addition to providing a theoretical framework for the process of data analysis, this course focuses on the practical tools needed for data analysis and visualization. By the end of the course, you will have mastered the essential skills of processing, manipulating, and analyzing data of various types, creating advanced visualizations, generating reports, and documenting your code.