This course offers an accelerated intensive learning experience with Tableau – the growing standard in business intelligence for data visualization and dashboard creation. Without prior experience, students will learn to work with multiple data sources, create compelling visualizations, and roll out their data science products for continuous, scalable outputs to key stakeholders. By building insight and weaving narrative, students will be empowered to harness data in a striking way that provides value to organizations large and small.
This course is a dense presentation of machine learning (ML) tools used in financial risk management, portfolio management, and trading. Ten classes are offered: two on risk management, two on loan portfolio management, three on portfolio optimization, and three on high-frequency trading. The risk classes cover the risk measurement of financial assets using distribution fitting, copulas, PCA, and splines. The loan portfolio management classes cover risk estimation and backtesting using logistic regression, regularization, clustering methods, and applied statistics concepts such as parameter and process risk. Kaggle competitions for loan portfolios that used tree-based algorithms for predictions are also reviewed. The classes on portfolio optimization introduce classic theories for asset return estimation and their extensions (multi-factor models), using unsupervised and supervised ML methods to verify and derive new factors; modern portfolio theory using constrained optimization and robust methods; and Black-Litterman model portfolios in which asset-specific, ML-derived models are integrated. The classes on trading introduce the limit order book and market microstructure, then tour the winning strategies of Kaggle competitions on trading. The feature engineering and code of the winning solutions are reviewed in depth.
This class will be an introduction to the statistical programming language R for business analysts. We’ll explore data science use cases in the business realm and use R for data wrangling, data mining, visualization and prediction. Throughout the class we will be approaching business problems analytically and we’ll use R to explore data, make better business decisions and identify areas for improving performance. The combination of data analytics, R and the data science process will provide the foundation for using R for data science business problems. Students should come prepared with an understanding of computer programming and a curiosity for data science.
Via analogy to biological neurons and human perception, this course is an introduction to artificial neural networks that brings high-level theory to life with interactive labs featuring TensorFlow, the most popular open-source Deep Learning library. Essential theory will be covered in a manner that provides students with an intuitive understanding of Deep Learning’s underlying foundations. Paired with hands-on code run-throughs in Jupyter notebooks as well as strategies for overcoming common pitfalls, this foundational knowledge will empower individuals with no previous understanding of neural networks to build production-ready Deep Learning applications across the major contemporary families: Convolutional Nets for machine vision; Long Short-Term Memory Recurrent Nets for natural language processing and time series analysis; Generative Adversarial Networks for producing realistic images; and Reinforcement Learning for playing video games.
This is a class for computer-literate people with no programming background who wish to learn basic Python programming. The course is aimed at those who want to learn “data wrangling” – manipulating downloaded files to make them amenable to analysis. We concentrate on language basics such as list and string manipulation, control structures, simple data analysis packages, and introduce modules for downloading data from the web.
This is a 6-week evening program providing a hands-on introduction to the Hadoop and Spark ecosystem of Big Data technologies. The course will cover these key components of Apache Hadoop: HDFS, MapReduce with streaming, Hive, and Spark. Programming will be done in Python. The course will begin with a review of Python concepts needed for our examples. The course format is interactive. Students will need to bring laptops to class. We will do our work on AWS (Amazon Web Services); instructions will be provided ahead of time on how to connect to AWS and obtain an account.
This 20-hour Machine Learning with Python course covers the basic machine learning methods and the Python modules (especially scikit-learn) for implementing them. The five sessions cover: simple and multiple linear regression; classification methods including logistic regression, discriminant analysis, naive Bayes, support vector machines (SVMs), and tree-based methods; cross-validation and feature selection; regularization; and principal component analysis (PCA) and clustering algorithms. After successfully completing this course, you will be able to explain the principles of machine learning algorithms and implement these methods to analyze complex datasets and make predictions in Python.
This 35-hour Machine Learning with R course introduces both the theoretical foundations of machine learning algorithms and their practical applications in R. It covers data mining, performance measures, and dimension reduction; linear and generalized regression models; KNN and Naïve Bayes models; tree models; SVMs; and association rule analysis. After successfully completing this course, you will be able to break down the mathematics behind major machine learning algorithms, explain their principles, and implement these methods to solve real-world problems.
This course is a 35-hour program designed to provide a comprehensive introduction to R. You’ll learn how to load, save, and transform data as well as how to write functions, generate graphs, and fit basic statistical models. In addition to a theoretical framework for the process of data analysis, this course focuses on the practical tools needed for data analysis and visualization. By the end of the course, you will have mastered the essential skills of processing, manipulating, and analyzing data of various types, creating advanced visualizations, generating reports, and documenting your code.
This class is a comprehensive introduction to data science with the Python programming language. It targets people who have some basic knowledge of programming and want to take it to the next level. It introduces how to work with different data structures in Python and covers the most popular data analytics and visualization modules, including numpy, scipy, pandas, matplotlib, and seaborn. We use IPython notebooks to demonstrate the results of code and to change code interactively throughout the class.
Deposit for the 12-week immersive data science bootcamp. In this program, students will learn beginner and intermediate levels of data science with R, Python, Spark, and Hadoop, as well as widely used industry tools such as Selenium, Caret, TensorFlow, MongoDB, AWS, and more. A deposit of $5,000 is required within 7 days of acceptance to secure your spot. After you make your deposit, the remaining $11,000 is due on the first day of the bootcamp.