Advanced
Designing and Implementing Production Machine Learning Systems (MLOps)


This course is an introduction to ML systems in production that shows students how real production ML systems operate. Using Python, Docker, Kubernetes, Google Cloud, and various open-source tools, students will bring the different components of an ML system to life and set up real, automated infrastructure.

* Tuition paid for part-time courses can be applied to the Data Science Bootcamps if admitted within 9 months.
In response to COVID-19 State reopening, all our courses are hosted online.

Course Dates

 
September Session

Sep 27 - Nov 1, 2022
Tuesday, Thursday
7:00-9:00pm EDT

$2990.00
Earlybird ends on 10/25
November Session

Nov 15 - Dec 20, 2022
Tuesday, Thursday
7:00-9:00pm EST

$2990.00
$2840.50
Find out more information about our professional development courses.

Product Description

Course Overview

As machine learning (ML) becomes ubiquitous in technology, there is an increasing need for well-engineered ML systems and processes that enable ML algorithms to drive business value. Enterprise ML has experienced a shift in focus from just the ML models themselves to the software engineering, infrastructure and best practices necessary to support ML at scale in production. Bringing a model from a data scientist’s notebook to running live in an application requires robust systems, MLOps and ML governance.

This course is an introduction to ML systems in production that shows students how real production ML systems operate. Using Python, Docker, Kubernetes, Google Cloud, and various open-source tools, students will bring the different components of an ML system to life and set up real, automated infrastructure.

Prerequisites

You should be familiar with an object-oriented programming language (preferably Python) and have experience with basic machine learning concepts and models. Some previous exposure to a cloud environment (AWS, Google Cloud, Azure, etc.) or other software engineering experience is helpful but not required.

Certificate

Certificates are awarded at the end of the program upon satisfactory completion of the course. Students are evaluated on a pass/fail basis for their performance on the required homework and final project (where applicable). Students who complete 80% of the homework and attend a minimum of 85% of all classes are eligible for the certificate of completion.

Demo Lecture

Quick Course Overview of the MLOps Class
Module: Overview
Instructor: Kyle Gallatin
Description: One-minute overview of what you will learn in this course.

Syllabus

Unit 1 - Overview of Machine Learning Systems in Production

  • Machine learning in industry versus academia
  • Comparing ML engineering and software engineering
  • Components of production ML systems
  • Online versus offline ML systems
  • Demonstration: a production ML system
  • Hands-on: Introduction to Google Cloud, project setup, and gcloud commands
  • Hands-on: Setting up our git repository

Unit 2 - Machine Learning Engineering Fundamentals

  • Software engineering principles
  • Systems design 101
  • ML Systems design 101
  • MLOps concepts and design principles
  • Hands-on: Essential Google Cloud services for ML
  • Hands-on: Kubernetes and Google Kubernetes Engine (GKE) intro
  • Your ML in production project: Ideating

Unit 3 - Feature Systems

  • Introduction to feature systems
  • Common feature systems design patterns
  • Developer experience in feature systems and ML systems
  • Hands-on: Working with different feature sources and data stores on Google Cloud
  • Hands-on: Building a miniature feature system in the cloud
  • Your ML in production project: Ideating

Unit 4 - ML Model Training Pipelines

  • Components of ML training pipelines
  • Workflow orchestration and automation
  • Cost and value analysis
  • Setting up an ML pipeline
  • Hands-on: Introduction to Kubeflow and building an automated pipeline
  • Hands-on: Running automated training jobs on Kubernetes
  • Your ML in production project: Design and Planning

Unit 5 - Managing Training Experiments, ML Metadata, and Model Registries

  • Experimentation as an ML practitioner
  • Hands-on: Setting up a centralized metadata store and model registry
  • Hands-on: Tracking and logging hyperparameters
  • Hands-on: Using model registries
  • Your ML in production project: Design and Planning

Unit 6 - Deploying Machine Learning Models

  • Generating offline predictions
  • Online model serving systems
  • Common real-time deployment architectures
  • Hands-on: Developing an automated offline prediction workflow using Kubeflow and Dataflow
  • Hands-on: Deploying ML models on Kubernetes for real-time inference with Seldon
  • Hands-on: Scaling ML model deployments
  • Your ML in production project: Architecture Review

Unit 7 - ML Observability

  • Infrastructure and software observability
  • Latency, throughput, availability, and reliability
  • ML observability, ML model/feature drift, and ML explainability
  • Fairness and bias
  • Hands-on: Setting up Prometheus and Grafana on Kubernetes
  • Hands-on: Accessing logs and metrics in Google Cloud
  • Hands-on: Logging predictions and implementing ML observability
  • Your ML in production project: Architecture Review

Unit 8 - Experimentation and Reliability Engineering

  • ML experimentation design and algorithms 101
  • Hands-on: A/B testing with Seldon on Kubernetes
  • Hands-on: Multi-armed bandits with Seldon on Kubernetes
  • Hands-on: Canary/shadow deployments on Kubernetes
  • Your ML in production project: Implementation

Unit 9 - Continuous Learning

  • Streaming versus batch processing
  • Event-driven, asynchronous systems
  • Stateful ML systems and incremental model updates
  • Hands-on: Designing and implementing a stateful ML system on Kubernetes
  • Your ML in production project: Implementation

Unit 10 - Machine Learning Governance

  • Observability, visibility and control
  • Monitoring and alerting
  • Model service catalogue
  • Security
  • Compliance and auditability
  • Your ML in production project: Presentation

Campus Location

500 8th Ave Suite 905, New York, NY 10018
Nearby Subways
1 2 3 34th, Penn Station
A C E 34th, Penn Station
N Q R B D F M 34th, Herald Square

Instructor

Kyle Gallatin
NYC Data Science Mentor
Kyle Gallatin is currently a software engineer on the machine learning platform team at Etsy. In this role, Kyle is redesigning existing ML systems with a focus on ML model training, real-time model serving, MLOps processes, and model governance. Kyle spends his free time teaching and volunteering within the ML space. He also writes articles for technical publications on ML engineering, MLOps, and infrastructure.

Session Schedule

 
September Session

Sep 27 - Nov 1, 2022 Tuesday & Thursday
  • 1. September 27, 2022
  • 2. September 29, 2022
  • 3. October 4, 2022
  • 4. October 6, 2022
  • 5. October 13, 2022
  • 6. October 18, 2022
  • 7. October 20, 2022
  • 8. October 25, 2022
  • 9. October 27, 2022
  • 10. November 1, 2022
7:00-9:00pm EDT

$2990.00
Earlybird ends on 10/25
November Session

Nov 15 - Dec 20, 2022 Tuesday & Thursday
  • 1. November 15, 2022
  • 2. November 17, 2022
  • 3. November 22, 2022
  • 4. November 29, 2022
  • 5. December 1, 2022
  • 6. December 6, 2022
  • 7. December 8, 2022
  • 8. December 13, 2022
  • 9. December 15, 2022
  • 10. December 20, 2022
7:00-9:00pm EST

$2990.00
$2840.50