Product Description

Course Overview

People have communicated through text for centuries, and the volume of text data has grown exponentially over the last decade. Because Natural Language Processing (NLP) focuses on this type of data, it has become one of the most exciting and fastest-growing subfields of Artificial Intelligence (AI), attracting immense research and practical interest. Organizations have been building and executing text analytics capabilities in order to:

Store and process text data efficiently.
Enhance information extraction from high-volume, high-velocity and high-variety data sources.
Derive insights that are not obvious or feasible through manual human efforts.
Improve decision-making by utilizing different sources of information.
Automate or accelerate time-consuming manual processes.
Advance the technology towards more generally applicable human-like AI frameworks.

NLP has taken a big leap since 2017, when large transfer learning models started to become widely available. Today, thanks to open source, one can use very large neural network models trained on massive amounts of text data with just a few lines of code. This course aims to provide a solid foundation for using these open-source text analytics technologies effectively to build NLP pipelines for different use cases.

Prerequisites

This course covers text analytics from the very basics and uses Python. We keep the code in Jupyter notebooks using functions and will not dive into object-oriented programming, so an intermediate level of Python knowledge suffices to follow the course content and assignments.

Certificate

Certificates are awarded at the end of the program upon satisfactory completion of the course. Students are evaluated on a pass/fail basis for their performance on the required assignments.

Students who complete 80% of the homework and attend a minimum of 85% of all classes are eligible for the certificate of completion.

Syllabus

Unit 1: Introduction

An introduction to Natural Language Processing, applications and course overview.
Running notebooks in different environments, either in the cloud or on a local machine.
An introduction to Text Analytics (TA) using Python.
String methods in Python.

Unit 2: Retrieving and Processing Text Data – 1

Parsing unstructured data from different types of sources such as PDF, DOCX and PPT.
How to scrape the web to fuel information extraction.
First glance at the NLTK cookbook and the basics of text processing.

Unit 3: Retrieving and Processing Text Data – 2

Cleaning, normalizing and segmenting text.
Regular Expressions in Python.
Assignment 1 (Scraping, cleaning and indexing YouTube/Reddit/Twitter transcripts).
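
As a taste of the cleaning, normalizing and segmenting topics in this unit, here is a minimal sketch using only Python's built-in `re` module. The function names (`clean_text`, `split_sentences`, `tokenize`) and the specific patterns are illustrative assumptions, not the course's own material; real pipelines usually need more careful rules.

```python
import re
from typing import List


def clean_text(raw: str) -> str:
    """Lowercase the text, drop URLs and HTML tags, and collapse whitespace."""
    text = raw.lower()
    text = re.sub(r"https?://\S+", " ", text)       # remove URLs
    text = re.sub(r"<[^>]+>", " ", text)            # remove HTML tags
    text = re.sub(r"[^a-z0-9\s.!?']", " ", text)    # keep letters, digits, basic punctuation
    return re.sub(r"\s+", " ", text).strip()        # collapse runs of whitespace


def split_sentences(text: str) -> List[str]:
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def tokenize(sentence: str) -> List[str]:
    """Extract word-like tokens (letters, digits and apostrophes)."""
    return re.findall(r"[a-z0-9']+", sentence)


cleaned = clean_text("Visit <b>our</b> site: https://example.com NOW!  It's great.")
sentences = split_sentences(cleaned)
tokens = [tokenize(s) for s in sentences]
```

The naive sentence splitter will break on abbreviations such as "Dr." or "e.g.", which is one reason dedicated libraries are typically preferred for segmentation.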

Unit 4: How Machines Understand Text – 1

Bag-of-Words (BoW) methodology.
Statistical interpretation of natural language via TF-IDF.
Semantics and Word Embeddings: How Neural Networks help capture context.
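
To make the Bag-of-Words and TF-IDF ideas concrete, the following is a minimal sketch in plain Python. It assumes whitespace tokenization and the textbook weighting tf × log(N / df); the helper names `bag_of_words` and `tf_idf` are hypothetical. Libraries such as scikit-learn apply smoothing and normalization by default, so their numbers will differ.

```python
import math
from collections import Counter
from typing import Dict, List


def bag_of_words(doc: str) -> Counter:
    """Count how often each whitespace-separated token occurs in a document."""
    return Counter(doc.lower().split())


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Weight each term by its relative frequency times log(N / document frequency)."""
    bows = [bag_of_words(d) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for bow in bows:
        df.update(bow.keys())

    weights = []
    for bow in bows:
        total = sum(bow.values())
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in bow.items()
        })
    return weights


docs = ["the cat sat", "the dog barked", "the cat purred"]
weights = tf_idf(docs)
```

In this toy corpus, "the" appears in every document and so receives a weight of zero, while a term unique to one document (such as "sat") is weighted higher than one shared by two documents (such as "cat").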

Unit 5: How Machines Understand Text – 2

Chronological flow of contextual models: From word2vec to transformers.
Best practices in model selection in NLP.
Assignment 2 (Comparison and visualization of different language models for word similarity).

Unit 6: Supervised Approach in NLP

Supervised vs Unsupervised methods with text data.
Supervised text classification examples using Scikit-learn.
Data labeling and subjectivity in text classification.
Assignment 3 (Why is spam classification an easier problem than sentiment classification?).

Unit 7: Unsupervised Approach in NLP

EDA on text data: You don’t know what you don’t know.
Topic modeling with LDA and K-means clustering.
Visualizing text and topics.
Interpretability challenges in unsupervised techniques with natural language.

Unit 8: NLP Tasks 1: How to make sense of text data

Language deconstruction with spaCy.
Named Entity Recognition (NER) example on medical records.
Text analytics dilemma: Rule-based vs. training-based models.
Assignment 4 (Are unsupervised topics subjective? Comparing students’ models and interpretation).

Unit 9: NLP Tasks 2: Transfer Learning Applications

How transfer learning changed the course of NLP.
Hugging Face model hub and the power of open source.
Hugging Face pipelines: Text summarization, zero-shot learning and question answering.
Text generation: Why did GPT-3 get so famous?
Domain adaptation through fine-tuning.

Unit 10: NLP Tasks 3: Semantic similarity and NLP in production

Semantic similarity and NLP-based textual search using SBERT.
Indexing a text database for faster information extraction.
What does a typical NLP pipeline look like?
Building web applications on top of NLP pipelines with Streamlit.
Assignment 5 (Selecting a dataset and query that show the difference between rule-based search and semantic search).

Campus Location

500 8th Ave Suite 905, New York, NY 10018

Nearby Subways

1 2 3 34th, Penn Station

A C E 34th, Penn Station

N Q R B D F M 34th, Herald Square

Instructor

Tolga Akiner

Instructor

Tolga Akiner is a Senior Data Scientist at LexisNexis and has NLP experience across several industries, including pharmaceuticals, healthcare, retail and legal. He holds a Ph.D. in Mechanical Engineering, where he worked on nanomaterials, followed by postdoctoral research that made heavy use of Machine Learning and Active Learning in the Materials Science domain. He previously contributed NLP lectures to the ‘Practical AI’ course on Udemy and has blogged on Medium about practical text analytics applications.

Natural Language Processing for Production (NLP)

NYC Data Science Academy