
NINKASI: Beer Recommender System

Alex Yuan Li, Xinyuan Wu, Nelson Chen and Luke Chu
Posted on Jan 7, 2017

Click here for the Ninkasi Web App

Introduction

A ubiquitous drink throughout the ages, beer has been a source of pleasure and a companion to celebration when consumed in moderation. Ninkasi, the ancient Sumerian goddess of beer, exists to satisfy the desire and sate the heart. Likewise, our recommender system, named after the goddess, aims to deliver personalized recommendations to beer lovers across the country. Our final product consists of web-scraped data, a content-based natural language processing model, and two collaborative filtering models, Singular Value Decomposition++ (SVD++) and a Restricted Boltzmann Machine (RBM), all packed into an interactive Flask web application. Our purpose is twofold: to create a fun recommender system for others to use, and in the process to learn how to build a complex recommender system.

Web Scraping Process

We obtained the data from ratebeer.com, a popular website for beer lovers to rate and review their favorite beers. Our scraping tool was Scrapy, a Python-based web crawling framework. Two separate datasets were scraped: one containing all the information about each beer, and a second containing the user and review information. We limited ourselves to US beers, specifically the top 20-25 beers produced in each state plus the District of Columbia; in total, about 280,000 user reviews were scraped.
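
As a flavor of what this looks like in practice, here is a minimal Scrapy spider sketch. The URL and CSS selectors are illustrative placeholders rather than our actual spider, since ratebeer.com's real markup would need to be inspected first:

    # Minimal Scrapy spider sketch; the entry URL and selectors are hypothetical.
    import scrapy

    class BeerSpider(scrapy.Spider):
        name = "beer"
        start_urls = ["https://www.ratebeer.com/beer-ratings/"]  # placeholder entry point

        def parse(self, response):
            # Follow links from a listing page to individual beer pages.
            for href in response.css("a.beer-link::attr(href)").getall():
                yield response.follow(href, callback=self.parse_beer)

        def parse_beer(self, response):
            # Yield one item per beer; field selectors are placeholders.
            yield {
                "beer_name": response.css("h1::text").get(),
                "brewer": response.css(".brewer a::text").get(),
                "abv": response.css(".abv::text").get(),
                "reviews": response.css(".review-text::text").getall(),
            }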

[Screenshots: scraped beer information (left) and scraped review information (right)]

The left image shows the scraped beer information and the right image shows the scraped review information. The full set of fields obtained:

Beer Information           Beer Review Information
Beer Name                  User Name
Brewer Name                User Location
Weighted Average           Time of Review
Beer Image                 User Rating
State Beer Produced        Aroma
Overall Score              Appearance
Beer Style                 Taste
Alcohol by Volume          Palate
Estimated Calories         Overall
IBU (Int'l Bitter Unit)    Review
Beer Description

Exploratory Data Analysis

We split our EDA into three parts:

Beer Information:

Our dataset consists of 1,269 beers taken from the top 20-25 beers of each state and Washington, D.C. Among these beers, there are 58 styles and 338 unique brewers:

[Figure: counts of beer styles and brewers in the dataset]

Over half the beers had missing values for IBU because the information was not available. Here are some violin plots:

[Violin plots: weighted average, style score, overall score, % ABV, and estimated calories]

The weighted average looks fairly normal with a mean around 3.7, while both the style score and the overall score are heavily skewed toward 100. Per the website, the style and overall scores are relative to other beers, and since we chose the top few beers of each state, it makes sense that they are skewed this way. The % ABV has a median between 8 and 9 and ranges from a bit under 2.5% to close to 20%. For estimated calories we see two peaks, one around 50 calories and one around 250 calories, perhaps a separation between so-called light beers and heavier beers.

User Review and Scoring Information:

Our dataset consists of ~278,000 reviews written by ~17,000 users. About 9,000 users, over half, wrote only 1 or 2 reviews (perhaps a recommender system would increase usage). The review dates range from 2000-04-26 to 2016-12-04, which means some beers have stayed on their state's top 20-25 list for quite a while, and the composition of those lists has likely changed considerably over time. Combined with the number of users who wrote only 1 or 2 reviews, these figures may not be indicative of the actual number of active users on the website.

Review Information:

After cleaning, our data set consists of approximately 11,370,000 words, enough to fill 150 books of 350 pages each. This equates to about 40 words per review after removing stop words. There are only approximately 127,600 unique words, or 1.1% of all words. The top 100 words by frequency are:

[Figure: top 100 words by frequency]

And the top 100 distinct words by TF-IDF score, in no particular order since a word can appear multiple times in the top 100:

[Figure: top 100 distinct words by TF-IDF score]

Lastly, we ran a cursory LDA model with 5 topics and 40 passes. Though we chose 5 topics, only the first two had proportions that were not nearly zeroed out:

[Figure: LDA topic proportions and top words per topic]

The first topic looks to describe dark beers such as stouts or porters, infused with flavors usually found in those types of beers, such as coffee, chocolate, and vanilla. The second topic seems to describe fruity and hoppy beers with a golden color.
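
For reference, an LDA model like this takes only a few lines with gensim (we used gensim elsewhere in the pipeline; its use for the LDA step specifically is an assumption here, and `docs` is a toy stand-in for our real token lists):

    from gensim import corpora, models

    docs = [["coffee", "chocolate", "stout"], ["hoppy", "citrus", "golden"]]  # toy documents

    dictionary = corpora.Dictionary(docs)
    bow_corpus = [dictionary.doc2bow(doc) for doc in docs]

    # 5 topics and 40 passes, as described above.
    lda = models.LdaModel(bow_corpus, num_topics=5, id2word=dictionary, passes=40)
    for topic_id, words in lda.print_topics(num_topics=5, num_words=10):
        print(topic_id, words)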

Recommender System

With information on ~1,300 beers and ~280,000 reviews in hand, we started to build our recommendation engine. Our plan was to ensemble multiple recommenders to give users the best beer options based on their preferences.

[Figure: overview of the Ninkasi recommendation system]

In this section, we expand on the techniques we used to build the system. As the figure above shows, our recommendation system includes a content-based method that takes advantage of user reviews and a collaborative filtering engine based on user ratings. Finally, we combine all three models in a Flask web app.

Content-Based Method

The idea of a content-based recommendation system can be summarized as follows: (1) the user selects a beer that they like; (2) the system finds the beers most similar to the user's input; (3) the results are returned.

[Figure: content-based recommendation workflow]

The figure above demonstrates the workflow. The whole process involves natural language processing (NLP), calculating Term Frequency-Inverse Document Frequency (TF-IDF) scores, and performing Latent Semantic Analysis (LSA). Finally, a cosine similarity matrix over all documents is constructed.

Text Processing

Given 280,000 user reviews, we needed to clean, correct, and organize them into meaningful pieces before performing numeric analysis. The goal of text preprocessing is therefore to generate a group of words for each beer. The Python NLTK package was used throughout the process.

[Figure: text preprocessing workflow]

The figure above demonstrates the preprocessing workflow. Each review string is first encoded as ASCII, and garbled symbols and punctuation are stripped using regular expressions. With unimportant symbols removed, the review string is tokenized into words. Next, spell checking and auto-correction are performed on each word. To facilitate the weight calculation, stopwords are removed, since they carry little information. Finally, every word is lemmatized to unify inflected forms. The final output is a collection of documents, one per beer.
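
A condensed sketch of this pipeline using NLTK is below. The function name is ours, and the spell-correction step is omitted because the post does not name the tool used for it:

    # Requires: nltk.download("punkt"), nltk.download("stopwords"), nltk.download("wordnet")
    import re
    from nltk.tokenize import word_tokenize
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer

    STOP = set(stopwords.words("english"))
    LEMMA = WordNetLemmatizer()

    def preprocess(review):
        text = review.encode("ascii", "ignore").decode("ascii")  # drop non-ASCII symbols
        text = re.sub(r"[^a-zA-Z\s]", " ", text).lower()         # strip punctuation
        tokens = word_tokenize(text)                             # tokenize
        tokens = [t for t in tokens if t not in STOP]            # remove stopwords
        return [LEMMA.lemmatize(t, pos="v") for t in tokens]     # lemmatize (verbs)

    print(preprocess("Pours a deep black; roasted coffee & chocolate notes!"))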

TF-IDF and LSI

TF-IDF is a numeric statistic intended to reflect how important a word is to a document within a corpus. The TF-IDF value of a word increases proportionally to the number of times the word appears in the document, but is offset by the frequency of the word across the corpus, which adjusts for the fact that some words are simply more common in general. The overall score is the product of the TF and IDF terms. The calculation is performed with the gensim Python package, using the formula given below.

    tf-idf(t, d) = tf(t, d) × log( N / df(t) )

where tf(t, d) is the number of times term t appears in document d, N is the total number of documents, and df(t) is the number of documents containing t.

The TF-IDF calculation generates a document(beer)-term(word) matrix, with each cell holding the weight of a word for a beer. Due to the size of the matrix, we select the top 500 terms by weight to build the recommendation system.

Before creating the similarity matrix among all beers, Latent Semantic Indexing (LSI) was applied to obtain a low-rank approximation of the document-term matrix.

[Figure: LSI as a truncated SVD of the document-term matrix]

LSI performs a Singular Value Decomposition of the document-term matrix and keeps the first k rank-1 components that contribute the most to the original matrix. Such an approximation not only reduces the noise in the document-term matrix but also alleviates problems with synonymy and polysemy in natural language.
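
The TF-IDF, LSI, and cosine-similarity chain maps directly onto gensim's transformation API. A minimal sketch with toy documents (the top-500 term filtering is simplified away, and the number of topics here is arbitrary):

    from gensim import corpora, models, similarities

    docs = [["roasted", "coffee", "stout"],
            ["citrus", "hoppy", "ipa"],
            ["malt", "caramel", "amber"]]  # one token list per beer

    dictionary = corpora.Dictionary(docs)
    bow = [dictionary.doc2bow(d) for d in docs]

    tfidf = models.TfidfModel(bow)                                       # beer-term weights
    lsi = models.LsiModel(tfidf[bow], id2word=dictionary, num_topics=2)  # rank-k approximation
    index = similarities.MatrixSimilarity(lsi[tfidf[bow]])               # cosine similarities

    query = lsi[tfidf[dictionary.doc2bow(["hoppy", "citrus"])]]
    print(sorted(enumerate(index[query]), key=lambda x: -x[1])[:3])      # top-3 similar beers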

Collaborative Filtering Method

Singular Value Decomposition++

When working with ratings data, it is often useful to view the data as a user-item matrix, where each user is a row and each item is a column. The matrix has values where users rated items and is missing values otherwise. Because of the large number of items, the matrix is typically very sparse, since it is unlikely that a single user has tried a majority of the items.

[Figure: example of a user-item matrix, from dataprespective.info]

In collaborative filtering, the goal is to impute these missing values by analyzing the user-item interactions. If two users share common ratings for some items, the recommendation system assumes that the two have similar preferences and recommends items from each other's lists that the other has not yet rated. This idea can be expanded to consider every user together as one cohesive group. One powerful technique is to factorize the matrix into a user latent factor matrix multiplied by an item latent factor matrix.

[Figure: singular value decomposition for collaborative filtering, from Databricks]

The idea here is to approximate a singular value decomposition (SVD) that contains the hidden features explaining why a user rated items the way they did; in other words, the SVD tries to extract the meaning behind the ratings. The algorithm approximates the SVD by treating the user and item latent factor matrices as parameters and training them to minimize the root-mean-square error on the known ratings. We used Spark's MLlib ALS package on our data to get a benchmark of 10.75% error.
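
A sketch of such an ALS benchmark is below, written against the modern DataFrame-based pyspark.ml API rather than the RDD-based MLlib interface we used at the time; the file name, column names, and hyperparameters are illustrative:

    from pyspark.sql import SparkSession
    from pyspark.ml.recommendation import ALS
    from pyspark.ml.evaluation import RegressionEvaluator

    spark = SparkSession.builder.appName("ninkasi-als").getOrCreate()
    ratings = spark.read.csv("reviews.csv", header=True, inferSchema=True)

    train, test = ratings.randomSplit([0.8, 0.2], seed=42)
    als = ALS(userCol="user_id", itemCol="beer_id", ratingCol="rating",
              rank=20, regParam=0.1, coldStartStrategy="drop")
    model = als.fit(train)

    predictions = model.transform(test)
    rmse = RegressionEvaluator(metricName="rmse", labelCol="rating",
                               predictionCol="prediction").evaluate(predictions)
    print("RMSE:", rmse)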

From the winners of the Netflix Prize in 2009, an enhanced version of this technique decomposes the ratings even further. Instead of just user and item latent factors, the model also captures each user's and item's bias (deviation from the average of all ratings). For example, if two users both like IPAs, but the first happens to rate everything 2 points higher on average, the system would otherwise conclude that the two users have different tastes; the same holds between items. To combat this, the new algorithm explicitly models the biases, recentering each user and item at the global average so the latent factors can be estimated without being distorted by bias. SVD++ goes one step further by also including the number of ratings a user has, penalizing users with few ratings so that their predicted ratings are more conservative, since there is less information about them.

    r̂(u,i) = μ + b_u + b_i + q_iᵀ ( p_u + |N(u)|^(-1/2) Σ_{j ∈ N(u)} y_j )

SVD++ ratings equation: μ is the global mean rating, b_u and b_i are the user and item biases, p_u and q_i are the user and item latent factor vectors, N(u) is the set of items user u has rated, and the y_j are implicit-feedback item factors.

Unfortunately, we did not find any standard SVD++ package publicly available. However, after studying the algorithm further, we realized that it consists essentially of matrix operations and can therefore be programmed efficiently in TensorFlow, Google's computational framework for tensor objects and operations. Furthermore, we found a regular SVD script written in TensorFlow on GitHub, so we did not have to start from scratch: our implementation of SVD++ augments that script with the biases and the implicit-feedback term, and we added features such as k-fold cross validation and early stopping. This let us program at a high level while benefitting from the C++ backend and GPU speedup; using the GPU alone gave us a 5x boost, and further testing will be needed to see how much TensorFlow's backend improved our efficiency beyond that. The end result, after a hyperparameter grid search and 10-fold cross validation, was an error rate of 8.25%.

[Figure: computational graph of our TensorFlow SVD++ implementation]

The figure shows the computational graph for our TensorFlow code. User data is fed in through the bottom node and travels through the graph, undergoing transformations and operations at each node, to finally produce predicted ratings and losses at the terminal nodes.
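
For illustration, here is a simplified sketch of the SVD++ objective in modern eager-style TensorFlow; our original script used the older graph-style API, so this restates the idea rather than reproducing our code. The implicit-feedback sum is computed densely for clarity, not speed:

    import numpy as np
    import tensorflow as tf

    n_users, n_items, k = 100, 50, 10
    mu = 3.5  # global mean rating (assumed precomputed)

    b_u = tf.Variable(tf.zeros([n_users]))                        # user biases
    b_i = tf.Variable(tf.zeros([n_items]))                        # item biases
    P = tf.Variable(tf.random.normal([n_users, k], stddev=0.05))  # user factors
    Q = tf.Variable(tf.random.normal([n_items, k], stddev=0.05))  # item factors
    Y = tf.Variable(tf.random.normal([n_items, k], stddev=0.05))  # implicit item factors

    # Binary who-rated-what matrix, row-scaled by |N(u)|^(-1/2) as in SVD++.
    N = (np.random.rand(n_users, n_items) < 0.1).astype("float32")
    scale = np.sqrt(np.maximum(N.sum(axis=1, keepdims=True), 1.0))
    N_norm = tf.constant((N / scale).astype("float32"))

    def predict(users, items):
        implicit = tf.gather(N_norm, users) @ Y  # scaled sum of y_j over rated items
        p = tf.gather(P, users) + implicit
        q = tf.gather(Q, items)
        return mu + tf.gather(b_u, users) + tf.gather(b_i, items) + tf.reduce_sum(p * q, axis=1)

    opt = tf.keras.optimizers.Adam(0.01)  # Adam is a stand-in optimizer choice
    users = tf.constant([0, 1, 2]); items = tf.constant([3, 4, 5])
    ratings = tf.constant([4.0, 3.0, 5.0])

    for _ in range(100):
        with tf.GradientTape() as tape:
            err = ratings - predict(users, items)
            loss = tf.reduce_mean(err ** 2) + 0.02 * sum(tf.nn.l2_loss(v) for v in (P, Q, Y))
        variables = [b_u, b_i, P, Q, Y]
        opt.apply_gradients(zip(tape.gradient(loss, variables), variables))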

Restricted Boltzmann Machine (RBM)

An RBM is an unsupervised, generative, stochastic, two-layer neural network. Its name derives from the Boltzmann distribution of statistical mechanics, a probability distribution over the states of a system. It is "restricted" because the neurons must form a bipartite graph: the neurons split into two distinct groups of units, visible (V) and hidden (H), with symmetric connections between the two groups and no connections within each group, effectively forming two layers.

[Figure: Restricted Boltzmann Machine with visible and hidden layers]

An RBM performs collaborative filtering by reconstructing the user-item matrix, filling in the missing ratings for every user. Consider a dataset of M items and N users. Treating every user as a single training case for the RBM, we initialize the visible layer as a vector of M elements, where each element (neuron) represents an item whose value equals the rating if the item has been rated and zero otherwise. The visible neurons are modeled by a conditional multinomial distribution over the K distinct rating scores, so a softmax activation is used. The hidden layer is a latent layer whose neurons can be viewed as binary latent features, so it is modeled by a conditional Bernoulli distribution with a sigmoid activation function.

    p(v_i^k = 1 | h) = exp(b_i^k + Σ_j h_j W_ij^k) / Σ_{l=1}^{K} exp(b_i^l + Σ_j h_j W_ij^l)   (softmax, visible units)

    p(h_j = 1 | v) = σ(b_j + Σ_i Σ_k v_i^k W_ij^k)   (sigmoid, hidden units)

To train an RBM, the initialized visible layer is propagated through the network, which is parameterized by the weights W, the visible biases b_i, and the hidden biases b_j. After the hidden layer's values are obtained, they are propagated back to the visible layer using the updated parameters; the visible layer is thereby fully reconstructed and all missing ratings are imputed. The objective commonly used for training RBMs is "Contrastive Divergence", which minimizes the energy function by maximizing the log-likelihood of the marginal distribution over the visible ratings V. For our model, stochastic gradient ascent with momentum is used as the optimizer.

In an RBM, the weights and biases are shared across all user training cases: if two users rated the same item, they share the same weights and biases for that item between the visible and hidden layers.
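
To make the training loop concrete, here is a toy CD-1 update for a binary RBM in NumPy. Our actual model uses K-way softmax visible units per item as described above; this stripped-down sketch shows only the propagate-up, propagate-down, update pattern (momentum omitted):

    import numpy as np

    rng = np.random.default_rng(0)
    n_visible, n_hidden, lr = 6, 3, 0.1

    W = rng.normal(0, 0.01, (n_visible, n_hidden))
    b_vis = np.zeros(n_visible)  # visible biases
    b_hid = np.zeros(n_hidden)   # hidden biases

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    v0 = np.array([1.0, 0, 1, 0, 0, 1])  # one user's (binarized) ratings

    # Up: p(h=1|v), then sample binary hidden states.
    ph0 = sigmoid(b_hid + v0 @ W)
    h0 = (rng.random(n_hidden) < ph0).astype(float)

    # Down and up again: reconstruct the visible layer.
    pv1 = sigmoid(b_vis + h0 @ W.T)
    ph1 = sigmoid(b_hid + pv1 @ W)

    # CD-1 parameter update.
    W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    b_vis += lr * (v0 - pv1)
    b_hid += lr * (ph0 - ph1)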

Ensemble of CF Algorithms and Final Recommendation

The two fully reconstructed user-item matrices obtained from the two CF algorithms are combined linearly to generate the final user-item matrix for new-user prediction. To give recommendations to a new user, we use a neighborhood approach: after receiving the new user's rating vector, we compute the cosine similarity between that vector and the rows of the user-item matrix, then impute the missing ratings in the user's vector as a weighted average with weights proportional to the similarities. The final recommendations are the items with the highest imputed scores.
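
A toy NumPy sketch of this new-user step, with made-up numbers (in practice R comes fully imputed from the ensembled CF models):

    import numpy as np

    R = np.array([[5., 3., 2., 4.],
                  [4., 2., 4., 5.],
                  [1., 1., 5., 2.]])         # ensembled user-item matrix
    new_user = np.array([5., 0., 0., 4.])    # 0 marks an unrated item

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

    sims = np.array([cosine(new_user, row) for row in R])
    weights = sims / sims.sum()

    imputed = new_user.copy()
    missing = new_user == 0
    imputed[missing] = (weights @ R)[missing]     # similarity-weighted average

    scores = np.where(missing, imputed, -np.inf)  # recommend only unrated items
    print(np.argsort(-scores))                    # items ranked by imputed score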

Flask App

To make our recommender system accessible to the everyday user, we built a Flask app. Both the content-based recommender and the collaborative filtering system are accessible through the app. Fun fact: the earliest known recipe for beer is an ode to Ninkasi dating back 3,900 years.
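
A minimal sketch of what a Flask endpoint for the content recommender could look like; the route and the `recommend` helper are illustrative, not the actual Ninkasi app code:

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def recommend(beer_name):
        # Placeholder: look up the beer in the LSI similarity index
        # and return the most similar beers.
        return ["Example Stout", "Example IPA"]

    @app.route("/recommend")
    def recommend_route():
        beer = request.args.get("beer", "")
        return jsonify(recommendations=recommend(beer))

    if __name__ == "__main__":
        app.run(debug=True)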

Future Improvements

  • Use more features such as style, palate, appearance, aroma, taste, and time
  • Add more data from other regions and extend beyond the top 25
  • Apply part-of-speech tagging in the NLP pipeline to fine-tune the dataset even further
  • Tune hyperparameters further to get optimal single models
  • Ensemble more intelligently by using an optimizer or stacking our results
  • Develop the web application further to improve aesthetics and functionality

About Authors

Alex Yuan Li

Alex Li received his Master of Science in Mechanical Engineering at Columbia University and Bachelor of Science in Chemical Engineering at University of Utah with professional experience in Research and Development in the Medical Device industry. He discovered...
View all posts by Alex Yuan Li >

Xinyuan Wu

Xinyuan recently obtained his Ph.D. from North Carolina State University. He gained quantitative analysis, statistical knowledge and critical thinking from years of research on magnetic and photophysical chemistry. His belief in the trend of predictive analysis, along with...
View all posts by Xinyuan Wu >

Nelson Chen

Nelson has a Bachelor's degree from Northwestern University and a Master's degree from University of California, Berkeley in Mechanical Engineering. His graduate work specialized in developing and applying new Computational Fluid Dynamic algorithms to astrophysical fluid dynamic problems...
View all posts by Nelson Chen >

Luke Chu

Luke is an aspiring data scientist with a background in Applied Mathematics, Mathematical Modeling, and Programming. He started learning data science due to his love of game and sport statistics and exposure from his prior job at Riot...
View all posts by Luke Chu >
