
Data Science Bootcamp Pre-Work NYC Academy

Vivian Zhang
Posted on Oct 13, 2015

Data Science

Andrew Nichols, NYC Data Science Academy

July 9, 2015

  • Where to Start

  • Setup

  • Overall Goals

  • Command Line

  • Git and Github

  • Foundational Statistics

  • R

  • Python

  • Machine Learning

  • Proprietary Platforms

Where to Start

Data Science is built around 3 concepts: programming, statistics, and domain expertise. In preparing this prework guide, we focused on establishing a strong programming foundation that can later be enriched with statistical learning methods so you can apply them to your own domain expertise.

The focus of this guide is on R and Python. R is a statistical programming language developed by statisticians and Python is a more general programming language used in a variety of disciplines for solving a wide range of problems. Our curriculum focuses on leveraging both languages so their strengths can balance out each other's weaknesses and also prepare you to work as a data scientist in either environment.

We believe in the power of free and open-source content. The tech and data analytics communities are both moving towards a more democratic access to technology and learning. The bulk of this content will focus on these technologies with a short section at the end dedicated to proprietary products.

Depending on your experience, you can determine which sections will require more of your time. We have placed time guides to help you understand how much emphasis and time you should spend on each section. For bootcamp students, we want you to at least have read "An Introduction To Statistical Learning" and have had some exposure to the command line, git, and foundational knowledge in stats, Python, and R.


Setup

You should install R and Python on your computer.

In addition to R, you should also install RStudio which will be used as an integrated development environment (IDE) that makes programming in R faster and easier.

We will be using Python 2 since it will take several years for Python 3 to fully replace it. Python 2 is still used, and will continue to be used across the data science industry until enough libraries have been updated and the new ecosystem has proven stable. The Anaconda distribution of Python contains most of the libraries you will need to get started.


Overall Goals

  1. Understand what data science is, develop the appropriate vocabulary, and learn where to get help.
  2. Learn about version control and why it's needed.
  3. Gain knowledge about storing and accessing your data in databases.
  4. Develop a working knowledge of R and Python.
  5. Develop a working knowledge of statistics and machine learning.

Command Line

2 to 4 hours

Goals:

  1. Navigate file paths.
  2. Learn the basic commands for using your terminal.
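
As a rough illustration of what navigating file paths means, the sketch below mirrors a few common terminal commands (`pwd`, `cd`, `ls`) using Python's standard `os` module. It is only an analogy, not a substitute for practicing in a real terminal. The directory and file names (`projects`, `prework`, `notes.txt`) are invented for the example, and everything happens inside a temporary folder that is deleted at the end.

```python
import os
import shutil
import tempfile

# Build a throwaway directory tree so the example never touches real files.
# The names "projects", "prework", and "notes.txt" are made up.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "projects", "prework"))
with open(os.path.join(root, "projects", "notes.txt"), "w") as handle:
    handle.write("hello\n")

start = os.getcwd()                              # like `pwd`
os.chdir(root)                                   # like `cd /some/absolute/path`
top_level = sorted(os.listdir("."))              # like `ls` -> ['projects']

os.chdir(os.path.join("projects", "prework"))    # like `cd projects/prework`
os.chdir("..")                                   # like `cd ..` (up one level)
in_projects = sorted(os.listdir("."))            # ['notes.txt', 'prework']

# Turn a relative path into an absolute one and check that it exists.
absolute = os.path.abspath("notes.txt")
found = os.path.isfile(absolute)

# Return to where we started and clean up the temporary tree.
os.chdir(start)
shutil.rmtree(root)
```

The key idea carried over from the terminal is the difference between a relative path such as `..` or `projects/prework`, which depends on the current directory, and an absolute path, which does not.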

The Command Line Crash Course - http://cli.learncodethehardway.org/book/


Git and Github

2 to 4 hours

Version control is a system that allows you to track changes to your files and recall previous versions of them. In the data science and tech communities, Git and Github have become the industry-wide standard. In many ways, Github accounts have become equivalent to technical resumes, especially for those who are trying to break into the industry.

Goals:

  1. Understand what version control is.
  2. Understand what git and github are.
  3. Install git and open an account on github.
  4. Work with git: cloning, committing, etc.

Installing Git

Git Immersion - Full length tutorial

Git - The Simple Guide - One page tutorial

Git Cheat Sheet


Foundational Statistics

5 to 8 hours

Goals:

  1. Understand foundational statistical concepts.

Make sure you understand these basic ideas:

Population / Sample
Distributions
Discrete / Continuous
Distribution Functions
Null Hypothesis / Alternative Hypothesis / P-value
Mean / Variance / Skewness / Kurtosis / Percentile / Quantile
T-Test / F-Test / Chi-Square Test / ANOVA / Normality Test
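
To make a few of these terms concrete, here is a minimal sketch in plain Python that computes the mean, sample variance, skewness, excess kurtosis, a percentile, and a one-sample t statistic. The data values are invented, the function names are our own, and the formulas are one common convention (population moments for skewness and kurtosis, linear interpolation for percentiles). Statistical libraries may use slightly different definitions.

```python
import math

# A small made-up sample drawn from some larger population.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

def mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    # Divides by n - 1, the usual estimator when working with a sample.
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def central_moment(values, k):
    # k-th central moment, dividing by n (population convention).
    m = mean(values)
    return sum((v - m) ** k for v in values) / len(values)

def skewness(values):
    # Positive means a longer right tail, negative a longer left tail.
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    # Zero for a normal distribution under this convention.
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

def percentile(values, p):
    # Linear interpolation between the two closest ranks.
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    low = int(math.floor(rank))
    high = int(math.ceil(rank))
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)

def one_sample_t(values, mu0):
    # t statistic for the null hypothesis "the population mean equals mu0".
    standard_error = math.sqrt(sample_variance(values) / len(values))
    return (mean(values) - mu0) / standard_error

print(mean(sample))               # 5.0
print(percentile(sample, 50))     # 4.5 (the median)
print(skewness(sample))           # 0.65625
print(one_sample_t(sample, 4.0))  # roughly 1.32
```

Turning the t statistic into a p-value requires the t distribution with n - 1 degrees of freedom, which is what a statistics library or a printed table provides.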

If you have more time:

Probability and Statistics

Book Recommendations

  • Stats in a Nutshell - Basic Statistics Foundation

R

20 to 40 hours

Goals:

  1. Learn the basics of R Syntax.

Swirl - Learn R in the console

Codeschool: TryR - Online Interactive Learning

Book Recommendations

  1. R in a Nutshell
  2. The Art of R Programming

Python

20 to 40 hours

Goals:

  1. Learn the basics of Python Syntax.
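
For a first taste of what basic syntax covers, the short sketch below uses variables, a dictionary, a function, a loop, and a comprehension. It is written to run on both Python 2 and Python 3. The variable and function names are our own, and the hour values are simply the upper ends of the time guides given in this post.

```python
# Variables hold values; strings and dictionaries are built-in types.
course = "Data Science Bootcamp"
hours_per_section = {
    "Command Line": 4,
    "Git and Github": 4,
    "Foundational Statistics": 8,
}

def total_hours(schedule):
    # Functions are defined with `def`; indentation marks the body.
    total = 0
    for hours in schedule.values():
        total += hours
    return total

# A comprehension filters and transforms a collection in one expression.
long_sections = sorted(
    name for name, hours in hours_per_section.items() if hours > 4
)

print("%s prework: %d hours" % (course, total_hours(hours_per_section)))
print(long_sections)
```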

Learn Python the Hard Way

We recommend this book above all others. The book is available online for free or you can buy the book.

Codecademy

Learn Python Interactive Tutorial


Machine Learning

As much time as you have, after finishing the prior sections.

Goals:

  1. Understand the basics of machine learning.
  2. Read Introduction to Statistical Learning.

Basic Ideas:

Prediction / Inference
Parametric / Nonparametric
Supervised / Unsupervised
Linear Regression / Logistic Regression
Regression / Classification
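
Several of these ideas show up together in simple linear regression: it is supervised (we have known outcomes), parametric (the model is fully described by a slope and an intercept), and a regression rather than a classification method (the outcome is a number). The sketch below fits a line by ordinary least squares in plain Python. The data are invented for illustration and the function names are our own.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = intercept + slope * x.
    n = float(len(xs))
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(slope, intercept, x):
    return intercept + slope * x

# Made-up training data: hours studied and the resulting test score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

slope, intercept = fit_line(hours, scores)

# Inference: the slope estimates how the score changes per extra hour.
# Prediction: use the fitted line on an input we have not seen.
predicted_score = predict(slope, intercept, 6.0)
print(slope, intercept, predicted_score)  # about 4.1, 47.7, 72.3
```

The same fitted model serves both purposes named in the list above: reading the slope is inference, while calling `predict` on new input is prediction.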

Introduction to Statistical Learning - The best book for learning the foundation for more advanced machine learning. Available for free online or you can buy the book.


Proprietary Platforms

Optional

This section is not required. It is here solely to give you an idea of the proprietary platforms that exist, in case you read about them or hear them mentioned in conversation.

  • Tableau
    Tableau is a platform that removes the technical difficulty of visualizing data and is designed for non-technical users so they can perform data analysis. It integrates with R and Python and is not designed to be a replacement for them, in particular due to its limitations in cleaning and manipulating data and in modeling. Tableau shines when you need to quickly make beautiful visualizations. In some ways, it is akin to a video editor using iMovie. It definitely has its place, but it might not be for everyone. The best way to know is to try it.
  • SAS and SPSS
    SAS and SPSS are both analytics software that are on their way out for various reasons. They are powerful tools, but they are facing difficulty in the market due to their cost, and in turn the difficulty of getting access for learning purposes. Furthermore, SAS and SPSS are not open-source which limits growth since users cannot build upon and improve the software.
  • CartoDB
    CartoDB allows you to quickly make beautiful maps with your data. Like Tableau, this platform abstracts away the technical difficulty. There is a free version of the software which is good for the average user especially if you just need a quick map for a presentation or for exploratory analysis.

About Author

Vivian Zhang

Vivian Zhang is the CTO and School Director of the NYC Data Science Academy. She started the NYC Open Data meetup group. She earned her M.S. in Computer Science and Statistics and B.S. in Computer Science. She is...