Data science has become the central approach to tackling data-heavy problems in both business and academia. In this course, students learn how data science is done in the wild, with a focus on data acquisition, cleaning, and aggregation, exploratory data analysis and visualization, feature engineering, and model creation and validation. Students use the R statistical programming language to work through real-world examples that illustrate these concepts. Concurrently, students learn some of the statistical and mathematical foundations that power the data-scientific approach to problem solving.
This class is an introduction to the statistical programming language R for business analysts. We’ll explore data science use cases in the business realm and use R for data wrangling, data mining, visualization, and prediction. Throughout the class we will approach business problems analytically, using R to explore data, make better business decisions, and identify areas for improving performance. Together, data analytics, R, and the data science process provide the foundation for solving business problems with data science. Students should come prepared with an understanding of computer programming and a curiosity for data science.
- Course Time: Mondays & Wednesdays | 7:00 PM to 9:30 PM EST
- Venue: 500 8th Ave., Suite 905, New York, NY 10018 (5 min from Penn Station)
Who Is This Course For?
Students should have some programming experience and some familiarity with basic statistical and linear algebra concepts such as mean, median, mode, standard deviation, correlation, and the difference between a vector and a matrix. It will also be helpful to know basic R data structures such as data frames and how to use RStudio.
Students should complete the following pre-work (approximately 2 hours) before the first day of class:
- R Programming – https://www.rstudio.com/online-learning/#R
- RStudio Essentials Programming 1: Writing Code – https://www.rstudio.com/resources/webinars/rstudio-essentials-webinar-series-part-1/
By the end of the course, students will have:
- An understanding of data science business problems solvable using R and an ability to articulate those business use cases from a statistical perspective.
- The ability to create data visualization output with R Markdown files and Shiny applications.
- Familiarity with the R data science ecosystem, data science strategy, and the various tools a business analyst can use to continue developing as a data scientist.
Certificates are awarded at the end of the program upon satisfactory completion of the course.
Students are evaluated on a pass/fail basis for their performance on the required homework and final project (where applicable). Students who complete 80% of the homework and attend a minimum of 85% of all classes are eligible for the certificate of completion.
Unit 1: Data Science and R Intro
- Big Data
- Data Science
- Roles in Data Science
- Use Cases
- Class Format overview
- R Background
- R Intro
- RStudio
Unit 2: Visualize
- Rules of the road with data viz
- Chart junk
- Chart terminology
- Clean chart
- Scaling data
- Data Viz framework
- Code plotting
Unit 3: R Markdown
- Presenting your work
- R Markdown file structure
- Code chunks
- Generating an R Markdown file
- R Markdown exercise
Unit 4: Shiny
- Shiny structure
- Reactive output
- Rendering Output
- Stock example
- Hands-on challenge
Unit 5: Data Analysis
- How to begin your data journey?
- The human factor
- Business Understanding
- EDA – Exploratory Data Analysis
- Data Anomalies
- Data Statistics
- Key Business Analysis Takeaways
- Diamond data set exercise
- Hands-on challenge with the Bank Marketing data set
Unit 6: Introduction to Regression
- Regression Definition
- Examples of regression
- Formulating the regression formula
- Statistical definitions involved
- mtcars regression example
- Business use case with regression
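As a rough illustration of the kind of mtcars regression this unit refers to, here is a minimal sketch using only base R. The choice of `mpg` as the response and `wt` as the predictor, and the hypothetical `new_car` data frame, are assumptions made for illustration and are not taken from the course materials.

```r
# mtcars ships with base R (datasets package), so no extra packages are needed.
data(mtcars)

# Fit a simple linear regression: fuel economy (mpg) as a function of
# vehicle weight (wt, recorded in units of 1000 lbs).
# Assumption: this variable pair is chosen only for illustration.
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated intercept and slope.
print(coef(fit))

# Standard errors, t-statistics, p-values, and R-squared.
print(summary(fit))

# Predict mpg for a hypothetical car weighing 3,000 lbs (wt = 3.0).
new_car <- data.frame(wt = 3.0)
predicted_mpg <- predict(fit, newdata = new_car)
print(predicted_mpg)
```

The slope estimates how much `mpg` changes for each additional 1000 lbs of weight, which is the kind of quantity a business use case would typically interpret.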
Unit 7: Introduction to Machine Learning
- ML Concept
- Types of ML
- CRISP-DM model
- Titanic Example
- Decision Trees
- Feature Engineering
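The Titanic example and decision trees listed in this unit could be sketched roughly as follows. This assumes the built-in `Titanic` contingency table from base R and the `rpart` package (bundled with standard R installations); the course itself may use a different Titanic data set or a different tree library.

```r
library(rpart)  # recommended package bundled with standard R installations

# The built-in Titanic object is a 4-way contingency table.
# Convert it to counts, then expand to one row per passenger.
counts <- as.data.frame(Titanic)
passengers <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]

# Fit a classification tree predicting survival from class, sex, and age group.
# Assumption: default rpart settings, chosen only to keep the sketch short.
tree <- rpart(Survived ~ Class + Sex + Age,
              data = passengers, method = "class")

# Text view of the splits the tree learned.
print(tree)

# In-sample accuracy: an optimistic estimate, since there is no held-out data here.
predicted <- predict(tree, newdata = passengers, type = "class")
accuracy <- mean(predicted == passengers$Survived)
print(accuracy)
```

In practice the data would be split into training and test sets before measuring accuracy; the in-sample figure here only shows that the pipeline runs end to end.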
Unit 8: Strategy
- Data Driven Decision Making
- Data Science Strategy
- Strategy Fails
- Macroeconomic strategy
- Data Science Project
- Data Impact
- Project guide
- Opportunities for improvement
- Big Box Store Strategic Exercise