For corporate training, or group training for more than 5 people, contact firstname.lastname@example.org.
This five-week course is an introduction to data analysis with the Python programming language and is aimed at beginners.
Project Demo Day and Certificates
The course moves from the basic building blocks of programming to data manipulation and advanced plotting packages, and ends with a demonstration of a project of your choice on Project Demo Day. For the project you will access and analyze real data, using the tools and skills taught throughout the course. Upon successful completion of the course and demonstration of your final project, you will qualify for one of three certificates: Extraordinary Standing, Honorable Graduation, or Active Participation.
Certificates are awarded according to your understanding, skill, and participation. The course has no prerequisites.
1. Do I have to do the three-week project? Is it required for taking this class?
Students can choose to spend an extra 3 weeks with the teaching crew on a project of their own choice. We are happy to offer assistance and to arrange a presentation where they can demo their work.
2. Can I take the class online if I am not in NYC?
You can take it onsite or through recorded sessions on YouTube, and get timely assistance from the teaching crew via Google Hangouts or Skype.
3. If I have to miss a session, how can I make it up?
We record all of our classes and make them available to students right after each class. If you miss a class, you can also get extra help through office hours or online support via Google Hangouts or Skype.
Day 1 – Introduction to Python
Python is a high-level programming language. You will learn the basic syntax and data structures in Python. IPython provides a robust and productive environment for interactive and exploratory computing, which makes it a great tool for scientific computation and education.
Introduction to IPython
Basic objects in Python
Variables and user-defined functions
Advanced data structures
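As a taste of the Day 1 topics, here is a minimal sketch of basic objects, a user-defined function, and the built-in data structures. The function name `average` and all sample values are made up for illustration.

```python
# Basic objects: an integer, a string, and a boolean
course_weeks = 5
language = "Python"
is_beginner_friendly = True


# A user-defined function (hypothetical example)
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Built-in data structures: list, tuple, dict, and set
scores = [88, 92, 79]
point = (3, 4)
student = {"name": "Ada", "scores": scores}
unique_tags = {"python", "data", "python"}  # duplicates are dropped in a set

print(average(scores))
print(student["name"], len(unique_tags))
```

You can paste these lines one at a time into an IPython session to inspect each object as it is created.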
Day 2 – Explore deeper with Python
Python is an object-oriented programming language. Learning a little about OOP will help you understand how Python code works. To do data analysis, the first thing you need to know is how to work with files that contain data. Sometimes the data is dirty and unstructured, so you will learn text processing, including regular expressions, to deal with it.
Classes: introduction to object-oriented programming
How to deal with files
Run Python scripts
Handling and processing strings
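The Day 2 topics fit together naturally. The sketch below defines a small class, writes a messy text file, and uses a regular expression to pull clean records out of it. The `Reading` class, the `parse_line` helper, the file name, and the sample text are all hypothetical and chosen only for illustration.

```python
import re
import tempfile
from pathlib import Path


class Reading:
    """A single temperature reading parsed from a line of text."""

    def __init__(self, city, temperature):
        self.city = city
        self.temperature = temperature

    def __repr__(self):
        return f"Reading({self.city!r}, {self.temperature})"


# Matches lines such as "  new york :  71.5F " and captures city and number
LINE_PATTERN = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s*(-?\d+(?:\.\d+)?)\s*F?\s*$")


def parse_line(line):
    """Return a Reading, or None if the line does not match the pattern."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    city = match.group(1).strip().title()
    return Reading(city, float(match.group(2)))


# Write a small messy file, then read it back line by line
raw_text = "  new york :  71.5F \nBOSTON:64\n### corrupted line ###\n"
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "readings.txt"
    path.write_text(raw_text)
    with open(path) as handle:
        readings = [r for r in map(parse_line, handle) if r is not None]

print(readings)
```

Saved as a script (for example `clean_readings.py`, a made-up name), this can be run from the command line with `python clean_readings.py`.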
Day 3 – Scientific computation tools
There are three modules for scientific computation that make Python as powerful as MATLAB: NumPy, Matplotlib, and SciPy. NumPy, short for Numerical Python, is the foundational package for scientific computing in Python. Matplotlib is the most popular Python library for producing plots and other 2D data visualizations. SciPy is a collection of packages addressing a number of different standard problem domains in scientific computing.
Matplotlib (mainly the sub-module *pyplot*)
SciPy (mainly the sub-module *stats*)
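To show how the three libraries work together, here is a minimal sketch that computes summary statistics with NumPy, runs a one-sample t-test with `scipy.stats`, and draws a histogram with `pyplot`. The height values and the hypothesized mean of 1.70 m are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# NumPy: vectorized arithmetic on a whole array, no explicit loop
heights_cm = np.array([160.0, 172.0, 181.0, 168.0, 175.0])
heights_m = heights_cm / 100
print(heights_m.mean(), heights_m.std(ddof=1))

# SciPy stats: one-sample t-test against a hypothesized mean of 1.70 m
result = stats.ttest_1samp(heights_m, popmean=1.70)
print(result.statistic, result.pvalue)

# Matplotlib pyplot: histogram of the raw measurements
fig, ax = plt.subplots()
ax.hist(heights_cm, bins=5)
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Count")
# In an interactive session, call plt.show() to display the figure
```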
Day 4 – Data Visualization
Python can also generate graphics easily with tools such as *Seaborn* and *ggplot*. Seaborn is a Python visualization library based on Matplotlib. It provides a high-level interface for drawing attractive statistical graphics. ggplot is a port of ggplot2, the well-known plotting system from R. It provides a powerful model of graphics that makes it easy to produce complex multi-layered graphics.
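As a rough illustration of Seaborn's high-level interface, the sketch below draws a scatter plot colored by group from a small table. The column names and values are made up, and the example assumes a Seaborn version that provides `scatterplot`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up sample data for illustration
data = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "score": [52, 58, 65, 70, 78, 85],
    "group": ["A", "B", "A", "B", "A", "B"],
})

# One call maps columns to the x axis, the y axis, and point color
sns.set_style("whitegrid")
ax = sns.scatterplot(data=data, x="hours_studied", y="score", hue="group")
ax.set_title("Score vs. hours studied")
# In an interactive session, call plt.show() to display the figure
```

Because Seaborn is built on Matplotlib, the returned `ax` is an ordinary Matplotlib axes object, so everything learned on Day 3 still applies.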
Day 5 – Data manipulation with Pandas
Pandas provides rich data structures and functions designed to make working with structured data fast, easy, and expressive. The *DataFrame* object in pandas is much like the *data.frame* object in R. Pandas makes data manipulation (filter, select, group, aggregate, etc.) as easy as it is in R.
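The four operations named above can be sketched in a few lines. The `sales` table and its column names are invented for illustration.

```python
import pandas as pd

# Made-up sample data for illustration
sales = pd.DataFrame({
    "region": ["East", "West", "East", "West", "East"],
    "product": ["A", "A", "B", "B", "A"],
    "units": [10, 7, 3, 12, 5],
})

# Filter: keep rows with more than 4 units sold
large_orders = sales[sales["units"] > 4]

# Select: keep only the columns needed for the summary
subset = large_orders[["region", "units"]]

# Group and aggregate: total units per region
units_by_region = subset.groupby("region")["units"].sum()
print(units_by_region)
```

Each step returns a new object, so the intermediate results can be inspected one at a time in IPython.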