Prerequisite Basic probability and statistics
Instructor Dr. Ghofraniha, Jahan
Starting Date 3/24/2018
Completion Date 5/12/2018
Lecturing time Tuesday 7:30 PM to 9:30 PM
Sunday 1:30 PM to 5:30 PM
Place 1601 McCarthy Blvd., Milpitas, CA 95035

A. COURSE DESCRIPTION This course introduces methods and techniques for using stored data to make decisions. Students will learn to explore and analyze data, identify its patterns, associations, and relationships, and use this information for decision making. Fundamentals of machine learning will be introduced, including regression, classification, decision trees, ensemble learning, and model reduction techniques such as principal component analysis. Specific examples of engineering and business applications of machine learning techniques will be given in the course.

Students are required to work on course projects using modern data analysis software and case studies. This course will focus on the implementation of ML algorithms using Python and the Scikit-learn library.

  • To learn how computational procedures and techniques are employed in machine learning.
  • To provide insights into the implementation details of machine learning strategies.
  • To gain hands-on experience with machine learning tools.
Textbook: An Introduction to Statistical Learning with Applications in R
  • Series: Springer Texts in Statistics (Book 103)
  • Hardcover: 426 pages
  • Publisher: Springer; 1st ed. 2013, Corr. 5th printing 2015 edition (August 12, 2013)
  • Language: English
  • ISBN-10: 1461471370
  • ISBN-13: 978-1461471370
Textbook: Hands-On Machine Learning with Scikit-Learn and TensorFlow
  • Paperback: 576 pages
  • Publisher: O'Reilly Media; 1 edition (April 9, 2017)
  • Language: English
  • ISBN-10: 1491962291
  • ISBN-13: 978-1491962299
Week 1: Statistical Learning and machine learning overview
  • Supervised vs. unsupervised learning
  • Assessing model accuracy
  • Introduction to Python and ML libraries
  • Dealing with data
  • Data visualization
  • Data cleaning
  • Selecting and training a model
  • Basic exercise and homework using Python
Week 2: Classification
  • Binary classification
  • Multiclass classification
  • Error analysis
  • Cross-validation
  • Measuring accuracy using cross-validation
  • Precision/Recall tradeoff
  • In-class exercise
  • Exercise/homework in Python and scikit-learn library
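
As an illustration of the precision/recall tradeoff listed above, the following is a minimal sketch in plain Python rather than scikit-learn. The labels, the classifier scores, and the helper names precision_recall and predict_with_threshold are hypothetical and chosen only for this example; they are not taken from the course material.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def predict_with_threshold(scores, threshold):
    """Label a sample positive (1) when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical true labels and classifier scores (illustration only).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.1]

# Raising the threshold tends to increase precision and decrease recall.
for threshold in (0.35, 0.5, 0.75):
    y_pred = predict_with_threshold(scores, threshold)
    precision, recall = precision_recall(y_true, y_pred)
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy data, the lowest threshold catches every positive at the cost of a false positive, while the highest threshold makes no false positives but misses half of the positives.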
Week 3: Regression and other linear and quasi-linear model training
  • Linear regression
  • Computational complexity
  • Gradient Descent
  • Batch gradient descent
  • Stochastic gradient descent
  • Polynomial regression
  • Learning curves
  • Regularization
  • Logistic regression
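
As an illustration of batch gradient descent applied to linear regression, the following is a minimal sketch in plain Python. The data set, the learning rate, the iteration count, and the function name batch_gradient_descent are assumptions made for this example only.

```python
def batch_gradient_descent(xs, ys, learning_rate=0.05, n_iterations=2000):
    """Fit y = w * x + b by minimizing mean squared error.

    Each step uses the whole training set (hence "batch").
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        # Prediction errors for every training sample.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = batch_gradient_descent(xs, ys)
print(f"w={w:.4f} b={b:.4f}")
```

Stochastic gradient descent differs in that each update uses a single randomly chosen sample instead of the full sums above.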

Hw: Midterm project announcement and discussion

Week 3: Tree-based Methods
  • Basics of decision trees
  • Classification decision trees
  • Regression decision trees

Hw: Application of decision trees in classification and regression using Python and Scikit-learn library

Week 4: Support Vector Machines
  • Mathematical background, concept of hyperplane in n-dimension
  • SVM classifier
  • SVM with nonlinear decision boundaries
  • SVM with more than two classes
  • SVM example in Python and Scikit-learn

Hw: Using SVM for classification and regression in Python and Midterm project

Week 5: Ensemble learning and Random Forests
  • Voting classifier
  • Bagging and Pasting in Scikit-learn
  • Out-of-bag Evaluation
  • Random Forests
  • Feature importance
  • Boosting
  • AdaBoost
  • Gradient Boosting
  • Stacking

Hw: Comparison of Random forests vs boosting and voting classifier in Python

Week 6: Dimensionality reduction
  • Main Approaches for Dimensionality Reduction
  • Projection
  • Manifold learning
  • PCA
  • Preserving variance
  • PCA for compression
  • Incremental PCA
  • Randomized PCA
  • Kernel PCA
  • Other techniques

Hw: PCA exercise using scikit-learn and final project announcement

Week 7: Introduction to Artificial Neural Networks
  • From biology to Artificial Neurons
  • The perceptron
  • Multi-layer Perceptron and backpropagation
  • Number of hidden layers
  • Number of neurons per layer
  • Activation function
  • Implementation using Python
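
As an illustration of the perceptron topic above, the following is a minimal sketch of the classic perceptron learning rule in plain Python, trained on the logical AND function. The function names, the step activation at zero, the learning rate, and the epoch count are assumptions made for this example only.

```python
def predict(weights, bias, x):
    """Step activation: output 1 when the weighted sum is non-negative."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation >= 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, n_epochs=20):
    """Perceptron learning rule: adjust weights only on misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(n_epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Logical AND is linearly separable, so a single perceptron can learn it.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)
```

A single perceptron cannot represent functions that are not linearly separable, such as XOR, which is one motivation for the multi-layer perceptron.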
Week 8: Unsupervised Learning and final project presentations
  • Clustering methods
  • K-means clustering
  • Hierarchical clustering
  • Self-organizing map
  • Kohonen SOM
  • SOM example
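
As an illustration of K-means clustering, the following is a minimal sketch in plain Python. The sample points and the function name k_means are hypothetical, and the initial centroids are simply the first k points, which is a simplifying assumption made to keep the example deterministic (practical implementations usually use random or k-means++ initialization).

```python
def k_means(points, k, n_iterations=20):
    """Cluster points (tuples of floats) into k groups.

    Assumption: the first k points serve as initial centroids.
    """
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(n_iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [
                sum((a - c) ** 2 for a, c in zip(p, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centroids, clusters = k_means(points, k=2)
print(centroids)
print([len(c) for c in clusters])
```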

Hw: Final project presentation and Final exam