# Machine Learning Foundations | Working with Statistics, Algorithms, and Neural Networks | Math Emphasis

Statistics, Probability, and BistroMathematics is a primer on the mathematics and algorithms used in Data Science. It lays the mathematical foundation and builds the intuition necessary for solving complex machine learning problems.

The course provides a solid foundation in basic terminology and concepts, extended and built upon throughout the engagement. Processes and best practices are discussed and illustrated through both discussions and group activities. Throughout the course students will be led through a series of progressively advanced topics, where each topic consists of lecture, group discussion, comprehensive hands-on lab exercises, and lab review.

This course reviews key foundational mathematics and introduces students to the algorithms of Data Science.

Students will understand:

• Calculus, Statistics, Probability, and Linear Algebra
• Supervised Learning vs. Unsupervised Learning
• Classification Algorithms including Support Vector Machines, Discriminant Analysis, Naïve Bayes, and Nearest Neighbor
• Regression Algorithms including Linear and Logistic Regression, Generalized Linear Modeling, Support Vector Regression, Decision Trees, k-Nearest Neighbors (KNN)
• Clustering Algorithms including k-Means, Fuzzy clustering, Gaussian Mixture
• Neural Networks including Hidden Markov (HMM), Recurrent (RNN), and Long Short-Term Memory (LSTM)
• Dimensionality Reduction, Singular Value Decomposition (SVD), Principal Component Analysis (PCA)
• How to choose an algorithm for a given problem
• How to choose parameters and activation functions
• Ensemble methods
• How to apply mathematics and algorithms to solve complex machine learning problems

This course is intended for:

• Data Science Analysts, Programmers, Administrators, Architects, and Managers
• Skill-level: Introductory (Masters-level Mathematics)

We recommend you have:

• Strong foundational mathematics skills in Linear Algebra and Probability, to start learning about and using basic machine learning algorithms and concepts
• An interest in, and understanding of, how these skills and technologies will be used in your organization
• Basic Python Skills. Attendees without Python background may view labs as follow along exercises or team with others to complete them.
• Basic Linux skills, including familiarity with common command-line commands such as ls, cd, cp, and su

Lesson: Calculus and Linear Algebra Review

• Essential Calculus
• Geometry and Trigonometry
• Derivatives and Slope
• Partial Derivatives
• Power Rule
• Chain Rule
• Linear Algebra and Matrix Math
• Eigenvalues and Eigenvectors
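
As a rough illustration of how the power rule and chain rule from this lesson combine, the sketch below differentiates a sample function analytically and compares the result with a numerical slope estimate. The function `f`, the evaluation point, and the step size `h` are assumptions chosen for the example, not material from the course labs.

```python
# Hypothetical example function: f(x) = (3x^2 + 1)^4

def f(x: float) -> float:
    return (3 * x**2 + 1) ** 4


def f_prime(x: float) -> float:
    """Analytic derivative using the chain rule and the power rule.

    Outer function: u^4      -> derivative 4u^3   (power rule)
    Inner function: 3x^2 + 1 -> derivative 6x     (power rule)
    Chain rule: multiply the two derivatives together.
    """
    u = 3 * x**2 + 1
    return 4 * u**3 * 6 * x


def numeric_derivative(func, x: float, h: float = 1e-5) -> float:
    """Central-difference approximation of the slope of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


analytic = f_prime(1.0)
approx = numeric_derivative(f, 1.0)
print(f"analytic slope: {analytic}, numeric estimate: {approx:.4f}")
```

The two values should agree closely, which is a quick way to sanity-check a hand-derived gradient.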

Lesson: Statistics Review

• Essential Statistics
• Mean, Median, Variance, and Standard Deviation
• Normal / Gaussian Distribution
• Statistics in Machine Learning
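
The summary statistics named in this lesson can be computed with Python's standard `statistics` module. The sketch below uses a small made-up sample and the population forms of variance and standard deviation; the data values are illustrative assumptions only.

```python
import statistics

# Hypothetical sample of eight observations
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(sample)           # arithmetic average
median = statistics.median(sample)       # middle value of the sorted data
variance = statistics.pvariance(sample)  # population variance
std_dev = statistics.pstdev(sample)      # population standard deviation

print(f"mean={mean}, median={median}, variance={variance}, std_dev={std_dev}")
```

For a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` apply the n - 1 correction instead.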

Lesson: Probability Review

• Probability Theory
• Discrete Probability Distributions
• Continuous Probability Distributions
• Measure-Theoretic Probability Theory
• Central Limit Theorem and the Normal Distribution
• Markov Chains
• Probability Density Function
• Probability in Machine Learning
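
To make the Markov chain topic concrete, here is a minimal sketch of a two-state chain whose state distribution is advanced one step at a time until it settles near its stationary distribution. The transition matrix `P`, the starting distribution, and the iteration count are hypothetical values picked for the illustration.

```python
def step(dist, P):
    """Advance a probability distribution one step through transition matrix P.

    P[i][j] is the probability of moving from state i to state j.
    """
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


# Hypothetical two-state chain; each row sums to 1
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]

dist = [1.0, 0.0]  # start with certainty in state 0
for _ in range(100):
    dist = step(dist, P)

print(dist)
```

For this matrix the distribution approaches (5/6, 1/6), which solves the balance condition 0.1 * p0 = 0.5 * p1 with p0 + p1 = 1.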

Session: Machine Learning and Algorithms

Lesson: Big Data and NoSQL

• The Big Data Era and the Internet of Things
• NoSQL Data Systems
• Data Transformation Pipelines

Lesson: Supervised Learning

• Supervised Learning Explained
• Classification vs. Regression
• Examples of Supervised Learning
• Key supervised algorithms (overview)

Lesson: Unsupervised Learning

• Unsupervised Learning
• Clustering
• Examples of Unsupervised Learning
• Key unsupervised algorithms (overview)

Lesson: Regression Algorithms

• Linear Regression
• Logistic Regression
• Generalized Linear Modeling
• Support Vector Regression
• Decision Trees
• Random Forests
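
As one possible illustration of the first topic in this lesson, the sketch below fits a simple linear regression with the closed-form least-squares formulas. The helper name `fit_line` and the data points are assumptions made for the example; a library such as scikit-learn would normally be used in practice.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope}, intercept={intercept}")
```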

Lesson: Classification Algorithms

• Bayes' Theorem and the Naïve Bayes classifier
• Support Vector Machines
• Discriminant Analysis
• k-Nearest Neighbor (KNN)
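
Bayes' Theorem, the basis of the Naïve Bayes classifier, can be sketched with a single-feature spam example. All probabilities below are invented for the illustration and do not come from any real dataset.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(class | evidence) via Bayes' Theorem for a two-class problem."""
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers:
p_spam = 0.2             # P(spam)
p_word_given_spam = 0.6  # P(word appears | spam)
p_word_given_ham = 0.05  # P(word appears | not spam)

p_spam_given_word = posterior(p_spam, p_word_given_spam, p_word_given_ham)
print(f"P(spam | word) = {p_spam_given_word:.2f}")
```

A Naïve Bayes classifier extends this by multiplying the likelihoods of several features, under the assumption that they are conditionally independent given the class.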

Lesson: Clustering Algorithms

• k-Means Clustering
• Fuzzy Clustering
• Gaussian Mixture Models
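
The assign-then-update loop behind k-Means can be sketched in one dimension as follows. The function name `kmeans_1d`, the points, and the starting centroids are hypothetical; real implementations handle multiple dimensions, smarter initialization, and convergence checks.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Basic k-Means on 1-D data: assign points, then recompute centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data with two obvious groups
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 6.0])
print(centroids, clusters)
```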

Lesson: Neural Networks

• Neural Network Basics
• Hidden Markov Models (HMM)
• Recurrent Neural Networks (RNN)
• Long Short-Term Memory Networks (LSTM)

Lesson: Dimensionality Reduction

• Dimensionality Reduction Goals and Techniques
• Recap: Eigenvalues and Eigenvectors
• Singular Value Decomposition (SVD)
• Principal Component Analysis (PCA)
• Linear Discriminant Analysis (LDA)
• Generalized Discriminant Analysis (GDA)
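
To connect the eigenvalue recap with PCA, the sketch below finds the principal axis of a 2-D dataset from the eigen-decomposition of its 2x2 covariance matrix, using the closed-form solution for a symmetric matrix. The function name `pca_2d` and the data are assumptions for illustration; higher-dimensional data would normally go through an SVD routine in a numerical library.

```python
import math


def pca_2d(points):
    """Return covariance eigenvalues (largest first) and the principal direction."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the population covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    eigenvalues = (half_trace + radius, half_trace - radius)
    # Angle of the eigenvector belonging to the larger eigenvalue
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    return eigenvalues, direction


# Hypothetical data lying exactly on the line y = x
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
(lam1, lam2), direction = pca_2d(points)
explained = lam1 / (lam1 + lam2)
print(f"principal direction={direction}, explained variance ratio={explained:.2f}")
```

Because these points are perfectly correlated, the first component captures all of the variance, so projecting onto it reduces the data from two dimensions to one without loss.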

Session: Best Practices and the Real World

Lesson: Choosing Algorithms

• Choosing between Supervised and Unsupervised algorithms
• Choosing between Classification Algorithms
• Choosing between Regressions
• Choosing Neural Networks
• Choosing Activation Functions
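
For reference when comparing activation functions, three commonly used ones can be written in a few lines. This selection is an assumption for illustration; the outline does not say which functions the lesson covers.

```python
import math


def sigmoid(z: float) -> float:
    """Squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z: float) -> float:
    """Squashes any real input into the range (-1, 1), centered at zero."""
    return math.tanh(z)


def relu(z: float) -> float:
    """Passes positive inputs through unchanged and clips negatives to zero."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(f"z={z}: sigmoid={sigmoid(z):.3f}, tanh={tanh(z):.3f}, relu={relu(z)}")
```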

Lesson: Ensemble Methods

• Ensemble Theory and Methods
• Ensemble Classifiers
• Bucket of Models
• Boosting
• Stacking
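
The simplest ensemble classifier is a majority vote over several base models. In the sketch below, the three threshold rules stand in for trained classifiers and are purely hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted most often by the base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Three hypothetical weak classifiers with different decision thresholds
def clf_a(x):
    return "positive" if x > 0 else "negative"


def clf_b(x):
    return "positive" if x > 1 else "negative"


def clf_c(x):
    return "positive" if x > -1 else "negative"


def ensemble_predict(x):
    """Combine the three base predictions with a majority vote."""
    return majority_vote([clf(x) for clf in (clf_a, clf_b, clf_c)])


print(ensemble_predict(0.5), ensemble_predict(-0.5))
```

Using an odd number of base classifiers on a two-class problem avoids tied votes.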

Lesson: In the Real World

• Machine Learning in Python: NumPy, pandas, scikit-learn, and Matplotlib; NLTK, Keras
• Machine Learning in R
• Machine Learning in Java
• Machine Learning with Apache MADlib