Statistical Machine Learning

This repository contains all assignments and personal notes for the course "Statistical Machine Learning" (NWI-IMC056) given at Radboud University.

The chapters and exercises are based on Christopher M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics), Springer-Verlag, Berlin/Heidelberg, 2006, which is used as the textbook throughout the course.

The "notebooks" folder contains Jupyter notebooks with interactive examples and additional details based on the discussed chapters.

Chapters covered in this course

Chapter 1 - Introduction

1.1 - Example: Polynomial Curve Fitting

1.2 - Probability Theory

  • 1.2.1 - Probability densities
  • 1.2.2 - Expectations and covariances
  • 1.2.3 - Bayesian probabilities
  • 1.2.4 - The Gaussian distribution
  • 1.2.5 - Curve fitting re-visited
  • 1.2.6 - Bayesian curve fitting

1.3 - Model Selection

1.4 - The Curse of Dimensionality

1.5 - Decision Theory

  • 1.5.1 - Minimizing the misclassification rate
  • 1.5.2 - Minimizing the expected loss
  • 1.5.3 - The reject option
  • 1.5.4 - Inference and decision
  • 1.5.5 - Loss functions for regression

1.6 - Information Theory

  • 1.6.1 - Relative entropy and mutual information
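
To give a flavour of the running example in Section 1.1, here is a minimal polynomial curve fitting sketch (my own illustration, not taken from the assignments; the degree and noise level are arbitrary):

```python
import numpy as np

# Noisy samples of sin(2*pi*x), the running example of Section 1.1.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)

# Least-squares fit of a degree-3 polynomial: np.polyfit minimises the
# sum-of-squares error between the polynomial y(x, w) and the targets t.
w = np.polyfit(x, t, deg=3)
print("fitted coefficients:", w)
print("prediction at x = 0.5:", np.polyval(w, 0.5))
```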

Chapter 2 - Probability Distributions

2.1 - Binary Variables

  • 2.1.1 - The beta distribution

2.2 - Multinomial Variables

  • 2.2.1 - The Dirichlet distribution

2.3 - The Gaussian Distribution

  • 2.3.1 - Conditional Gaussian distributions
  • 2.3.2 - Marginal Gaussian distributions
  • 2.3.3 - Bayes' theorem for Gaussian variables
  • 2.3.4 - Maximum likelihood for the Gaussian
  • 2.3.5 - Sequential estimation
  • 2.3.6 - Bayesian inference for the Gaussian
  • 2.3.7 - Student's t-distribution
  • 2.3.8 - Periodic variables
  • 2.3.9 - Mixtures of Gaussians

2.4 - The Exponential Family

  • 2.4.1 - Maximum likelihood and sufficient statistics
  • 2.4.2 - Conjugate priors
  • 2.4.3 - Noninformative priors

2.5 - Nonparametric Methods

  • 2.5.1 - Kernel density estimators
  • 2.5.2 - Nearest-neighbour methods
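
As a quick illustration of the conjugacy result in Section 2.1.1, a sketch of the Beta-Bernoulli posterior update (the observations and hyperparameters below are made up):

```python
import numpy as np

# Made-up Bernoulli observations (e.g. coin flips) and a Beta(a, b) prior on mu.
data = np.array([1, 0, 1, 1, 0, 1, 1, 1])
a, b = 2.0, 2.0  # arbitrary prior hyperparameters

# Conjugacy: observing m ones and l zeros turns the Beta(a, b)
# prior into a Beta(a + m, b + l) posterior.
m = data.sum()
l = len(data) - m
a_post, b_post = a + m, b + l

# The posterior mean of mu is a_post / (a_post + b_post).
print(f"posterior Beta({a_post}, {b_post}), mean {a_post / (a_post + b_post):.3f}")
```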

Chapter 3 - Linear Models for Regression

3.1 - Linear Basis Function Models

  • 3.1.1 - Maximum likelihood and least squares
  • 3.1.2 - Geometry of least squares
  • 3.1.3 - Sequential learning
  • 3.1.4 - Regularized least squares
  • 3.1.5 - Multiple outputs

3.2 - The Bias-Variance Decomposition

3.3 - Bayesian Linear Regression

  • 3.3.1 - Parameter distribution
  • 3.3.2 - Predictive distribution
  • 3.3.3 - Equivalent kernel

3.4 - Bayesian Model Comparison

3.5 - The Evidence Approximation

  • 3.5.1 - Evaluation of the evidence function
  • 3.5.2 - Maximizing the evidence function
  • 3.5.3 - Effective number of parameters

3.6 - Limitations of Fixed Basis Functions
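
For Section 3.3.1, a sketch of the closed-form posterior over the weights in Bayesian linear regression with a polynomial basis (the precisions alpha and beta are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

# Polynomial basis functions phi_j(x) = x^j (any fixed basis would do).
Phi = np.vander(x, 4, increasing=True)

alpha, beta = 2.0, 25.0  # arbitrary prior precision and noise precision

# Posterior over the weights (Section 3.3.1):
#   S_N^{-1} = alpha * I + beta * Phi^T Phi,   m_N = beta * S_N Phi^T t
S_N = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ t
print("posterior mean of the weights:", m_N)
```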

Chapter 4 - Linear Models for Classification

4.1 - Discriminant Functions

  • 4.1.1 - Two classes
  • 4.1.2 - Multiple classes
  • 4.1.3 - Least squares for classification
  • 4.1.4 - Fisher's linear discriminant
  • 4.1.5 - Relation to least squares
  • 4.1.6 - Fisher's discriminant for multiple classes
  • 4.1.7 - The perceptron algorithm

4.2 - Probabilistic Generative Models

  • 4.2.1 - Continuous inputs
  • 4.2.2 - Maximum likelihood solution
  • 4.2.3 - Discrete features
  • 4.2.4 - Exponential family

4.3 - Probabilistic Discriminative Models

  • 4.3.1 - Fixed basis functions
  • 4.3.2 - Logistic regression
  • 4.3.3 - Iterative reweighted least squares
  • 4.3.4 - Multiclass logistic regression
  • 4.3.5 - Probit regression
  • 4.3.6 - Canonical link functions

4.4 - The Laplace Approximation

  • 4.4.1 - Model comparison and BIC

4.5 - Bayesian Logistic Regression

  • 4.5.1 - Laplace approximation
  • 4.5.2 - Predictive distribution
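
For Section 4.3.3, a sketch of logistic regression trained with iterative reweighted least squares (IRLS) on synthetic two-class data (class locations and iteration count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two overlapping Gaussian classes in 2-D; Phi includes a bias column.
X = np.vstack([rng.normal(-1.0, 1.0, (50, 2)), rng.normal(1.0, 1.0, (50, 2))])
t = np.hstack([np.zeros(50), np.ones(50)])
Phi = np.hstack([np.ones((100, 1)), X])

# IRLS: Newton-Raphson updates w <- w - (Phi^T R Phi)^{-1} Phi^T (y - t),
# where R = diag(y_n (1 - y_n)) and y_n = sigmoid(w^T phi_n).
w = np.zeros(Phi.shape[1])
for _ in range(10):
    y = 1.0 / (1.0 + np.exp(-Phi @ w))   # logistic sigmoid
    R = np.diag(y * (1.0 - y))
    w = w - np.linalg.solve(Phi.T @ R @ Phi, Phi.T @ (y - t))
print("IRLS weights (bias first):", w)
```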

Chapter 5 - Neural Networks

5.1 - Feed-forward Network Functions

  • 5.1.1 - Weight-space symmetries

5.2 - Network Training

  • 5.2.1 - Parameter optimization
  • 5.2.2 - Local quadratic approximation
  • 5.2.3 - Use of gradient information
  • 5.2.4 - Gradient descent optimization

5.3 - Error Backpropagation

  • 5.3.1 - Evaluation of error-function derivatives
  • 5.3.2 - A simple example
  • 5.3.3 - Efficiency of backpropagation
  • 5.3.4 - The Jacobian matrix

5.4 - The Hessian Matrix

  • 5.4.1 - Diagonal approximation
  • 5.4.2 - Outer product approximation
  • 5.4.3 - Inverse Hessian
  • 5.4.4 - Finite differences
  • 5.4.5 - Exact evaluation of the Hessian
  • 5.4.6 - Fast multiplication by the Hessian

5.5 - Regularization in Neural Networks

  • 5.5.1 - Consistent Gaussian priors
  • 5.5.2 - Early stopping
  • 5.5.3 - Invariances
  • 5.5.4 - Tangent propagation
  • 5.5.5 - Training with transformed data
  • 5.5.6 - Convolutional networks
  • 5.5.7 - Soft weight sharing

5.6 - Mixture Density Networks

5.7 - Bayesian Neural Networks

  • 5.7.1 - Posterior parameter distribution
  • 5.7.2 - Hyperparameter optimization
  • 5.7.3 - Bayesian neural networks for classification
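
For Sections 5.2-5.3, a minimal two-layer network trained by error backpropagation and plain gradient descent on the XOR problem (architecture, learning rate, and step count are arbitrary toy choices):

```python
import numpy as np

# XOR: the classic example of a problem that needs a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0.0], [1.0], [1.0], [0.0]])

rng = np.random.default_rng(1)
W1 = rng.normal(size=(3, 8))   # (2 inputs + bias) -> 8 hidden units
W2 = rng.normal(size=(9, 1))   # (8 hidden units + bias) -> 1 output

ones = np.ones((4, 1))
for _ in range(5000):
    # Forward pass (Section 5.1): tanh hidden units, sigmoid output.
    a = np.hstack([X, ones])                 # append a constant bias input
    z = np.hstack([np.tanh(a @ W1), ones])   # hidden activations + bias unit
    y = 1.0 / (1.0 + np.exp(-(z @ W2)))

    # Backpropagation (Section 5.3): with cross-entropy error and a
    # sigmoid output unit, the output delta is simply y - t.
    d2 = y - t
    d1 = (d2 @ W2[:-1].T) * (1.0 - z[:, :-1] ** 2)   # tanh' = 1 - tanh^2

    # Plain batch gradient descent (Section 5.2.4).
    W2 -= 0.5 * z.T @ d2
    W1 -= 0.5 * a.T @ d1

print("predictions after training:", y.ravel().round(3))
```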

Chapter 6 - Kernel Methods

6.1 - Dual Representations

6.2 - Constructing Kernels

6.3 - Radial Basis Function Networks

  • 6.3.1 - Nadaraya-Watson model

6.4 - Gaussian Processes

  • 6.4.1 - Linear regression revisited
  • 6.4.2 - Gaussian processes for regression
  • 6.4.3 - Learning the hyperparameters
  • 6.4.4 - Automatic relevance determination
  • 6.4.5 - Gaussian processes for classification
  • 6.4.6 - Laplace approximation
  • 6.4.7 - Connection to neural networks
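
For Section 6.4.2, a sketch of Gaussian process regression with a squared-exponential kernel, computing the standard predictive mean and variance (the length-scale and noise level are made up):

```python
import numpy as np

def rbf(a, b, ell=0.3):
    # Squared-exponential kernel k(a, b) = exp(-(a - b)^2 / (2 ell^2)).
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * ell ** 2))

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 15)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.shape)

noise = 0.1 ** 2
C = rbf(x, x) + noise * np.eye(len(x))   # covariance matrix of the targets

# GP predictive distribution at test inputs x* (Section 6.4.2):
#   mean(x*) = k^T C^{-1} t,   var(x*) = k(x*, x*) + noise - k^T C^{-1} k
xs = np.array([0.25, 0.5])
k = rbf(x, xs)                           # shape (N, N_test)
mean = k.T @ np.linalg.solve(C, t)
var = rbf(xs, xs).diagonal() + noise - np.sum(k * np.linalg.solve(C, k), axis=0)
print("predictive mean:", mean)
print("predictive variance:", var)
```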

Chapter 9 - Mixture Models and EM

9.1 - K-means Clustering

  • 9.1.1 - Image segmentation and compression

9.2 - Mixtures of Gaussians

  • 9.2.1 - Maximum likelihood
  • 9.2.2 - EM for Gaussian mixtures

9.3 - An Alternative View of EM

  • 9.3.1 - Gaussian mixtures revisited
  • 9.3.2 - Relation to K-means
  • 9.3.3 - Mixtures of Bernoulli distributions
  • 9.3.4 - EM for Bayesian linear regression

9.4 - The EM Algorithm in General
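
For Section 9.1, a minimal K-means sketch that alternates hard assignments and mean updates on two synthetic blobs (the data and the choice of K are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D blobs of 100 points each.
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(3.0, 0.5, (100, 2))])

# K-means (Section 9.1): alternate a hard assignment step (the E-like step)
# and a mean-update step (the M-like step) until the centres stop moving.
K = 2
mu = X[rng.choice(len(X), size=K, replace=False)]  # random initial centres
for _ in range(50):
    dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
    r = dists.argmin(axis=1)                       # hard responsibilities
    new_mu = np.array([X[r == k].mean(axis=0) for k in range(K)])
    if np.allclose(new_mu, mu):
        break
    mu = new_mu
print("cluster centres:\n", mu)
```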
