An implementation of the Variational Autoencoder based on Auto-Encoding Variational Bayes (Kingma and Welling, 2013).
The VAE was trained on, and performs inference over, the binarized MNIST (handwritten digits) dataset.
For an i.i.d. dataset with continuous latent variables per data point, the Variational Bayes algorithm optimizes a recognition network (encoder) that performs approximate posterior inference using ancestral sampling.
The latent variable z is sampled from the prior distribution over z using the true parameters theta*.
The likelihood of the data x (i.e., all 784 pixels of an image) comes from a conditional distribution on z with the true parameters theta*. Here that distribution is a product of independent Bernoullis whose means are output by the generator network (decoder), parameterized by theta.
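As a rough sketch of these two networks (assuming a PyTorch implementation with a 2-D latent and a single hidden layer of 500 tanh units; the actual architecture in this repo may differ):

```python
import torch
import torch.nn as nn

LATENT_DIM = 2    # assumed 2-D latent, matching the isocontour plots below
HIDDEN_DIM = 500  # assumed hidden width
DATA_DIM = 784    # 28x28 binarized MNIST pixels

class Encoder(nn.Module):
    """Recognition network q_phi(z|x): outputs mean and log-variance of a diagonal Gaussian."""
    def __init__(self):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(DATA_DIM, HIDDEN_DIM), nn.Tanh())
        self.mu = nn.Linear(HIDDEN_DIM, LATENT_DIM)
        self.logvar = nn.Linear(HIDDEN_DIM, LATENT_DIM)

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    """Generator network p_theta(x|z): outputs Bernoulli logits for all 784 pixels."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, HIDDEN_DIM), nn.Tanh(),
            nn.Linear(HIDDEN_DIM, DATA_DIM),  # pixel logits; sigmoid gives Bernoulli means
        )

    def forward(self, z):
        return self.net(z)
```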
Computed using the closed-form expression for the KL divergence between two Gaussians.
Acting like a regularization term, this computes the expected log ratio of the approximate posterior to the prior.
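For a diagonal Gaussian posterior q_phi(z|x) = N(mu, diag(sigma^2)) and a standard normal prior, this term has the familiar closed form; a minimal sketch, continuing the hypothetical PyTorch setup above:

```python
def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions.

    D_KL = -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    """
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
```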
The negative log-likelihood of the data points (the reconstruction term).
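Putting the two terms together, a single-sample Monte Carlo estimate of the negative ELBO could look like the following; the reparameterization trick z = mu + sigma * eps keeps the sampling step differentiable. The names continue the sketch above and are not the repo's actual API:

```python
import torch.nn.functional as F

def negative_elbo(x, encoder, decoder):
    """Single-sample estimate of the negative ELBO for a batch of binarized images x."""
    mu, logvar = encoder(x)
    eps = torch.randn_like(mu)              # eps ~ N(0, I)
    z = mu + torch.exp(0.5 * logvar) * eps  # reparameterized sample from q_phi(z|x)
    logits = decoder(z)
    # Bernoulli negative log-likelihood of the pixels (reconstruction term)
    nll = F.binary_cross_entropy_with_logits(logits, x, reduction="none").sum(dim=-1)
    return (nll + kl_to_standard_normal(mu, logvar)).mean()
```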
After training, let's explore the properties of our trained approximate posterior.
Sampling a latent z from the prior, then using the decoder (generative model) to compute the Bernoulli means over the image's pixels given z:
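A hypothetical helper for this step (z ~ N(0, I), decoded to per-pixel Bernoulli means):

```python
def sample_from_prior(decoder, n=10):
    """Draw z ~ N(0, I) and return the decoder's Bernoulli means as 28x28 images."""
    z = torch.randn(n, LATENT_DIM)
    with torch.no_grad():
        means = torch.sigmoid(decoder(z))
    return means.view(n, 28, 28)
```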
Inferred latent posterior means from the encoder (recognition model):
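In the sketched setup, the posterior means are simply the mu head of the encoder, e.g. for a 2-D scatter plot of the training set:

```python
def posterior_means(encoder, x):
    """Return the mean of q_phi(z|x) for each image in the batch."""
    with torch.no_grad():
        mu, _ = encoder(x.view(-1, DATA_DIM))
    return mu
```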
Generated samples from latent representations interpolated between the posterior means of two different training examples:
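One possible interpolation helper (linear interpolation in latent space between the two posterior means, decoded at each step):

```python
def interpolate(encoder, decoder, x_a, x_b, steps=10):
    """Interpolate between the posterior means of two images and decode each point."""
    with torch.no_grad():
        mu_a, _ = encoder(x_a.view(1, -1))
        mu_b, _ = encoder(x_b.view(1, -1))
        alphas = torch.linspace(0.0, 1.0, steps).unsqueeze(1)
        z = (1 - alphas) * mu_a + alphas * mu_b   # (steps, LATENT_DIM)
        means = torch.sigmoid(decoder(z))
    return means.view(steps, 28, 28)
```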
Isocontours of the joint distribution over the latent z (note: 2D in this case, but it can be higher-dimensional for more expressivity) and the top half of a training image x:
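Because the latent is 2-D, log p(z, x_top) = log p(z) + log p(x_top | z) can be evaluated on a grid and handed to a contour plotter. A sketch, assuming the top half is the first 392 pixels in row-major order:

```python
import math

def log_joint_top_half(decoder, x_top, grid_lim=4.0, n_grid=100):
    """Evaluate log p(z) + log p(x_top | z) on a 2-D grid of z values for contour plotting.

    x_top: float tensor of the observed top 392 binarized pixels.
    """
    zs = torch.linspace(-grid_lim, grid_lim, n_grid)
    z1, z2 = torch.meshgrid(zs, zs, indexing="ij")
    z = torch.stack([z1.reshape(-1), z2.reshape(-1)], dim=1)
    with torch.no_grad():
        logits = decoder(z)[:, :392]  # Bernoulli logits for the top-half pixels only
        log_lik = -F.binary_cross_entropy_with_logits(
            logits, x_top.expand(z.shape[0], -1), reduction="none"
        ).sum(dim=-1)
        log_prior = -0.5 * (z.pow(2).sum(dim=-1) + LATENT_DIM * math.log(2 * math.pi))
    return (log_prior + log_lik).view(n_grid, n_grid)
```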
Sample a latent z, feed it into our probabilistic decoder, and infer the Bernoulli means of all pixels in the bottom half of the image:
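For example, stitching the observed top half together with the decoder's Bernoulli means for the bottom half (again using the hypothetical helpers sketched above):

```python
def infer_bottom_half(decoder, z, x_top):
    """Decode latent z and combine the observed top half with the inferred bottom-half means."""
    with torch.no_grad():
        means = torch.sigmoid(decoder(z.view(1, LATENT_DIM)))
    bottom = means[0, 392:]  # Bernoulli means for the bottom 392 pixels
    return torch.cat([x_top, bottom]).view(28, 28)
```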