Exercise - Univariate Gaussian Likelihood

Introduction

Given a sample from a Gaussian distribution, we already know the equations to calculate the best estimate for the expected value $\hat\mu$ and the variance $\hat\sigma_N^2$ (or $\sigma_{N-1}^2$). In this notebook you will use the likelihood function, on which the maximum likelihood estimator (MLE) is based, to numerically evaluate how likely different values of the mean and the variance are, given a sample.

Remark: To detect errors in your own code, execute the notebook cells containing assert or assert_almost_equal. These statements raise exceptions as long as the calculated result is not yet correct.

Requirements

Knowledge

To complete this exercise notebook, you should possess knowledge about the following topics.

  • Univariate Gaussian
  • Maximum Likelihood

The following material can help you to acquire this knowledge:

  • Gaussian, variance, mean:
      • Chapter 3 of the Deep Learning Book [GOO16]
      • Chapter 1 of the book Pattern Recognition and Machine Learning by Christopher M. Bishop [BIS07]
  • Univariate Gaussian:
      • Video1 and the following videos from Khan Academy [KHA18a]
  • Sample variance:
      • Video2 and the following videos from Khan Academy [KHA18b]
  • Read Chapter 24.1 of David MacKay's book [MAC03] (highly recommended if you want to dive deeper!)

Python Modules

# External Modules
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt


%matplotlib inline

Exercises

Exercise - Maximum Likelihood

Advice: Read Chapter 24.1 of David MacKay's book [MAC03].

The equation to calculate $\hat\sigma_N^2$ (but also $\hat\mu$) can be derived with the maximum likelihood estimator for the Gaussian, though this will not be the task here. Instead, we will use the likelihood function to visualize the most likely values for $\hat\sigma_N^2$ and $\hat\mu$ in a contour plot.

Recap - Univariate Gaussian

The equation for the PDF of a Gaussian is:

$$P(x\mid\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

with:

  • the mean $\mu$
  • the standard deviation $\sigma$
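
As a quick sanity check, the PDF formula above can be typed in directly and compared against scipy.stats.norm.pdf. This is only a minimal sketch: the helper name gaussian_pdf and the chosen evaluation points are assumptions made for illustration and are not part of the exercise.

```python
import numpy as np
import scipy.stats


def gaussian_pdf(x, mu, sigma):
    """Evaluate the univariate Gaussian PDF exactly as written in the recap."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)


# Arbitrary evaluation points around the mean used later in the notebook
x_values = np.linspace(-10.0, 7.0, 50)
manual = gaussian_pdf(x_values, mu=-1.5, sigma=3.0)
reference = scipy.stats.norm.pdf(x_values, loc=-1.5, scale=3.0)

# Raises an AssertionError if the two disagree
np.testing.assert_almost_equal(manual, reference)
```

Note that scipy.stats.norm.pdf expects the standard deviation (scale), not the variance.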

Recap - Maximum Likelihood

Maximum likelihood estimation is a method that finds a point estimate for the parameters $\theta$ of a known function $p({\bf x}\mid\theta)$, given an observation ${\bf x}$. For $N$ independent observations $x_i$, the likelihood function is:

$$L(\theta) = \prod_{i=1}^N p(x_i\mid\theta)$$

The parameters $\theta$ that maximize this function are the most likely ones.
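
To make the definition concrete, the following sketch evaluates the likelihood as a plain product of PDF values for scalar parameters. The function name likelihood_naive and the five-element sample are made up for illustration. It shows that the likelihood at the sample mean is larger than at a shifted mean.

```python
import numpy as np
import scipy.stats


def likelihood_naive(sample, mu, sigma):
    """Product of the Gaussian PDF over all observations (scalar mu and sigma)."""
    return np.prod(scipy.stats.norm.pdf(sample, loc=mu, scale=sigma))


sample = np.array([-2.0, 0.5, -4.0, -1.0, 1.5])  # made-up observation
mu_hat = sample.mean()
sigma_hat = sample.std()  # ddof=0, i.e. the square root of sigma_N^2

l_at_estimate = likelihood_naive(sample, mu_hat, sigma_hat)
l_elsewhere = likelihood_naive(sample, mu_hat + 1.0, sigma_hat)
print(l_at_estimate > l_elsewhere)  # True
```

The product of many small numbers shrinks quickly, which is acceptable for the small sample sizes used in this notebook.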

Task (pen & paper):

  1. Write down the likelihood function for the normal distribution.
  2. Show that the likelihood for the normal distribution can be written as:
$$L(\mu, \sigma^2) = \prod_{i=1}^N p(x_i\mid\mu,\sigma^2) = \ldots = \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^N \exp\left(-\frac{N(\mu- \hat\mu)^2 + S}{2\sigma^2}\right)$$

with:

  • the empirical mean $\hat\mu$ of the observation ${\bf x}$
  • $S = \sum_{i=1}^{N}(x_i - \hat\mu)^2$

The rewritten equation can be implemented far more efficiently than the original equation from Task 1, because the sample enters only through $\hat\mu$ and $S$, which have to be computed just once.

Hint:

For Task 2: Make use of the equation for $\hat\sigma_N^2 = \frac{1}{N} \sum_{i=1}^N \left( x_i - \hat{\mu} \right)^2$, which implies $S = N\hat\sigma_N^2$.
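
The rewritten likelihood in Task 2 rests on splitting the sum of squared deviations from $\mu$ into a part that depends on $\mu$ and the constant $S$. The sketch below checks this split numerically for a few arbitrary values of $\mu$. It is a sanity check on a randomly drawn sample (seed and values are assumptions), not a substitute for the pen & paper derivation.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
sample = rng.normal(loc=-1.5, scale=3.0, size=10)

N = sample.size
mu_hat = sample.mean()
S = np.sum((sample - mu_hat) ** 2)

for mu in (-3.0, 0.0, 2.5):
    lhs = np.sum((sample - mu) ** 2)      # exponent term before rewriting
    rhs = N * (mu - mu_hat) ** 2 + S      # exponent term after rewriting
    np.testing.assert_almost_equal(lhs, rhs)

# S is N times the (biased) sample variance, as used in the hint
np.testing.assert_almost_equal(S, N * sample.var())
```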

Task:

Implement the function from Task 2 above to calculate the likelihood for concrete values of $\mu$ and $\sigma$ given ${\bf x}$. Use the predefined 2D arrays generated with np.meshgrid in order to avoid loops in your function.

# True parameters of the Gaussian that generates the sample
mu = -1.5
sigma = 3
sigma_square = sigma**2
size = 10

def plot_gaussian_pdf(mu, sigma):
    # Plot the PDF within four standard deviations around the mean
    x = np.linspace(mu - 4*sigma, mu + 4*sigma, 100)
    plt.plot(x, scipy.stats.norm.pdf(x, mu, sigma))

plot_gaussian_pdf(mu, sigma)

def get_data(mu, sigma_square, size):
    # Draw `size` values from a Gaussian with the given mean and variance
    sigma = np.sqrt(sigma_square)
    x = np.random.normal(loc=mu, scale=sigma, size=size)
    return x

x = get_data(mu, sigma_square, size)
x

# Grid of candidate values for mean and sigma, centered on the sample estimates
mean_ = x.mean()
sigma_ = np.sqrt(np.var(x, ddof=1))
xlist = np.linspace(mean_-1, mean_+1, 100)
ylist = np.linspace(sigma_-1, sigma_+1., 100)
X, Y = np.meshgrid(xlist, ylist)

def likelihood_gaussian(x, mu, sigma):
    """ Calculates the likelihood for a univariate Gaussian
    
    :x: sample as 1D numpy-array (float)
    :mu: values for the mean as 2D numpy-array (float) with the shape (n,n)
        [[m1,m2,...,mn], [m1,m2,...,mn], ..., [m1,m2,...,mn]]
    :sigma: values for sigma as 2D numpy-array (float) with the shape (n,n)
        [[s1,s1,...,s1], [s2,s2,...,s2], ..., [sn,sn,...,sn]]
    
    Note: the plotting cells further below also pass 1D arrays and scalars
    for mu and sigma, so the implementation should rely on broadcasting.
    
    :returns: likelihood values as numpy-array (float), one per (mu, sigma) pair
    """  
    raise NotImplementedError

Z = likelihood_gaussian(x, X, Y)
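
One way to check a vectorized implementation is to compare it against a slow but obvious reference on a small grid. The sketch below is such a reference. It evaluates the product of PDFs from Task 1 one grid point at a time. The name likelihood_reference, the small sample, and the grid are assumptions for illustration, and the function is deliberately not the efficient form asked for in the task.

```python
import numpy as np
import scipy.stats


def likelihood_reference(x, mu, sigma):
    """Slow reference: product of PDFs, evaluated separately per grid point."""
    mu, sigma = np.broadcast_arrays(np.asarray(mu, dtype=float),
                                    np.asarray(sigma, dtype=float))
    out = np.empty(mu.shape)
    for idx in np.ndindex(mu.shape):
        out[idx] = np.prod(scipy.stats.norm.pdf(x, loc=mu[idx], scale=sigma[idx]))
    return out


sample = np.array([-2.0, 0.5, -4.0, -1.0, 1.5])  # made-up observation
mu_grid, sigma_grid = np.meshgrid(np.linspace(-2.0, 0.0, 5),
                                  np.linspace(2.0, 4.0, 5))
Z_ref = likelihood_reference(sample, mu_grid, sigma_grid)

# Once likelihood_gaussian is implemented, the two should agree:
# np.testing.assert_almost_equal(likelihood_gaussian(sample, mu_grid, sigma_grid), Z_ref)
```

The explicit loop makes this reference unsuitable for the 100 x 100 grid and the repeated calls in the plotting cells, which is exactly why the task asks for the rewritten, loop-free form.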

Plots

With the use of the function we can now plot:

  • The likelihood of the Gaussian for different $\mu$ and $\sigma$, given ${\bf x}$.
  • The posterior probability of $\mu$ for different fixed values of $\sigma$.
  • The posterior probability of $\sigma$ for different fixed values of $\mu$.

The posterior curves are obtained by normalizing the likelihood, which corresponds to assuming a flat prior.

If your implementation of likelihood_gaussian is correct, the plots should look similar to these:

plt.figure()
cp = plt.contour(X, Y, Z)

plt.title('Likelihood')
plt.xlabel('mean')
plt.ylabel('sigma')
plt.show()

print("For comparison:")
print("calculated mean:\t", x.mean())
print("calculated sigma:\t", np.sqrt(x.var()))

n_values = 1000
mu = np.linspace(-5,5,n_values)

# Evaluate the likelihood over mu for each fixed sigma and normalize it to sum to one
sigma_2 = likelihood_gaussian(x, mu, sigma=2)
sigma_2 = sigma_2 / sigma_2.sum()
sigma_2_5 = likelihood_gaussian(x, mu, sigma=2.5)
sigma_2_5 = sigma_2_5 / sigma_2_5.sum()
sigma_3 = likelihood_gaussian(x, mu, sigma=3.)
sigma_3 = sigma_3 / sigma_3.sum()

# Approximate inverse grid spacing (the mu grid spans a range of 10)
scaling_factor = n_values/10
plt.plot(mu, sigma_2*scaling_factor, 'b-', label=r"$\sigma=2$")
plt.plot(mu, sigma_2_5*scaling_factor, 'g-', label=r"$\sigma=2.5$")
plt.plot(mu, sigma_3*scaling_factor, 'r-', label=r"$\sigma=3$")
plt.xlabel(r'$\mu$')
plt.ylabel(r'$p(\mu\mid x,\sigma)$')
plt.legend()

sigma = np.linspace(0.1,7,n_values)

# Evaluate the likelihood over sigma for each fixed mu and normalize it to sum to one
mu_m1 = likelihood_gaussian(x, mu=-1.0, sigma=sigma)
mu_m1 = mu_m1 / mu_m1.sum()
mu_0 = likelihood_gaussian(x, mu=0.0, sigma=sigma)
mu_0 = mu_0 / mu_0.sum()
mu_1 = likelihood_gaussian(x, mu=1.0, sigma=sigma)
mu_1 = mu_1 / mu_1.sum()

# The sigma grid spans a range of 6.9, not 10, so it needs its own inverse grid spacing
scaling_factor_sigma = n_values/(sigma[-1] - sigma[0])
plt.plot(sigma, mu_m1*scaling_factor_sigma, 'b-', label=r"$\mu=-1.0$")
plt.plot(sigma, mu_0*scaling_factor_sigma, 'g-', label=r"$\mu=0.0$")
plt.plot(sigma, mu_1*scaling_factor_sigma, 'r-', label=r"$\mu=1.0$")
plt.xlabel(r'$\sigma$')
plt.ylabel(r'$p(\sigma\mid x,\mu)$')
plt.legend()
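
The plotting cells above divide each likelihood curve by its sum and then multiply by a scaling factor. The idea behind this, assuming a uniform grid and a flat prior, is to turn grid values into an approximate density that integrates to one. A minimal sketch follows. The helper normalize_on_grid and the stand-in curve are hypothetical and only illustrate the rescaling.

```python
import numpy as np


def normalize_on_grid(values, grid):
    """Rescale non-negative values on a uniform grid so they integrate to about 1."""
    dx = grid[1] - grid[0]  # uniform grid spacing
    return values / (values.sum() * dx)


grid = np.linspace(0.1, 7.0, 1000)
# Stand-in curve for illustration only, not a real likelihood
unnormalized = np.exp(-0.5 * ((grid - 3.0) / 0.5) ** 2)
density = normalize_on_grid(unnormalized, grid)

# Riemann-sum approximation of the integral over the grid
integral = density.sum() * (grid[1] - grid[0])
np.testing.assert_almost_equal(integral, 1.0)
```

Dividing by the sum alone gives values that depend on the number of grid points, whereas dividing by the sum times the grid spacing gives values that stay comparable when the grid resolution changes.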

Literature

Licenses

Notebook License (CC-BY-SA 4.0)

The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).

Exercise - Multivariate Gaussian
by Christian Herta, Klaus Strohmenger
is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://gitlab.com/deep.TEACHING.

Code License (MIT)

The following license only applies to code cells of the notebook.

Copyright 2018 Christian Herta, Klaus Strohmenger

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.