Exercise - Univariate Gaussian Likelihood

Introduction

Given a sample from a Gaussian distribution, we already know the equations for the best estimates of the expected value $\hat\mu$ and the variance $\hat\sigma_N^2$ (or $\hat\sigma_{N-1}^2$). In this notebook you will use maximum likelihood estimation (MLE) to numerically determine how likely different values of the mean and the standard deviation are, given a sample.

Remark: To detect errors in your own code, execute the notebook cells containing assert or assert_almost_equal. These statements raise exceptions as long as the calculated result is not yet correct.

Requirements

Knowledge

To complete this exercise notebook, you should possess knowledge about the following topics.

• Univariate Gaussian
• Maximum Likelihood

The following material can help you to acquire this knowledge:

• Gaussian, variance, mean:
  • Chapter 3 of the Deep Learning Book [GOO16]
  • Chapter 1 of the book Pattern Recognition and Machine Learning by Christopher M. Bishop [BIS07]
• Univariate Gaussian:
  • Video1 and the following videos of Khan Academy [KHA18a]
• Sample variance:
  • Video2 and the following videos of Khan Academy [KHA18b]
• Read Chapter 24.1 of David MacKay's book [MAC03] (highly recommended if you want to dive deeper!)

Python Modules

# External Modules
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt

%matplotlib inline

Exercises

Exercise - Maximum Likelihood

Advice: Read Chapter 24.1 of David MacKay's book [MAC03].

The equations for $\hat \sigma_{N}^2$ and $\hat\mu$ can be derived by maximizing the likelihood of the Gaussian, though this will not be the task here. Instead, we will evaluate the likelihood on a grid of candidate values and visualize the most likely values of $\mu$ and $\sigma$ in a contour plot.

Recap - Univariate Gaussian

The equation for the PDF of a Gaussian is:

$P(x\mid\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

with:

• the mean $\mu$
• the standard deviation $\sigma$
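
As a quick sanity check, the formula above can be evaluated directly and compared with scipy.stats.norm.pdf. This is only a minimal sketch: the helper name gaussian_pdf and the test values are chosen here for illustration and are not part of the exercise.

```python
import numpy as np
import scipy.stats


def gaussian_pdf(x, mu, sigma):
    """Evaluate the univariate Gaussian PDF from the formula above."""
    norm_const = 1.0 / np.sqrt(2.0 * np.pi * sigma**2)
    return norm_const * np.exp(-((x - mu) ** 2) / (2.0 * sigma**2))


# Illustrative values (assumptions, not prescribed by the exercise)
x_test = np.linspace(-10.0, 7.0, 50)
pdf_manual = gaussian_pdf(x_test, mu=-1.5, sigma=3.0)
pdf_scipy = scipy.stats.norm.pdf(x_test, loc=-1.5, scale=3.0)

# Both evaluations should agree up to floating-point error
np.testing.assert_allclose(pdf_manual, pdf_scipy)
```

Note that scipy.stats.norm.pdf takes the standard deviation (scale), not the variance.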

Recap - Maximum Likelihood

Maximum likelihood estimation is a method that finds a point estimate for the parameters $\theta$ of a known model $p({\bf x}\mid\theta)$, given an observation $\bf x$. For $N$ independent observations $x_i$, the likelihood function is:

$L(\theta) = \prod_{i=1}^N p(x_i\mid\theta)$

The parameters $\theta$ that maximize this function are the most likely ones, given the observation.
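
The product above can be sketched naively for scalar parameters as follows. The function names and the small sample are hypothetical and serve only as an illustration; the log-likelihood variant is included because a product of many small densities can underflow for larger samples.

```python
import numpy as np
import scipy.stats


def naive_likelihood(x, mu, sigma):
    """Likelihood of scalar (mu, sigma) as a product over all samples."""
    return np.prod(scipy.stats.norm.pdf(x, loc=mu, scale=sigma))


def naive_log_likelihood(x, mu, sigma):
    """Log-likelihood: a sum of log-densities, numerically more stable."""
    return np.sum(scipy.stats.norm.logpdf(x, loc=mu, scale=sigma))


# Small hypothetical sample, for illustration only (its mean is -1.0)
x_demo = np.array([-2.0, 0.5, -1.0, -3.5, 1.0])

# A mean close to the data yields a larger likelihood than a distant one
l_near = naive_likelihood(x_demo, mu=-1.0, sigma=2.0)
l_far = naive_likelihood(x_demo, mu=5.0, sigma=2.0)
print(l_near > l_far)
```

This version handles only one pair $(\mu, \sigma)$ at a time, so evaluating it on a grid would require loops.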

Task (pen & paper):

1. Write down the likelihood function for the normal distribution.
2. Show that the likelihood for the normal distribution can be written as:
$L(\mu, \sigma^2) = \prod_{i=1}^N p(x_i\mid\mu,\sigma^2) = \ldots = \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^N \exp\left(-\frac{N(\mu- \hat\mu)^2 + S}{2\sigma^2}\right)$

with:

• the empirical mean $\hat\mu$ of the observation $\bf x$
• $S = \sum_{i=1}^{N}(x_i - \hat\mu)^2$

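The purpose of this rewritten equation is that it can be implemented far more efficiently than the original equation from Task 1: $\hat\mu$ and $S$ are computed once from the sample and can then be reused for every candidate pair $(\mu, \sigma)$.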

Hint:

For Task 2: Make use of the equation $\hat \sigma_{N}^2 = \frac{1}{N} \sum_{i=1}^N \left( x_i - \hat{\mu} \right)^2$.
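
The sum-of-squares decomposition behind the rewritten exponent, $\sum_{i=1}^N (x_i - \mu)^2 = N(\mu - \hat\mu)^2 + S$, can be checked numerically before deriving it on paper. This is a minimal sketch; the seed, the sample, and the candidate value for $\mu$ are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
x_demo = rng.normal(loc=-1.5, scale=3.0, size=10)

n = x_demo.size
mu_hat = x_demo.mean()
s = np.sum((x_demo - mu_hat) ** 2)  # S as defined above
sigma2_hat_n = s / n                # same as np.var(x_demo) with ddof=0

mu = 0.7  # arbitrary candidate value for the mean
lhs = np.sum((x_demo - mu) ** 2)
rhs = n * (mu - mu_hat) ** 2 + s

# Both sides should agree up to floating-point error
print(np.isclose(lhs, rhs))
```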

Implement the function from Task 2 to calculate the likelihood for concrete values of $\mu$ and $\sigma$ given $\bf x$. Use the predefined 2D arrays generated with np.meshgrid to avoid loops in your function.
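
The following sketch illustrates how np.meshgrid lays out the candidate values and why elementwise expressions on these arrays need no loops. The axis values and the placeholder expression are assumptions for illustration; the expression is deliberately not the likelihood.

```python
import numpy as np

mu_axis = np.linspace(-2.0, 0.0, 4)    # 4 candidate means
sigma_axis = np.linspace(2.0, 4.0, 3)  # 3 candidate sigmas
MU, SIGMA = np.meshgrid(mu_axis, sigma_axis)

# Both grids have shape (len(sigma_axis), len(mu_axis)) = (3, 4):
# MU varies along the columns, SIGMA varies along the rows.
print(MU.shape, SIGMA.shape)

# Any elementwise expression is evaluated for every (mu, sigma) pair at once.
# The expression below is a placeholder, not the likelihood.
G = (MU - 1.0) ** 2 / SIGMA**2
print(G.shape)
```

Quantities that depend only on the sample, such as its mean, are scalars and broadcast against these 2D arrays automatically.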

mu = -1.5
sigma = 3
sigma_square = sigma**2
size = 10
def plot_gaussian_pdf(mu, sigma):
    x = np.linspace(mu - 4*sigma, mu + 4*sigma, 100)
    plt.plot(x, scipy.stats.norm.pdf(x, mu, sigma))

plot_gaussian_pdf(mu, sigma)
def get_data(mu, sigma_square, size):
    sigma = np.sqrt(sigma_square)
    x = np.random.normal(loc=mu, scale=sigma, size=size)
    return x
x = get_data(mu, sigma_square, size)
x
mean_ = x.mean()
sigma_ = np.sqrt(np.var(x,ddof=1))
xlist = np.linspace(mean_-1, mean_+1, 100)
ylist = np.linspace(sigma_-1, sigma_+1., 100)
X, Y = np.meshgrid(xlist, ylist)
def likelihood_gaussian(x, mu, sigma):
    """ Calculates the likelihood for univariate Gaussian

    :x: sample as 1D numpy-array (float)
    :mu: values for the mean as 2D numpy-array (float) with the shape (n,n)
        [[m1,m2,...,mn], [m1,m2,...,mn], ..., [m1,m2,...,mn]]
    :sigma: values for sigma as 2D numpy-array (float) with the shape (n,n)
        [[s1,s1,...,s1], [s2,s2,...,s2], ..., [sn,sn,...,sn]]

    :returns: likelihood values as 2D numpy-array (float)
    """
    raise NotImplementedError
Z = likelihood_gaussian(x, X, Y)

Plots

With the use of the function we can now plot:

• The likelihood for the Gaussian for different $\mu$ and $\sigma$, given $\bf x$.
• The posterior probability of $\mu$ for different fixed values of $\sigma$ (the normalized likelihood, i.e. the posterior under a flat prior).
• The posterior probability of $\sigma$ for different fixed values of $\mu$ (again the normalized likelihood).

If your implementation of likelihood_gaussian is correct, the plots should look similar to these:

plt.figure()
cp = plt.contour(X, Y, Z)

plt.title('Likelihood')
plt.xlabel('mean')
plt.ylabel('sigma')
plt.show()

print("For comparison:")
print("calculated mean:\t", x.mean())
print("calculated sigma:\t", np.sqrt(x.var()))
n_values = 1000
mu = np.linspace(-5,5,n_values)
sigma_2 = likelihood_gaussian(x, mu, sigma=2)
sigma_2 = sigma_2 / sigma_2.sum()
sigma_2_5 = likelihood_gaussian(x, mu, sigma=2.5)
sigma_2_5 = sigma_2_5 / sigma_2_5.sum()
sigma_3 = likelihood_gaussian(x, mu, sigma=3.)
sigma_3 = sigma_3 / sigma_3.sum()
scaling_factor = n_values/10
# Raw strings avoid invalid escape sequences such as "\s" and "\m"
plt.plot(mu, sigma_2*scaling_factor, 'b-', label=r"$\sigma=2$")
plt.plot(mu, sigma_2_5*scaling_factor, 'g-', label=r"$\sigma=2.5$")
plt.plot(mu, sigma_3*scaling_factor, 'r-', label=r"$\sigma=3$")
plt.xlabel(r'$\mu$')
plt.ylabel(r'$p(\mu\mid x,\sigma)$')
plt.legend()
sigma = np.linspace(0.1,7,n_values)
# Variable names reflect the fixed values of mu used below
mu_m1 = likelihood_gaussian(x, mu=-1.0, sigma=sigma)
mu_m1 = mu_m1 / mu_m1.sum()
mu_0 = likelihood_gaussian(x, mu=0.0, sigma=sigma)
mu_0 = mu_0 / mu_0.sum()
mu_1 = likelihood_gaussian(x, mu=1.0, sigma=sigma)
mu_1 = mu_1 / mu_1.sum()
plt.plot(sigma, mu_m1*scaling_factor, 'b-', label=r"$\mu=-1.0$")
plt.plot(sigma, mu_0*scaling_factor, 'g-', label=r"$\mu=0.0$")
plt.plot(sigma, mu_1*scaling_factor, 'r-', label=r"$\mu=1.0$")
plt.xlabel(r'$\sigma$')
plt.ylabel(r'$p(\sigma\mid x,\mu)$')
plt.legend()
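
The curves above are divided by their sum and then multiplied by n_values/10. For the $\mu$ axis, which is 10 units wide, this roughly corresponds to dividing by the grid spacing, so that the curve integrates to about one. For the $\sigma$ axis, whose range is 6.9 units wide, the same factor is only an approximate scaling. A minimal sketch of this normalization, using a stand-in curve instead of the actual likelihood (the curve and the variable names are assumptions for illustration):

```python
import numpy as np

grid = np.linspace(-5.0, 5.0, 1000)
dx = grid[1] - grid[0]  # exact grid spacing

# Stand-in for an unnormalized likelihood curve (an assumption for this demo)
curve = np.exp(-0.5 * (grid + 1.0) ** 2)

# Dividing by (sum * dx) turns the curve into a density on this grid
density = curve / (curve.sum() * dx)
area = density.sum() * dx
print(np.isclose(area, 1.0))
```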

Literature

Notebook License (CC-BY-SA 4.0)

The following license applies to the complete notebook, including code cells. It does not, however, apply to any referenced external media (e.g., images).

Exercise - Multivariate Gaussian
by Christian Herta, Klaus Strohmenger