Exercise - A Simple Example for the EM Algorithm

Introduction

[TODO]

Remark: In order to detect errors in your own code, execute the notebook cells containing assert or assert_almost_equal. These statements raise exceptions, as long as the calculated result is not yet correct.

Requirements

Knowledge

To complete this exercise notebook, you should possess knowledge about the following topics:

• Maximum-Likelihood
• Notes by Christian Herta [HER18a]
• Expectation Maximization:
• Notes by Christian Herta [HER18b]
• Coursera Video [NOV18]

Python Modules

import numpy as np

Exercises

Given are two classes $z_i \in \{1,2\}$. When we observe a data point $x_i$, we do not know if it corresponds to class $1$ or $2$. But we know the value of $x_i$, which is either $a$, $b$, or $c$ and we know that $x_i$ must be class $2$ if its value is $c$ and it must be class $1$ if its value is $b$.

Formally, this can be described with:

• Observations $x_i \in \{a,b,c\}$
• Hidden states: $z_i \in \{1,2\}$
• Probability $p(z_i=1)=\alpha$
• Parameters: $\theta = \{\theta_0, \theta_1, \theta_3=\alpha$}
• $i$: data index

and $P(z\mid x)$:

We now observe some data points $\bf x$:

• we observe $k$ times "$a$"
• we observe $l$ times "$b$"
• we observe $m$ times "$c$"

Use the expectation maximization algorithm to find point estimates for the probability $P(z=1 \mid \bf x)$ and $\theta$.

1. Infere the equations for the E-Step and the M-Step (pen & paper)
2. Implement the corresponding functions
3. Choose concrete values for $k$, $l$, $m$ together with some starting values for $\theta$ and test your implementation

E-Step

$q(z_i) = p(z_i \mid x_i, \theta )$

(...)

Compute the folowing quantities.

q_a(z_i=1) = p(z_i=1 \mid x_i=a, \theta ) &= ?\\ \end{align} q_a(z_i=2) = p(z_i=2 \mid x_i=a, \theta ) = ?  # Implement it in functions: def e_step_q_a1(theta_0, theta_1, alpha): raise NotImplementedError() def e_step_q_a2(theta_0, theta_1, alpha): raise NotImplementedError() Show that: q_b(z_i=1) = p(z_i=1 \mid x_i=b, \theta ) &= 1 \end{align}

q_c(z_i=2) = p(z_i=2 \mid x_i=b, \theta ) &= 0 \end{align}$q_c(z_i=1) = p(z_i=1 \mid x_i=c, \theta ) = 0$ q_c(z_i=2) = p(z_i=2 \mid x_i=c, \theta ) &= 1 \end{align}

M-Step

From

\begin{aligned} \theta^{(t+1)} &= \text{arg} \max_{\theta} L (q, \theta) \\ &=\text{arg} \max_{\theta} \sum_{i} \mathbb E_{q_i} \left[ \log {p({x}_i, {z_i} \mid \theta )} \right]\\ \end{aligned}

Show that:

$\theta_0 = \frac{1}{1+ \frac{l}{kq_a(z_i=1)}}$
\theta_1 = \frac{1}{1+ \frac{m}{ k q_a(z_i=2) }} \\
$\alpha= \frac{1}{1+\frac{kq_a(z_i=2)+m}{kq_a(z_i=1) + l}}$
def m_step_theta0(q_a1, k, l):
raise NotImplementedError()
def log_likelihood(theta_0, theta_1, alpha, k, l, m):
raise NotImplementedError()
# chose concrete values for your observation
k=4200
l=420
m=42

# chose some starting values
theta_0 = 0.424242
theta_1 = 0.4242
alpha = 0.42

# Run your implementation. In order to pass the tests below,
# update your theta and alpha values within the same variable.
# Or at least assign your final values to theta_0, theta_1 and alpha

# To test your implementation run this cell

n = k + l + m
n_1 = alpha*n
n_2 = n - n_1
k_ = n_1 * theta_0 + n_2 * theta_1
l_ = n_1 * (1-theta_0)
m_ = n_2 * (1-theta_1)

np.testing.assert_almost_equal(k_, k)
np.testing.assert_almost_equal(l_, l)
np.testing.assert_almost_equal(m_, m)

Literature

The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).

Exercise - A Simple Example for the EM Algorithm
by Christian Herta, Klaus Strohmenger