# Exercise - A Simple Example for the EM Algorithm

## Introduction

The EM algorihtm is used to find or adjust parameters in some model, which has some latent variables, by trying to maximze their likelihood given a certain data set and distribution function. In this notebook you will see a simple example and derive all the necessary steps, E-Step und M-Step to adjust the parameters of interest.

Remark: In order to detect errors in your own code, execute the notebook cells containing assert or assert_almost_equal. These statements raise exceptions, as long as the calculated result is not yet correct.

## Requirements

### Knowledge

To complete this exercise notebook, you should possess knowledge about the following topics:

• Maximum-Likelihood
• Notes by Christian Herta [HER18a]
• Expectation Maximization:
• Notes by Christian Herta [HER18b]
• Coursera Video [NOV18]

### Python Modules

import numpy as np

## Exercises

Given are two classes$z_i \in \{1,2\}$. When we observe a data point$x_i$, we do not know if it corresponds to class$1$ or$2$. But we know the value of$x_i$, which is either$a$,$b$, or$c$ and we know that$x_i$ must be class$2$ if its value is$c$ and it must be class$1$ if its value is$b$.

Formally, this can be described with:

• Observations$x_i \in \{a,b,c\}$
• Hidden states:$z_i \in \{1,2\}$
• Probability$p(z_i=1)=\alpha$
• Parameters:$\theta = \{\theta_0, \theta_1, \theta_3=\alpha$} -$i$: data index

and$P(x\mid z)$:

We now observe some data points$\bf x$:

• we observe$k$ times $a$"
• we observe$l$ times $b$"
• we observe$m$ times $c$"

Use the expectation maximization algorithm to find point estimates for the probability$P(z=1 \mid \bf x)$ and$\theta$.

1. Infere the equations for the E-Step and the M-Step (pen & paper)
2. Implement the corresponding functions
3. Choose concrete values for$k$,$l$,$m$ together with some starting values for$\theta$ and test your implementation

#### E-Step

The general E-Step is given by

$q(z_i) = p(z_i \mid x_i, \theta ).$

Compute the following quantities to get an expression for$q(z_i)$

\begin{align} q_a(z_i=1) = p(z_i=1 \mid x_i=a, \theta ) &\\ \end{align} $q_a(z_i=2) = p(z_i=2 \mid x_i=a, \theta ).$

## Solution

# Implement it in functions:

def e_step_q_a1(theta_0, theta_1, alpha):
raise NotImplementedError()
def e_step_q_a2(theta_0, theta_1, alpha):
raise NotImplementedError()

Also show that \begin{align} q_b(z_i=1) = p(z_i=1 \mid x_i=b, \theta ) &= 1 \end{align}

\begin{align} q_c(z_i=2) = p(z_i=2 \mid x_i=b, \theta ) &= 0 \end{align}

$q_c(z_i=1) = p(z_i=1 \mid x_i=c, \theta ) = 0$

\begin{align} q_c(z_i=2) = p(z_i=2 \mid x_i=c, \theta ) &= 1. \end{align}

#### M-Step

The general M-Step is given by

\begin{align} \theta^{(t+1)} &= \text{arg} \max_{\theta} L (q, \theta) \\ &=\text{arg} \max_{\theta} \sum_{i} \mathbb E_{q_i} \left[ \log {p({x}_i, {z_i} \mid \theta )} \right]\\ \end{align}.

Using this show that

$\theta_0 = \frac{1}{1+ \frac{l}{kq_a(z_i=1)}}$

$\theta_1 = \frac{1}{1+ \frac{m}{ k q_a(z_i=2) }} \\$

$\alpha= \frac{1}{1+\frac{kq_a(z_i=2)+m}{kq_a(z_i=1) + l}}.$

#### Hint

First find an expression for$p({x}_i=1, {z_i}=1 \mid \theta )$, which you will need in your calculation.

### Solution

def m_step_theta0(q_a1, k, l):
raise NotImplementedError()
def log_likelihood(theta_0, theta_1, alpha, k, l, m):
raise NotImplementedError()
# chose concrete values for your observation
k=4200
l=420
m=42

# chose some starting values
theta_0 = 0.424242
theta_1 = 0.4242
alpha = 0.42

# Run your implementation. In order to pass the tests below,
# update your theta and alpha values within the same variable.
# Or at least assign your final values to theta_0, theta_1 and alpha

# To test your implementation run this cell

n = k + l + m
n_1 = alpha*n
n_2 = n - n_1
k_ = n_1 * theta_0 + n_2 * theta_1
l_ = n_1 * (1-theta_0)
m_ = n_2 * (1-theta_1)

np.testing.assert_almost_equal(k_, k)
np.testing.assert_almost_equal(l_, l)
np.testing.assert_almost_equal(m_, m)

## Literature

The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).

Exercise - A Simple Example for the EM Algorithm
by Christian Herta, Klaus Strohmenger