# HTW Berlin - Angewandte Informatik - Advanced Topics - Exercise - Kullback–Leibler Divergence

## Introduction

In this notebook you will use the KL Divergence to measure the quality of an approximation of a probability density function.

In order to detect errors in your own code, execute the notebook cells containing `assert` or `assert_almost_equal`. These statements raise an exception if the computed result is not yet correct.

## Requirements

### Knowledge

To complete this exercise notebook you should possess knowledge about the following topics.

• Probability mass function (pmf)
• Probability density function (pdf)
• Entropy
• KL Divergence

### Python Modules

# External Modules
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as ss
from numpy.testing import assert_almost_equal

%matplotlib inline

## Exercise

Given are three probability density functions $p(x), q(x), pp(x)$ with support $[0,1[$.

np.random.seed(42)

a = np.random.rand(10)
b = np.random.rand(15)
ag = ss.gaussian_kde(a)
bg = ss.gaussian_kde(b)
k = np.linspace(0, 1, 10)
p = ag(k) # pdf
q = bg(k) # pdf
plt.plot(k, p)
plt.title("Probability Density Function p")
plt.xlabel("x")
plt.ylabel("p(x)")
plt.show()
plt.plot(k, q)
plt.title("Probability Density Function q")
plt.xlabel("x")
plt.ylabel("q(x)")
plt.show()
pp = np.ones_like(p) * 2.
pp[k>0.5] = 0. # pdf
plt.plot(k, pp)
plt.title("Probability Density Function pp")
plt.xlabel("x")
plt.ylabel("pp(x)")
plt.show()
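As a quick sanity check (not part of the required exercise code), you can verify that `pp` is a plausible pdf on $[0,1[$: it should integrate to approximately 1 over its support. A minimal sketch using the trapezoidal rule on the same grid as above:

```python
import numpy as np

# Reproduce pp on the same grid as in the exercise
k = np.linspace(0, 1, 10)
pp = np.ones_like(k) * 2.
pp[k > 0.5] = 0.

# Trapezoidal rule: a valid pdf should integrate to (approximately) 1
area = np.sum(0.5 * (pp[1:] + pp[:-1]) * np.diff(k))
print(area)  # close to 1.0
```

The same check applied to `p` and `q` will not give exactly 1, since the kernel density estimates place some probability mass outside $[0,1[$.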

Write a function for computing the KL Divergence for such a pdf (probability density function) in pure numpy. In order to successfully pass the tests, use the natural logarithm.

Reminder:

For probability mass functions / for discrete values:

$D_{KL}(Q \mid \mid P) = \sum_{x \in \mathcal{X}} Q(x) \log \frac{Q(x)}{P(x)}$

For probability density functions / for continuous values:

$D_{KL}(Q \mid \mid P) = \int_{-\infty}^{+\infty} q(x) \log \frac{q(x)}{p(x)} dx$
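To make the discrete formula concrete, here is a minimal sketch (with hypothetical, strictly positive toy pmfs `Q` and `P`) that evaluates the sum directly in numpy and compares it to `scipy.stats.entropy`, which computes the same quantity when given two arguments:

```python
import numpy as np
import scipy.stats as ss

# Hypothetical toy pmfs (strictly positive, each summing to 1)
Q = np.array([0.1, 0.4, 0.5])
P = np.array([0.25, 0.25, 0.5])

# Discrete KL divergence D(Q||P) with the natural logarithm
d_kl = np.sum(Q * np.log(Q / P))
print(d_kl)  # ≈ 0.096

# scipy's entropy with two arguments computes the same quantity
print(np.isclose(d_kl, ss.entropy(Q, P)))
```

Note that this sketch assumes all entries are positive; the exercise below asks you to handle the cases where this does not hold.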

Hint:

KL Divergence might not be defined for all elements in $p(x), q(x), pp(x)$. Find a workaround so your function still returns a useful value.

# Implement this function

def kl_div(q, p):
    """Calculates the KL Divergence D(q||p).

    :param q: values of the function q (true function)
    :type q: ndarray containing n values of type float
    :param p: values of the function p (approximation of q)
    :type p: ndarray containing n values of type float

    :returns: KL Divergence D(q||p) or np.inf if not defined
    :rtype: float
    """
    raise NotImplementedError()

# Executing this cell must not throw an AssertionError

assert_almost_equal(kl_div(q, p), ss.entropy(q, p))
assert_almost_equal(kl_div(pp, p), ss.entropy(pp, p))
assert_almost_equal(kl_div(pp, q), ss.entropy(pp, q))
assert_almost_equal(kl_div(p, pp), np.inf)

1. Explain why it's possible to calculate `kl_div(pp, p)`, but not `kl_div(p, pp)`.

2. Which value is larger: `kl_div(pp, p)` or `kl_div(pp, q)`? Which do you expect, and why?

## License

The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).

HTW Berlin - Angewandte Informatik - Advanced Topics - Exercise - Kullback–Leibler Divergence
by Christian Herta, Klaus Strohmenger