# Introduction

- Sequence prediction

- Sequence generation

- Sequence recognition

- Sequential decision making

This introductory course on sequence learning is divided into two parts:

- Sequence learning based on probabilistic graphical models, and

- sequence learning with neural networks.

# The sequence modeling problems

[@murphy2013machine]

- Filtering

- Smoothing

- Prediction

- MAP Estimation

- Posterior samples

- Probability of the evidence
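
For a model with hidden states ${\bf z}_t$ and observations ${\bf x}_t$ over $T$ time steps, the tasks listed above are commonly written as the following quantities. The symbols ${\bf z}_t$, $T$, and the prediction horizon $h > 0$ are notation assumed here for illustration, following the usual state-space convention; they are not fixed elsewhere in these notes.

$$
\begin{aligned}
\text{Filtering:} \quad & p({\bf z}_t \mid {\bf x}_1, \dots, {\bf x}_t) \\
\text{Smoothing:} \quad & p({\bf z}_t \mid {\bf x}_1, \dots, {\bf x}_T), \quad t < T \\
\text{Prediction:} \quad & p({\bf z}_{t+h} \mid {\bf x}_1, \dots, {\bf x}_t) \\
\text{MAP estimation:} \quad & \arg\max_{{\bf z}_1, \dots, {\bf z}_T} p({\bf z}_1, \dots, {\bf z}_T \mid {\bf x}_1, \dots, {\bf x}_T) \\
\text{Posterior samples:} \quad & {\bf z}_1, \dots, {\bf z}_T \sim p({\bf z}_1, \dots, {\bf z}_T \mid {\bf x}_1, \dots, {\bf x}_T) \\
\text{Probability of the evidence:} \quad & p({\bf x}_1, \dots, {\bf x}_T)
\end{aligned}
$$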

## Independent and identically distributed data

In many machine learning problems, the underlying data are assumed to be independent and identically distributed (i.i.d.). *Independent* means that the outcome of one observation $x_i$ does not affect the outcome of another observation $x_j$ for $i\neq j$. *Identically distributed* means that all $x_i$ are drawn from the same probability distribution.
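
Written as a formula, the i.i.d. assumption means that the joint distribution of $n$ observations factorizes into a product of identical marginals (the count $n$ is notation introduced here for illustration):

$$
p(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} p(x_i)
$$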

In sequence learning, the data within a sequence are not *independent*. In general, the data ${\bf x}_t$ at time index $t$ depend on the data at all previous time steps.

For a sequence ${\bf x}_1, {\bf x}_2, \dots, {\bf x}_t$, the joint probability distribution can be factorized according to the chain rule of probability:

$p({\bf x}_1, {\bf x}_2, \dots, {\bf x}_t) = p({\bf x}_1) \, p({\bf x}_2 \mid {\bf x}_1) \, p({\bf x}_3 \mid {\bf x}_1, {\bf x}_2) \cdots p({\bf x}_t \mid {\bf x}_1, {\bf x}_2, \dots, {\bf x}_{t-1})$
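
The chain-rule factorization can be checked numerically on a toy example. The following is a minimal sketch, assuming a randomly generated joint distribution over binary sequences of length 3; the helper names (`make_joint`, `marginal`, `conditional`, `chain_rule_probability`) are hypothetical and exist only for this illustration.

```python
import itertools
import random


def make_joint(length, seed=0):
    """Build an arbitrary joint distribution over binary sequences."""
    rng = random.Random(seed)
    outcomes = list(itertools.product([0, 1], repeat=length))
    weights = [rng.random() + 0.1 for _ in outcomes]  # strictly positive
    total = sum(weights)
    return {seq: w / total for seq, w in zip(outcomes, weights)}


def marginal(joint, prefix):
    """p(x_1..x_k = prefix), obtained by summing out all later positions."""
    k = len(prefix)
    return sum(p for seq, p in joint.items() if seq[:k] == prefix)


def conditional(joint, prefix, value):
    """p(x_{k+1} = value | x_1..x_k = prefix)."""
    return marginal(joint, prefix + (value,)) / marginal(joint, prefix)


def chain_rule_probability(joint, seq):
    """Multiply the conditionals p(x_k | x_1..x_{k-1}) along the sequence."""
    prob = 1.0
    for k, value in enumerate(seq):
        prob *= conditional(joint, seq[:k], value)
    return prob


joint = make_joint(3)
seq = (1, 0, 1)
direct = joint[seq]                            # looked up in the joint table
factored = chain_rule_probability(joint, seq)  # product of conditionals
print(abs(direct - factored) < 1e-12)
```

The two values agree up to floating-point error for any sequence, because the chain rule is an identity that makes no independence assumptions about the data.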

Here we

# Probabilistic graphical models for sequence learning

## Markov Models: Example Bigram Language Models
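
A bigram language model is a first-order Markov model over words: each word is assumed to depend only on its immediate predecessor, so $p(w_t \mid w_1, \dots, w_{t-1}) = p(w_t \mid w_{t-1})$. The sketch below estimates these conditionals by relative bigram counts on a tiny made-up corpus. The corpus, the boundary tokens `<s>` and `</s>`, and the function names are assumptions chosen for illustration, and no smoothing is applied, so unseen bigrams get probability zero.

```python
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # assumed sentence boundary markers


def train_bigram(sentences):
    """Estimate p(cur | prev) by maximum likelihood from bigram counts."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = [BOS] + sentence.split() + [EOS]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    model = {}
    for prev, counter in counts.items():
        total = sum(counter.values())
        model[prev] = {word: c / total for word, c in counter.items()}
    return model


def sentence_probability(model, sentence):
    """Probability of a sentence under the first-order Markov assumption."""
    tokens = [BOS] + sentence.split() + [EOS]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        prob *= model.get(prev, {}).get(cur, 0.0)  # unseen bigram -> 0
    return prob


corpus = ["the cat sat", "the dog sat", "the cat ran"]  # toy data
model = train_bigram(corpus)
p_seen = sentence_probability(model, "the cat sat")
p_unseen = sentence_probability(model, "the dog ran")
print(p_seen, p_unseen)
```

In this toy corpus "the cat sat" gets probability $1 \cdot \tfrac{2}{3} \cdot \tfrac{1}{2} \cdot 1 = \tfrac{1}{3}$, while "the dog ran" gets zero because the bigram "dog ran" never occurs. Practical language models add smoothing to avoid such zeros.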


## Hidden Markov Models

Tutorial [@Rabiner89atutorial]

## Maximum Entropy Markov Models

## Linear Chain Conditional Random Fields


# Neural Sequence Models

Coming Soon!

Most of the material was developed in the Deep.Teaching project, which is funded by the BMBF (project number 01IS17056).