Introduction

  • Sequence prediction
  • Sequence generation
  • Sequence recognition
  • Sequential decision making

This introductory course on sequence learning is divided into two parts:

  • Sequence learning based on probabilistic graphical models, and
  • sequence learning with neural networks.

The sequence modeling problems

 [@murphy2013machine]

  • Filtering
  • Smoothing
  • Prediction
  • MAP Estimation
  • Posterior samples
  • Probability of the evidence

Independent and identically distributed data

In many machine learning problems the underlying data are assumed to be independent and identically distributed (i.i.d.). Independent means that the outcome of one observation $x_i$ does not affect the outcome of another observation $x_j$ for $i\neq j$. Identically distributed means that all $x_i$ are drawn from the same probability distribution.
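
As a formula, the i.i.d. assumption lets the joint distribution of $n$ observations (the symbol $n$ is introduced here only for illustration) factorize into a product of identical marginals:

$p(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} p(x_i)$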

In sequence learning, the data in a sequence are not independent. In general, the data ${\bf x}_t$ at time $t$ depends on the data at all previous times.

For a sequence ${\bf x}_1, {\bf x}_2, \dots, {\bf x}_t$ the probability distribution can be factorized according to

$p({\bf x}_1, {\bf x}_2, \dots, {\bf x}_t) = p({\bf x}_1) \, p({\bf x}_2 \mid {\bf x}_1) \, p({\bf x}_3 \mid {\bf x}_1, {\bf x}_2) \cdots p({\bf x}_t \mid {\bf x}_1, {\bf x}_2, \dots, {\bf x}_{t-1})$
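
The factorization can be checked numerically. The following is a minimal sketch, assuming short binary sequences and an arbitrary random joint distribution; the helper names `random_joint`, `marginal`, and `chain_rule_product` are hypothetical and chosen only for this illustration.

```python
import itertools
import random


def random_joint(num_vars, seed=0):
    """Build an arbitrary joint distribution over binary sequences."""
    rng = random.Random(seed)
    outcomes = list(itertools.product([0, 1], repeat=num_vars))
    weights = [rng.random() + 1e-3 for _ in outcomes]  # strictly positive
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


def marginal(joint, prefix):
    """p(x_1, ..., x_k) for the given prefix, summing out later variables."""
    k = len(prefix)
    return sum(p for outcome, p in joint.items() if outcome[:k] == prefix)


def chain_rule_product(joint, sequence):
    """Multiply the conditionals p(x_k | x_1, ..., x_{k-1}) for k = 1..t."""
    prob = 1.0
    for k in range(1, len(sequence) + 1):
        # Conditional = marginal of the longer prefix / marginal of the shorter one.
        prob *= marginal(joint, sequence[:k]) / marginal(joint, sequence[:k - 1])
    return prob


joint = random_joint(num_vars=3)
for sequence, p in joint.items():
    # The product of conditionals recovers the joint probability.
    assert abs(chain_rule_product(joint, sequence) - p) < 1e-12
```

The check passes for any joint distribution with positive probabilities, because the factorization is an identity and makes no independence assumption.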

time index $t$

Here we

Probabilistic graphical models for sequence learning

Markov Models: Example Bigram Language Models
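
A bigram language model is a first-order Markov model over words: each word is conditioned only on its predecessor. The following is a minimal sketch, assuming whitespace tokenization, unsmoothed maximum-likelihood estimates, and a toy corpus; the boundary markers `<s>` and `</s>` and the function names are hypothetical choices for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Estimate p(w_t | w_{t-1}) = count(w_{t-1}, w_t) / count(w_{t-1})."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            pair_counts[prev][cur] += 1
    model = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        model[prev] = {word: c / total for word, c in counter.items()}
    return model


def sentence_probability(model, sentence):
    """Product of bigram probabilities; unseen bigrams get probability 0."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        prob *= model.get(prev, {}).get(cur, 0.0)
    return prob


corpus = ["the cat sat", "the dog sat", "the cat ran"]  # toy data
model = train_bigram(corpus)
p_seen = sentence_probability(model, "the cat sat")
p_unseen = sentence_probability(model, "the dog ran")
```

Because the estimates are unsmoothed, any sentence containing a bigram that never occurs in the corpus receives probability zero; practical models usually add some form of smoothing.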

Hidden Markov Models

Tutorial [@Rabiner89atutorial]

Maximum Entropy Markov Models

Linear Chain Conditional Random Fields

Neural Sequence Models

Coming Soon!

Most of the material was developed in the project Deep.Teaching. The project is funded by the BMBF, project number 01IS17056.