# Status: Draft

THIS COURSE IS STILL IN PREPARATION

# Introduction

Sequence learning spans several closely related tasks:

• Sequence prediction
• Sequence generation
• Sequence recognition
• Sequential decision making

This introductory course on sequence learning is divided into two parts:

• sequence learning based on probabilistic graphical models, and
• sequence learning with neural networks.

# The sequence modeling problems

Following [@murphy2013machine], the main inference tasks for sequence models are:

• Filtering
• Smoothing
• Prediction
• MAP estimation
• Posterior sampling
• Probability of the evidence

## Independent and identically distributed data

In many machine learning problems the underlying data are independent and identically distributed (i.i.d.). Independent means that the outcome of one observation $x_i$ does not affect the outcome of another observation $x_j$ for $i \neq j$. Identically distributed means that all $x_i$ are drawn from the same probability distribution.
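A small sketch of what the i.i.d. assumption buys us: the joint density of the data factorizes into a product of per-observation densities (equivalently, the log-likelihood becomes a sum). The standard-normal model and the sample values below are illustration-only assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a univariate normal distribution (our assumed model)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

samples = [0.3, -1.2, 0.7]  # made-up i.i.d. observations

# Joint density under independence: product of the individual densities.
joint = 1.0
for x in samples:
    joint *= normal_pdf(x)

# Equivalent: sum of log-densities, the form used in maximum likelihood.
log_joint = sum(math.log(normal_pdf(x)) for x in samples)

assert abs(math.log(joint) - log_joint) < 1e-9
```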

In sequence learning the data in a sequence are not independent: the data ${\bf x}_t$ at time $t$ depend on the data at all previous time steps.

For a sequence ${\bf x}_1, {\bf x}_2, \dots, {\bf x}_t$ the joint probability distribution can be factorized according to the chain rule:

$p({\bf x}_1, {\bf x}_2, \dots, {\bf x}_t) = p({\bf x}_1)\, p({\bf x}_2 \mid {\bf x}_1)\, p({\bf x}_3 \mid {\bf x}_1, {\bf x}_2) \cdots p({\bf x}_t \mid {\bf x}_1, {\bf x}_2, \dots, {\bf x}_{t-1})$

Here $t$ denotes the time index of the sequence.
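The chain-rule factorization can be checked numerically on a toy example. A minimal sketch: we define a made-up joint distribution over binary sequences of length 3 and verify that the product of conditionals recovers the joint probability.

```python
import itertools

# Toy joint distribution over binary sequences of length 3.
# The probabilities are illustration-only values that sum to 1.
vals = [0.10, 0.05, 0.15, 0.05, 0.20, 0.10, 0.25, 0.10]
joint = {seq: p for seq, p in zip(itertools.product([0, 1], repeat=3), vals)}

def marginal(prefix):
    """p(x_1, ..., x_k): sum the joint over the remaining positions."""
    return sum(p for seq, p in joint.items() if seq[:len(prefix)] == prefix)

# Chain rule: p(x1, x2, x3) = p(x1) * p(x2 | x1) * p(x3 | x1, x2)
seq = (1, 0, 1)
p_chain = (marginal(seq[:1])
           * marginal(seq[:2]) / marginal(seq[:1])
           * marginal(seq[:3]) / marginal(seq[:2]))

assert abs(p_chain - joint[seq]) < 1e-12
```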

# Probabilistic graphical models for sequence learning

## Markov Models: Example Bigram Language Models

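A bigram language model is a first-order Markov model over words: each word depends only on its predecessor. A minimal sketch with maximum-likelihood counts; the tiny corpus and the `<s>`/`</s>` boundary tokens are assumptions for illustration.

```python
from collections import Counter

# Made-up toy corpus, one tokenized sentence per list.
corpus = [
    ["<s>", "the", "cat", "sat", "</s>"],
    ["<s>", "the", "dog", "sat", "</s>"],
    ["<s>", "the", "cat", "ran", "</s>"],
]

bigram_counts = Counter()
unigram_counts = Counter()
for sentence in corpus:
    for w1, w2 in zip(sentence, sentence[1:]):
        bigram_counts[(w1, w2)] += 1
        unigram_counts[w1] += 1

def bigram_prob(w1, w2):
    """Maximum-likelihood estimate p(w2 | w1) = count(w1, w2) / count(w1)."""
    return bigram_counts[(w1, w2)] / unigram_counts[w1]

def sentence_prob(sentence):
    """Markov factorization: p(w_1..w_n) = prod_t p(w_t | w_{t-1})."""
    p = 1.0
    for w1, w2 in zip(sentence, sentence[1:]):
        p *= bigram_prob(w1, w2)
    return p

print(sentence_prob(["<s>", "the", "dog", "sat", "</s>"]))  # → 0.3333333333333333 (= 1/3)
```

In practice one works with log-probabilities and smoothed counts, since unseen bigrams would otherwise get probability zero.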

## Hidden Markov Models

A classic tutorial is [@Rabiner89atutorial].

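Filtering in a hidden Markov model is done with the forward algorithm. A minimal sketch for a 2-state HMM with discrete observations; all numbers are made-up illustration values, not taken from the tutorial.

```python
import numpy as np

pi = np.array([0.6, 0.4])            # initial state distribution p(z_1)
A = np.array([[0.7, 0.3],            # A[i, j] = p(z_{t+1} = j | z_t = i)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],            # B[i, k] = p(x_t = k | z_t = i)
              [0.2, 0.8]])

def forward(obs):
    """Scaled forward algorithm.

    Returns log p(x_1..x_T) (the evidence) and the filtered
    posteriors p(z_t | x_1..x_t) for each time step.
    """
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    alpha = alpha / c
    log_evidence = np.log(c)
    filtered = [alpha.copy()]
    for x in obs[1:]:
        alpha = (alpha @ A) * B[:, x]   # predict, then weight by likelihood
        c = alpha.sum()
        alpha = alpha / c               # normalize to avoid underflow
        log_evidence += np.log(c)
        filtered.append(alpha.copy())
    return log_evidence, filtered

log_ev, filt = forward([0, 1, 0])
```

The same recursion, run backwards as well, yields smoothing; replacing the sum over states by a max gives MAP estimation (the Viterbi algorithm).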

## Maximum Entropy Markov Models


## Linear Chain Conditional Random Fields


# Neural Sequence Models

## Recurrent Neural Networks

Coming Soon!
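Until the full material arrives, a minimal sketch of the forward pass of a vanilla (Elman) recurrent network, which carries the sequence history in a hidden state. The layer sizes and random weights are illustration-only assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4

# Randomly initialized weights (small scale for illustration).
W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1
b_h = np.zeros(hidden_size)

def rnn_forward(xs):
    """h_t = tanh(W_xh x_t + W_hh h_{t-1} + b): the state summarizes x_1..x_t."""
    h = np.zeros(hidden_size)
    states = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return states

xs = [rng.standard_normal(input_size) for _ in range(5)]
states = rnn_forward(xs)
```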

Most of the material was developed in the project Deep.Teaching, which is funded by the BMBF (project number 01IS17056).