# Neural Networks - Exercise: Simple MNIST Network

TODO

## Requirements

TODO

### Python-Modules

    # third party
    import numpy as np
    import matplotlib.pyplot as plt

    # internal
    from deep_teaching_commons.data.fundamentals.mnist import Mnist

## Data

    # create the MNIST loader from deep_teaching_commons
    # (data_dir='data' is an assumed download directory)
    mnist_loader = Mnist(data_dir='data')

    # load all data; labels are one-hot encoded, images are flattened and pixel values scaled to [0, 1]
    train_images, train_labels, test_images, test_labels = mnist_loader.get_all_data(one_hot_enc=True, normalized=True)

    # shuffle the training data, using the same permutation for images and labels
    shuffle_index = np.random.permutation(len(train_images))
    train_images, train_labels = train_images[shuffle_index], train_labels[shuffle_index]
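The loader's internals are not shown in this notebook. As a rough illustration of what one-hot encoding, flattening, and normalization mean for the array shapes, here is a minimal NumPy sketch on random stand-in data. The helper names `one_hot` and `flatten_and_normalize` are hypothetical, and dividing by 255 is an assumption about how the pixel values are scaled.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Turn integer labels of shape (n,) into a one-hot matrix of shape (n, num_classes)."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def flatten_and_normalize(images):
    """Flatten (n, 28, 28) uint8 images to (n, 784) floats in [0, 1]."""
    return images.reshape(len(images), -1).astype(np.float64) / 255.0

# Random stand-in for MNIST: 5 "images" with integer class labels.
rng = np.random.default_rng(0)
raw_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
raw_labels = np.array([3, 0, 9, 3, 7])

images = flatten_and_normalize(raw_images)
labels = one_hot(raw_labels)

# Shuffle images and labels with the same permutation so the pairs stay aligned.
shuffle_index = rng.permutation(len(images))
images, labels = images[shuffle_index], labels[shuffle_index]
```

After these steps every image is a row of 784 values and every label is a row of 10 values with a single 1, which is the layout the network code expects for `X` and `Y`.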

## Simple MNIST Network

The presented network is an adaptation of Michael Nielsen's introductory example to neural networks. Reading the first two chapters of his online book 'Neural Networks and Deep Learning' is recommended, though not required, for a better understanding of the example. Compared to Nielsen's original, this variant is vectorized and replaces the sigmoid activation function with a rectified linear unit (ReLU). As a result, the code is much more compact and training is much more efficient.
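To make the two changes concrete, the following sketch compares a per-sample loop with a vectorized batch computation for a single layer, and contrasts the ReLU derivative with the sigmoid derivative. The array sizes and variable names are chosen for illustration only and are not taken from the notebook.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))        # batch of 5 samples, 4 features each
W = rng.standard_normal((4, 3)) * 0.1  # one weight matrix, no bias term

# Per-sample version: one matrix-vector product per input.
looped = np.array([relu(W.T.dot(x)) for x in X])

# Vectorized version: a single matrix-matrix product for the whole batch.
batched = relu(X.dot(W))

# ReLU derivative as a boolean mask: 1 where the unit is active, 0 elsewhere.
relu_mask = batched > 0

# Sigmoid derivative for comparison; it never exceeds 0.25.
s = sigmoid(X.dot(W))
sigmoid_grad = s * (1.0 - s)
```

Both versions produce the same activations, but the batched form needs no Python loop. Because a ReLU output is positive exactly when its input is positive, the mask can be computed from the stored activations alone.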

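    delta_hist = []

    def feed_forward(X, weights):
        # a[0] is the input batch, a[i] the ReLU activation of layer i
        a = [X]
        for w in weights:
            a.append(np.maximum(a[-1].dot(w), 0))
        return a

    def grads(X, Y, weights):
        # object array so that differently shaped gradient matrices fit in one container
        gradients = np.empty(len(weights), dtype=object)
        a = feed_forward(X, weights)
        # https://brilliant.org/wiki/backpropagation/ or https://stats.stackexchange.com/questions/154879/a-list-of-cost-functions-used-in-neural-networks-alongside-applications
        delta = a[-1] - Y
        delta_hist.append(np.sum(delta*Y)/len(X))
        gradients[-1] = a[-2].T.dot(delta)
        # propagate delta backwards; (a[i] > 0) is the ReLU derivative
        for i in range(len(a)-2, 0, -1):
            delta = (a[i] > 0) * delta.dot(weights[i].T)
            gradients[i-1] = a[i-1].T.dot(delta)
        # average over the mini-batch
        return gradients / len(X)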

    trX, trY, teX, teY = train_images, train_labels, test_images, test_labels
    weights = [np.random.randn(*w) * 0.1 for w in [(784, 200), (200, 100), (100, 10)]]
    num_epochs, batch_size, learn_rate = 20, 50, 0.1
    for i in range(num_epochs):
        for j in range(0, len(trX), batch_size):
            X, Y = trX[j:j+batch_size], trY[j:j+batch_size]
            # update each weight matrix separately, since the matrices have different shapes
            weights = [w - learn_rate * g for w, g in zip(weights, grads(X, Y, weights))]
        # evaluate the accuracy on the test set after every epoch
        prediction_test = np.argmax(feed_forward(teX, weights)[-1], axis=1)
        print(i, np.mean(prediction_test == np.argmax(teY, axis=1)))

## Exercise - Understanding an Implementation

Your goal is to understand how the implementation works. To get there, you can do the following:

• Plot delta_hist, which stores the delta value calculated on the output layer during each iteration.
• Add a verbose argument (boolean) to the functions that adds meaningful print lines to the network.

Hopefully, the implementation is clear after your inspection. Check your understanding with the following questions:

1. Which cost function is used, what is its derivative, and how is it implemented?
2. Why are the values in your plot bounded by [-1, 0], why is the plot so noisy, how can you reduce the noise, and how does it differ from a typical plot of a loss function?
3. How does the network implement the backpropagation algorithm?
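One possible way to look at a noisy per-batch history is to overlay a moving average on the raw values. The sketch below uses a synthetic curve as a stand-in for `delta_hist`. The helper names `moving_average` and `plot_history` and the window size are illustrative assumptions, and a moving average is only one of several smoothing options.

```python
import numpy as np

def moving_average(values, window=100):
    """Smooth a 1-D sequence with a sliding mean; returns len(values) - window + 1 points."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

def plot_history(history, window=100):
    """Plot the raw per-batch values together with their moving average."""
    import matplotlib.pyplot as plt  # imported here so the smoothing works without matplotlib
    smoothed = moving_average(history, window)
    plt.plot(history, alpha=0.3, label="per mini-batch")
    plt.plot(np.arange(window - 1, len(history)), smoothed, label=f"moving average ({window})")
    plt.xlabel("iteration")
    plt.legend()
    plt.show()

# Synthetic stand-in for delta_hist: a noisy curve rising from -1 towards 0.
rng = np.random.default_rng(0)
steps = np.arange(2000)
fake_hist = -np.exp(-steps / 400) + rng.normal(0, 0.05, size=steps.size)
smoothed = moving_average(fake_hist, window=100)
```

Calling `plot_history(delta_hist)` after training would draw both curves for the real history. A larger window gives a smoother curve but reacts more slowly to changes.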

## Exercise - Step towards a NN-Framework

The presented implementation is compact and efficient, but hard to modify or extend. However, a modular design is crucial if you want to experiment with a neural network to understand the influence of its components. You will now make the first changes towards your own 'toy neural network framework', which you should expand over the course.
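As a small illustration of why modularity helps with experiments, the sketch below makes the activation function a swappable component of a single layer. The names `ACTIVATIONS` and `DenseLayer` are hypothetical and not part of the exercise specification; this is one possible design under those assumptions, not the intended solution.

```python
import numpy as np

# Hypothetical registry: each entry pairs an activation with its derivative.
ACTIVATIONS = {
    "relu": (lambda z: np.maximum(z, 0), lambda z: (z > 0).astype(float)),
    "tanh": (np.tanh, lambda z: 1.0 - np.tanh(z) ** 2),
}

class DenseLayer:
    """Illustrative fully connected layer without bias, with a configurable activation."""

    def __init__(self, n_in, n_out, activation="relu", seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.standard_normal((n_in, n_out)) * 0.1
        self.activate, self.activate_prime = ACTIVATIONS[activation]

    def forward(self, X):
        # Keep the pre-activation so a backward pass could evaluate activate_prime on it.
        self.z = X.dot(self.weights)
        return self.activate(self.z)

# The same input passes through two layers that differ only in their activation.
X = np.ones((4, 784))
relu_out = DenseLayer(784, 200, "relu").forward(X)
tanh_out = DenseLayer(784, 200, "tanh").forward(X)
```

Changing the activation is then a one-word change at construction time instead of an edit inside the forward and backward code.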

Rework the implementation above to fit the classes and methods given below. Again, you do not have to re-engineer the whole neural network at this step: rework the code to match the given specification and make only the necessary modifications. To aid your understanding, you may rename variables to something more fitting.

    class FullyConnectedNetwork:
        def __init__(self, layers):
            pass

        def forward(self, data):
            pass

        def backward(self, X, Y):
            pass

        def predict(self, data):
            pass

    class Optimizer:
        def __init__(self, network, train_data, train_labels, test_data=None, test_labels=None, epochs=100, batch_size=20, learning_rate=0.01):
            pass

        def sgd(self):
            pass

    # Following code should run:
    mnist_NN = FullyConnectedNetwork([(784, 200), (200, 100), (100, 10)])
    epochs, batch_size, learning_rate = 20, 500, 0.1
    Optimizer(mnist_NN, train_images, train_labels, test_images, test_labels, epochs, batch_size, learning_rate)
    plt.plot(mnist_NN.delta_hist)

The following license applies to the complete notebook, including code cells. It does not, however, apply to any referenced external media (e.g., images).

Neural Networks - Exercise: Simple MNIST Network
by Benjamin Voigt