# Exercise - Logistic Regression with PyTorch

## Introduction

The teaching objectives of this notebook are:

• Implementing a logistic regression model using PyTorch

In order to detect errors in your own code, execute the notebook cells containing assert or assert_almost_equal.

## Requirements

### Knowledge

• Logistic regression
• Chapters 5 and 6 of the Deep Learning Book
• Chapter 5 of the book Pattern Recognition and Machine Learning by Christopher M. Bishop [BIS07]
• Video 15.3 and following in the playlist Machine Learning

### Python Modules

import numpy as np

import scipy.stats
from scipy.stats import norm

from matplotlib import pyplot as plt
from IPython.core.pylabtools import figsize

%matplotlib inline
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(1)

## Training Data

• $m$ training examples $\mathcal D = \{(\vec x^{(1)}, y^{(1)}), (\vec x^{(2)}, y^{(2)}), \dots, (\vec x^{(m)}, y^{(m)})\}$

here with

• two features $\vec x = (x_1, x_2)^T$
• two classes: $y \in \{0, 1\}$
# class 0:
# covariance matrix and mean
cov0 = np.array([[5,-4],[-4,4]])
mean0 = np.array([2.,3])
# number of data points
m0 = 100

# class 1:
# covariance matrix and mean
cov1 = np.array([[5,-3],[-3,3]])
mean1 = np.array([0.5,0.5])
# number of data points
m1 = 100

# generate m0 gaussian distributed data points with
# mean0 and cov0.
r0 = np.random.multivariate_normal(mean0, cov0, m0)
r1 = np.random.multivariate_normal(mean1, cov1, m1)
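
Note: torch.manual_seed(1) above only seeds PyTorch's random number generator; the NumPy sampling here is unseeded, so the data set differs between runs. If you want reproducible data, you could additionally seed NumPy before sampling (the value 0 is an arbitrary choice):

    np.random.seed(0)  # arbitrary seed, only for reproducibility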

def plot_data(r0, r1):
    plt.figure(figsize=(7., 7.))
    plt.scatter(r0[..., 0], r0[..., 1], c='r', marker='o', label="class 0")
    plt.scatter(r1[..., 0], r1[..., 1], c='b', marker='o', label="class 1")
    plt.xlabel("$x_1$")
    plt.ylabel("$x_2$")

plot_data(r0, r1)
plt.legend()
X = np.concatenate((r0, r1), axis=0)
X.shape
y = np.concatenate((np.zeros(m0), np.ones(m1)))
y.shape
# shuffle the data
assert X.shape[0] == y.shape[0]
perm = np.random.permutation(X.shape[0])
X = X[perm]
y = y[perm]

## Exercise

Since we have discrete classes and not continuous values, we have to implement logistic regression (as opposed to linear regression). Logistic regression relies on the logistic function:

$\sigma(z) = \frac{1}{1+ \exp(-z)}$

with

$z = \vec \theta^T \vec x = \theta_0 \cdot x_0 + \theta_1 \cdot x_1 + \ldots + \theta_n \cdot x_n$

and feature $x_0 = 1$ for every training example.

Or, visually, as a graph:

(figure: computation graph of the logistic regression model)

### Implement the Model

Implement logistic regression. This can be split into three subtasks:

1. Implement the LogisticRegression class.
2. Implement the computation of the cross-entropy loss.
3. Implement vanilla gradient descent.

Sidenote:

The logistic function, as well as the tanh, is a so-called sigmoid function because of its "S" shape. In the domain of machine learning, however, the term "sigmoid function" usually refers to the logistic function.
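
As a quick illustration (not part of the exercise tasks), the logistic function can be written with elementary tensor operations and checked against PyTorch's built-in torch.sigmoid:

    # a minimal sketch: manual logistic function vs. torch.sigmoid
    z = torch.linspace(-6., 6., 5)
    manual = 1. / (1. + torch.exp(-z))   # sigma(z) = 1 / (1 + exp(-z))
    assert torch.allclose(manual, torch.sigmoid(z))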

### Logistic Model

Implement logistic regression as an nn.Module. If you have done the notebook about linear regression before, you should already be familiar with torch.nn.Linear. Just pipe its output through torch.nn.Sigmoid.

In other words: add torch.nn.Linear and torch.nn.Sigmoid as class members and use them in the forward method.

If you do not want to use PyTorch's built-in functions, you can of course implement the sigmoid function yourself ;-)

Hint:

In our case, with two features, the input data has the shape (m_examples, n_features):

    tensor([[-0.6617, -0.0426],
            [-1.3328,  0.5161],
            ....

The forward method should return the probabilities for the positive class $p(y=1 \mid x, \theta)$:

    tensor([[ 0.7577],
            [ 0.0777],
            ....

class LogisticRegression(nn.Module):  # inheriting from nn.Module!

    def __init__(self, num_features):

        super(LogisticRegression, self).__init__()

        ###############################
        ###############################

        raise NotImplementedError()

        ###############################
        ###############################

    def forward(self, x):
        ###############################
        ###############################
        # should return the probabilities for the classes, e.g.
        # tensor([[ 0.7577],
        #         [ 0.0777],
        #         ....

        raise NotImplementedError()

        ###############################
        ###############################
NUM_FEATURES = 2
model = LogisticRegression(NUM_FEATURES)

### Should output something like:
###
### LogisticRegression(
###   (linear): Linear(in_features=2, out_features=1, bias=True)
###   (sigmoid): Sigmoid()
### )
print(model)

### Iterate through our trainable parameters
for param in model.parameters():
    print(param)
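
If you are stuck: below is a minimal sketch of one possible implementation. The member names linear and sigmoid are just one choice (they only determine how the model prints above), and the class name LogisticRegressionSketch is hypothetical, chosen so it does not clash with your own class.

    # a minimal sketch of one possible implementation (class name is hypothetical)
    class LogisticRegressionSketch(nn.Module):

        def __init__(self, num_features):
            super(LogisticRegressionSketch, self).__init__()
            self.linear = nn.Linear(num_features, 1)  # z = theta^T x (bias included)
            self.sigmoid = nn.Sigmoid()               # sigma(z) = 1 / (1 + exp(-z))

        def forward(self, x):
            # probabilities p(y=1 | x, theta), shape (m_examples, 1)
            return self.sigmoid(self.linear(x))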

### Binary Cross-Entropy

Implement the computation of the binary cross-entropy loss. Don't use any built-in function of PyTorch for the cross-entropy.

Reminder:

\begin{equation} J_D(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(-y^{(i)} \cdot \log(h_\theta(x^{(i)})) - (1-y^{(i)}) \cdot \log(1-h_\theta(x^{(i)}))\right) \end{equation}
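
A minimal sketch of how this formula translates into elementary tensor operations; the function name and the small constant eps (added to avoid log(0)) are our assumptions, not part of the exercise interface:

    # a minimal sketch using only elementary torch operations
    def binary_cross_entropy_sketch(predictions, targets, eps=1e-12):
        p = predictions.clamp(eps, 1. - eps)  # guard against log(0)
        losses = -targets * torch.log(p) - (1. - targets) * torch.log(1. - p)
        return losses.mean()                  # average over the m examples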

# method that returns the cross-entropy computed with pytorch
def binary_cross_entropy(predictions, targets):

    ###############################
    ###############################
    #
    # Task: cross-entropy average as pytorch tensor (scalar)

    raise NotImplementedError()

    ###############################
    ###############################

model = LogisticRegression(NUM_FEATURES)

### predict some dummy data
preds = model(torch.randn(4,2))

### dummy y
targets = torch.tensor([[0.],[0.],[0.],[1.]])

### calculate costs
costs = binary_cross_entropy(preds, targets)
print(costs)

# costs should be a float >= 0.0
assert costs >= 0.0

Train the model with gradient descent.

• Convert the data to torch tensors.
• Implement the gradient descent update rule.
• Apply the update rule iteratively to minimize the loss.

• Hint: Print the costs every ~100 epochs to get instant feedback on the training progress.

Reminder:

Equation for the update rule:

\begin{align} \theta_j' = \theta_j - \alpha \cdot \frac{\partial}{\partial \theta_j} J(\theta) \end{align}
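
A sketch of how this rule can be realized with autograd, matching the update_step signature used further below (the function name is ours): autograd computes the partial derivatives, and the parameters are changed in-place while gradient tracking is switched off.

    # a minimal sketch of one vanilla gradient descent step
    def update_step_sketch(model, loss_function, x_, y_, lr):
        loss = loss_function(model(x_), y_)  # forward pass: J(theta)
        model.zero_grad()                    # reset gradients from the previous step
        loss.backward()                      # autograd computes dJ/dtheta_j
        with torch.no_grad():                # the update itself must not be tracked
            for param in model.parameters():
                param -= lr * param.grad     # theta_j' = theta_j - alpha * grad
        return loss.item()                   # scalar cost for logging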

###############################
###############################
#
# Task: Convert numpy arrays to tensors
#

###############################
###############################
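
A sketch of one way to do the conversion: torch.from_numpy shares memory with the numpy array, .float() casts from float64 to float32, and reshape(-1, 1) gives y the expected column shape.

    # a minimal sketch: numpy arrays -> float32 tensors
    X_tensor = torch.from_numpy(X).float()                 # shape (200, 2)
    y_tensor = torch.from_numpy(y).float().reshape(-1, 1)  # shape (200, 1)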
### If your implementation is correct, these tests should not throw an exception

print(X_tensor.shape) ### should be [200,2]
print(y_tensor.shape) ### should be [200,1]

assert X_tensor.shape[0] == 200
assert X_tensor.shape[1] == 2
assert y_tensor.shape[0] == 200
assert y_tensor.shape[1] == 1
assert X_tensor.dtype == torch.float32
assert y_tensor.dtype == torch.float32
def update_step(model, loss_function, x_, y_, lr):

    ###############################
    ###############################

    raise NotImplementedError()

    ###############################
    ###############################
def gradient_descent(data, targets, loss_function, model, lr = 0.5, nb_epochs = 1000):

    ###############################
    ###############################

    raise NotImplementedError()

    ###############################
    ###############################
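
The outer loop could look like the sketch below; it assumes your update_step returns the scalar loss of the step, and it matches the cell below, which expects a numpy array with the cost of every epoch:

    # a minimal sketch of the training loop
    def gradient_descent_sketch(data, targets, loss_function, model, lr = 0.5, nb_epochs = 1000):
        cost = np.zeros(nb_epochs)
        for epoch in range(nb_epochs):
            cost[epoch] = update_step(model, loss_function, data, targets, lr)
            if epoch % 100 == 0:
                print("epoch %4d, cost %.6f" % (epoch, cost[epoch]))
        return cost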
nb_epochs = 1000

# new model with untouched parameters
model = LogisticRegression(NUM_FEATURES)

# cost is a numpy array with the cost function value at each iteration.
# will be used below to print the progress during learning
cost = gradient_descent(X_tensor, y_tensor, loss_function=binary_cross_entropy, model=model, lr = 0.1, nb_epochs = nb_epochs)

### Cost-(Loss)-over-Iterations

Plot the costs per epoch. Just execute the cells. The output should look similar to the following:

(figure: cost curve decreasing over the iterations)

plt.plot(range(nb_epochs), cost)
plt.xlabel('# of iterations')
plt.ylabel('cost')
plt.title('Learning Progress')

### Decision-Boundary-After-Training

Plot the data with the decision boundary. Just execute the cells. The output should look similar to the following:

(figure: scatter plot of both classes with the decision boundary as a green line)

The decision boundary is where the model is undecided, i.e. $\sigma(z) = 0.5$, which is equivalent to $z = \theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0$, i.e. $x_2 = (-\theta_0 - \theta_1 x_1) / \theta_2$.

def plot_decision_boundary(model):
    plot_data(r0, r1)
    keys = list(model.state_dict().keys())
    # first entry: weight matrix (shape (1, 2)), second entry: bias (shape (1,))
    param1 = model.state_dict()[keys[0]].detach().numpy()
    param2 = model.state_dict()[keys[1]].detach().numpy()
    x1 = np.linspace(X.min()-1, X.max()+1, 10)
    # decision boundary: sigma(z) = 0.5  <=>  theta^T x = 0
    x2 = (- param2 - param1[0,0] * x1) / param1[0,1]
    plt.plot(x1, x2, 'g', label="decision boundary")
    plt.legend()

plot_decision_boundary(model)
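
As an additional sanity check (not part of the original exercise), you could compute the training accuracy by thresholding the predicted probabilities at 0.5:

    # a minimal sketch: training accuracy of the fitted model
    with torch.no_grad():
        predicted_class = (model(X_tensor) > 0.5).float()  # 1.0 if p(y=1|x) > 0.5 else 0.0
        accuracy = (predicted_class == y_tensor).float().mean().item()
    print("training accuracy: %.3f" % accuracy)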

### Using PyTorch Built-Ins

Now create a new model with untrained parameters, and this time use PyTorch's built-ins (see the sketch after this list):

• torch.nn.BCELoss for the cost function.
• torch.optim.SGD, torch.optim.Adam or any other optimizer to update your model.
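
A minimal sketch of such a training loop, assuming the tensors and nb_epochs from above (lr=0.1 is just an example value):

    # a minimal sketch using PyTorch built-ins
    model = LogisticRegression(NUM_FEATURES)
    loss_function = nn.BCELoss()                       # built-in binary cross-entropy
    optimizer = optim.SGD(model.parameters(), lr=0.1)  # or optim.Adam(model.parameters())

    for epoch in range(nb_epochs):
        optimizer.zero_grad()                            # clear old gradients
        loss = loss_function(model(X_tensor), y_tensor)  # forward pass + costs
        loss.backward()                                  # compute gradients
        optimizer.step()                                 # update the parameters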
###############################
###############################
#
# Task: Create a new model and train with built-in cost and optimizer

###############################
###############################
### the model you just trained should be named "model"
plot_decision_boundary(model)

## Literature

• [BIS07] Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

## Licenses

The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).

Exercise - Logistic Regression with PyTorch
by Christian Herta, Klaus Strohmenger