---
title: Convolutional Neural Networks Lecture
categories: lecture
---

# Briefing

## What is a neural network

+ The single Neuron (see the code sketch after this list)
    + Weighted Input
    + Activation
+ The network model
    + Input/Output
    + Weights
    + Activation Function
+ The Tensor Model
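
For illustration, a single neuron as a minimal PyTorch sketch (the numbers are arbitrary):

```python
import torch

# A single neuron: weighted sum of the inputs, then an activation.
x = torch.tensor([0.5, -1.0, 2.0])   # inputs
w = torch.tensor([0.1, 0.4, -0.2])   # weights
b = torch.tensor(0.3)                # bias

z = torch.dot(w, x) + b              # weighted input
a = torch.sigmoid(z)                 # activation
print(a)
```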

## Output and Loss Function

+ Classification versus Regression

**MSE**

$$L = (x-y)^2$$

**CrossEntropy**

$$L = -\log \frac{ \exp x_{y} } { \sum_i \exp x_i }$$
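
Both losses are available in PyTorch; a minimal sketch checking the two formulas against the library implementations (the numbers are arbitrary):

```python
import torch
from torch import nn

x = torch.tensor([[2.0, 1.0, 0.1]])   # logits for one sample, three classes
y = torch.tensor([0])                 # the true class index

# Cross-entropy: L = -log( exp(x_y) / sum_i exp(x_i) )
ce = nn.CrossEntropyLoss()
print(ce(x, y))                                            # library
print(-torch.log(torch.exp(x[0, 0]) / torch.exp(x[0]).sum()))  # by hand

# MSE: L = (x - y)^2, averaged over the elements
mse = nn.MSELoss()
pred = torch.tensor([0.5, 1.5])
target = torch.tensor([1.0, 1.0])
print(mse(pred, target))               # library
print(((pred - target) ** 2).mean())   # by hand
```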

## Training

+ Optimisation problem
    + tune the weights to minimise the loss function
    + if the activation function is differentiable, then so is the whole
      network, and gradients can be computed by the chain rule (backpropagation)
    + different optimisation algorithms use these gradients;
      trust the API or take a more advanced module
      (see the sketch after this list)
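
As an illustration, a single neuron tuned by plain gradient descent; the SGD optimiser here is just one choice from the API:

```python
import torch

# One neuron trained end to end; autograd applies the chain rule
# through the differentiable activation automatically.
w = torch.randn(3, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([1.0])

opt = torch.optim.SGD([w, b], lr=0.1)
for step in range(100):
    opt.zero_grad()
    out = torch.sigmoid(w @ x + b)   # weighted input + activation
    loss = ((out - y) ** 2).mean()   # MSE loss
    loss.backward()                  # gradients via the chain rule
    opt.step()                       # tune the weights
print(loss.item())
```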

## Activation Functions

+ Threshold functions
+ Approximations to the threshold function
+ Logistic: $f(x) = \frac1{1+e^{-\beta x}}$
+ ReLU: $f(x)=\max(x,0)$ (both compared in the sketch below)
    - not differentiable at $x=0$; a subgradient is used there in practice
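
A minimal sketch of the two functions in PyTorch:

```python
import torch

x = torch.linspace(-3.0, 3.0, steps=7)

beta = 1.0
logistic = 1 / (1 + torch.exp(-beta * x))   # smooth threshold approximation
relu = torch.clamp(x, min=0.0)              # max(x, 0)

print(logistic)
print(relu)
```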

## Tools

Two main contenders.

+ TensorFlow
+ PyTorch
    + A replacement for NumPy to use the power of GPUs and other accelerators.
    + An automatic differentiation library that is useful to implement neural networks.

Note that PyTorch can serve as a replacement for NumPy; i.e. it is primarily a Python tool,
and operates in the object-oriented framework of Python.
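
For example (a minimal sketch; the GPU line only runs if CUDA is available):

```python
import torch

a = torch.ones(3, 3)             # tensors behave much like NumPy arrays
b = 2 * a + 1                    # elementwise arithmetic, as in NumPy
print(b.numpy())                 # convert to an actual NumPy array

if torch.cuda.is_available():    # use a GPU or other accelerator if present
    b = b.to("cuda")
```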

The reason for using PyTorch in these examples is primarily that I have
lately been working off some code created by some final year students 
this Spring, and they happened to choose PyTorch.
The choice of TensorFlow or PyTorch is otherwise arbitrary.

## Sample Program

### Training

```python
import torch
from torch import nn

# Inception3 and nparams are defined elsewhere in the project.
model = Inception3(num_outputs=nparams)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters())

model.train()
for epoch in range(num_epochs):
    tloss = 0.0
    for i, (images, params) in enumerate(trainloader):
        optimizer.zero_grad()               # clear old gradients
        output = model(images)              # forward pass
        loss = criterion(output, params)
        loss.backward()                     # backpropagation
        optimizer.step()                    # update the weights
        tloss += loss.item() * len(images)  # accumulate epoch loss
    print(f"Epoch {epoch+1}: Loss = {tloss}")
```

### Testing

```python
total_loss = 0
model.eval()                      # switch off training-only behaviour
with torch.no_grad():             # no gradients needed for evaluation
    for images, params in testloader:
        output = model(images)
        loss = criterion(output, params)
        total_loss += loss.item() * len(images)
```

## What is a Convolutional Network

+ Convolutional Layers
    + $3\times3$ or $5\times5$ (possibly $7\times7$) filters
+ Pooling
    + Often maximum; sometimes average or other functions
+ Convolution-Detector-Pooling = Convolutional Unit (sketched in code below)
    + Convolution
    + Activation (Detector)
    + Pooling
Why is convolution suitable for images? Nearby pixels are strongly correlated, and the same local pattern (an edge, say) is meaningful wherever it appears; a small filter slid across the whole image exploits both properties.

## Features of CNN

+ Capacity (degrees of freedom)
    + fewer weights (DOF) per layer
+ Representation learning
    - may learn edge or corner detection for instance
+ Location invariance
+ Hierarchies; serial and parallel modularisation
+ Fully connected layer at the end (see the sketch below)
    - convolutional units learn features
    - last layer uses the features like a traditional ANN with one hidden layer
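
Putting several units in front of a fully connected head, as a sketch under assumed sizes:

```python
import torch
from torch import nn

# Two convolutional units learn features; the fully connected
# layer at the end combines them like a traditional ANN.
net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),                  # 32 channels x 8 x 8 pixels
    nn.Linear(32 * 8 * 8, 10),     # e.g. 10 output classes
)
print(net(torch.randn(1, 3, 32, 32)).shape)   # torch.Size([1, 10])
```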

## Considerations in CNN

+ Padding (edge conditions for convolution)
+ Batch Normalisation (see the sketch below)
    + mini batches
+ Filter/Kernel
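
A sketch of padding and batch normalisation together; the sizes are again assumptions:

```python
import torch
from torch import nn

block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=5, padding=2),  # padding keeps 32x32 ("same" size)
    nn.BatchNorm2d(16),                          # normalise over the mini-batch
    nn.ReLU(),
)

x = torch.randn(8, 3, 32, 32)    # a mini-batch of 8 images
print(block(x).shape)            # torch.Size([8, 16, 32, 32])
```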

## Loss Functions and Evaluation

## Other important features (any ANN)

## Some practical issues

+ Normalisation
+ Over- and under-training
    + plot the loss over epochs (see the sketch below)
+ Feature detection
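
A possible plot, assuming matplotlib and per-epoch losses collected during training; the numbers are invented for illustration:

```python
import matplotlib.pyplot as plt

# Illustrative per-epoch losses; in practice collect these
# in the training loop above.
train_losses = [1.00, 0.60, 0.40, 0.30, 0.25, 0.22]
val_losses   = [1.05, 0.70, 0.52, 0.48, 0.53, 0.60]  # turning upward: over-training

plt.plot(train_losses, label="training")
plt.plot(val_losses, label="validation")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```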