Neural Networks Lecture

Briefing

What is a neural network

The single Neuron
- Weighted Input
- Activation
The network model
- Input/Output
- Weights
- Activation Function
The Tensor Model
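
The single neuron and the network model can be sketched in a few lines of plain Python. This is a minimal illustration, not how PyTorch implements it: the names `neuron`, `layer` and `logistic` are made up for this example, and the bias term is an assumption that the outline above does not mention.

```python
import math

def logistic(z):
    """Logistic activation with slope parameter 1."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias (assumed), then the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def layer(inputs, weight_matrix, biases, activation):
    """One layer: each row of the weight matrix defines one neuron."""
    return [neuron(inputs, row, b, activation)
            for row, b in zip(weight_matrix, biases)]

x = [1.0, 2.0]
# Weighted input is 0.5*1 - 0.25*2 = 0, so the logistic output is 0.5.
single = neuron(x, [0.5, -0.25], 0.0, logistic)
# Two neurons sharing the same input; the weights form a 2x2 matrix.
hidden = layer(x, [[0.5, -0.25], [1.0, 1.0]], [0.0, -3.0], logistic)
```

In the tensor model the loop in `layer` becomes a single matrix-vector product, which is what makes GPUs useful.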

Output and Loss Function

Classification versus Regression
- Two-class classification (0 or 1)
- Regression \(y=f(x)\in \mathbb{R}\)
- Multiclass \(y=(y_1,y_2,\ldots,y_n)\)
  - \(y_i=1\) means membership in class \(i\)
Soft-decision: \(y\) is a continuous variable
- higher values mean that class 1 is more probable
Loss function
- Mean-squared error (common for regression) \[L = (x-y)^2\]
- Cross Entropy (common for classification) \[L = -\log \frac{ \exp x_{y} } { \sum_i \exp x_i }\]
- There are others
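
As a rough sketch, the two loss functions can be computed by hand as follows. The function names are chosen for this example, the squared error is averaged over the outputs, and the cross entropy is taken as the negative log of the softmax probability of the true class, where the scores \(x_i\) are the raw network outputs and `true_class` is the index \(y\).

```python
import math

def mse(outputs, targets):
    """Mean of the squared differences between outputs and targets."""
    return sum((x - y) ** 2 for x, y in zip(outputs, targets)) / len(outputs)

def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    m = max(scores)                      # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, true_class):
    """Negative log of the probability assigned to the true class."""
    return -math.log(softmax(scores)[true_class])

mse_value = mse([1.0, 2.0], [0.0, 4.0])       # (1 + 4) / 2
probs = softmax([0.0, 0.0, 0.0])              # three equally likely classes
ce_value = cross_entropy([0.0, 0.0, 0.0], 1)  # equals log(3)
```

The softmax output is also an example of a soft decision: each class gets a continuous score rather than a hard 0 or 1.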

Training

Optimisation problem
- tune the weights to minimise the loss function
- if the activation function is differentiable, the entire network is differentiable, and gradient-based optimisation can be used
- different optimisation algorithms; trust the API or do a more advanced module
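
To make the optimisation problem concrete, here is a minimal sketch of plain gradient descent on a model with a single weight. It is not the algorithm used in the sample program further down (which uses Adam); the function name, learning rate and toy data are assumptions for illustration.

```python
def train_linear(data, learning_rate=0.1, epochs=100):
    """Fit y = w * x by gradient descent on the mean-squared error."""
    w = 0.0
    for _ in range(epochs):
        # For L = mean((w*x - y)^2), the derivative is dL/dw = mean(2*(w*x - y)*x).
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # toy data generated with y = 2x
w = train_linear(data)                         # converges towards 2
```

In a real network the gradient is computed by automatic differentiation rather than by hand, which is what `loss.backward()` does in PyTorch.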

Activation Functions

Threshold functions
Approximations to the threshold function
Logistic: \(f(x) = \frac1{1+e^{-\beta x}}\)
ReLU: \(f(x)=\max(x,0)\)
- not differentiable at \(x=0\)
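
The three kinds of activation function can be written out directly. This is a small sketch with names chosen for the example; the threshold is placed at zero and returns 1 at exactly zero, which is an assumption.

```python
import math

def threshold(x):
    """Hard threshold at zero (assumed convention: output 1 when x >= 0)."""
    return 1.0 if x >= 0 else 0.0

def logistic(x, beta=1.0):
    """Smooth approximation to the threshold; larger beta gives a steeper step."""
    return 1.0 / (1.0 + math.exp(-beta * x))

def relu(x):
    """Rectified linear unit: zero for negative input, identity otherwise."""
    return max(x, 0.0)

gentle = logistic(0.5)            # between 0.5 and 1
steep = logistic(0.5, beta=50.0)  # very close to threshold(0.5)
```

Increasing \(\beta\) makes the logistic function approach the threshold function while keeping it differentiable everywhere.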

Tools

Two main contenders.

TensorFlow
PyTorch
- A replacement for NumPy to use the power of GPUs and other accelerators.
- An automatic differentiation library that is useful to implement neural networks.

Note that PyTorch replaces NumPy; i.e. it is primarily a Python tool, and operates in the object-oriented framework of Python.

The reason for using PyTorch in these examples is primarily that I have lately been working off some code created by some final year students this Spring, and they happened to choose PyTorch. The choice of TensorFlow or PyTorch is otherwise arbitrary.

Sample Program

Training

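import torch
from torch import nn

# Inception3, nparams, num_epochs and trainloader are assumed to be defined elsewhere.
model = Inception3(num_outputs=nparams)
criterion = nn.MSELoss()
# Adam with its default settings
optimizer = torch.optim.Adam(model.parameters())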

model.train()
for epoch in range(num_epochs):
    tloss = 0.0
    for i, (images, params) in enumerate(trainloader):
        optimizer.zero_grad()

        output = model(images)
        loss = criterion(output, params)
        loss.backward()
        optimizer.step()

        tloss += loss.item() * len(images)

    print( f"Epoch {epoch+1}: Loss = {tloss}" )

Testing

total_loss = 0
model.eval()
with torch.no_grad():
    for images, params in testloader:
        output = model(images)
        loss = criterion(output, params)
        # .item() extracts a Python float, as in the training loop
        total_loss += loss.item() * len(images)

Loss Functions and Evaluation

Accuracy: ratio of correctly classified items
What is the difference between a rate and a probability?
Statistics
- Standard deviation
- Hypothesis Tests
- Confidence Interval
Other heuristics
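
The observed accuracy is a rate measured on a finite test set, and it only estimates the underlying probability of a correct classification. A minimal sketch of the accuracy together with a confidence interval follows; it assumes the normal approximation to the binomial distribution at the 95% level (\(z = 1.96\)), and the predictions and labels are made-up toy values.

```python
import math

def accuracy(predictions, labels):
    """Ratio of correctly classified items."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)

def accuracy_interval(acc, n, z=1.96):
    """Approximate confidence interval for the true probability of a correct answer."""
    stderr = math.sqrt(acc * (1 - acc) / n)   # standard deviation of the estimate
    return acc - z * stderr, acc + z * stderr

preds  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # toy predictions
labels = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]   # toy ground truth
acc = accuracy(preds, labels)              # 8 of 10 correct
low, high = accuracy_interval(acc, len(labels))
```

With only ten test items the interval is very wide and the normal approximation is crude; the width shrinks in proportion to \(1/\sqrt{n}\) as the test set grows.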

Computational Power

Neural Networks are Computationally Expensive
GPU or CPU - what’s the difference?
- what resources do you have?
Remedies
- Reduce image resolution
- Reduce number of images
- Reduce number of epochs
In particular, it is necessary to sacrifice accuracy during development and testing.
In the final stages you may need big datasets to achieve satisfactory results, and then you may need more computing power.
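
A back-of-the-envelope sketch of how the remedies combine. It rests on a crude assumption that is not from the lecture: training cost grows linearly with the number of pixels, the number of images and the number of epochs. Real timings depend on the model and the hardware, so treat the result as an order-of-magnitude guide only.

```python
def relative_cost(resolution_scale, image_fraction, epoch_fraction):
    """Training cost relative to the full run, under the linear-cost assumption.

    resolution_scale scales both width and height, so pixels scale with its square.
    """
    return resolution_scale ** 2 * image_fraction * epoch_fraction

# Half the resolution, a quarter of the images, half the epochs.
dev_cost = relative_cost(0.5, 0.25, 0.5)
```

Under these assumptions a development run costs about 3% of the full run, which is why it pays to work at reduced scale until the code is known to be correct.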