# Neural Networks

## Changes from 84c00db429f7d777d256a73489eb9606314de5e3 to 8e1c340e28cd8d0ec9d3ff801380dc0d165a5594

---
title: Neural Networks
categories: session
---

You should be content if you do one exercise in a day; some may
be able to complete two in a day.

+ [Briefing 10 November](ANN)
+ [Briefing 11 November](CNN)

+ [PyTorch Quickstart](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html)
+ [Learn the Basics](https://pytorch.org/tutorials/beginner/basics/intro.html)
+ Szeliski 2022 Chapter 5

# Exercise 1. Basic tutorial.

I have added a couple of exercises to the official
[PyTorch Quickstart](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html).

+ Please reflect upon and discuss the questions.

# Exercise 2. Convolutional Neural Networks

1. Complete the [CIFAR-10 Tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html)
2. Compare the approach to Exercise 1.  What is different?
What is the same?

# Exercise 3. Regression.

One in-house project is to classify images of remote galaxies,
which are often distorted due to the gravity of dark matter.
This effect is known as gravitational lensing.

A sample dataset can be found at
[GitHub](https://github.com/CosmoAI-AES/datasets2022/Exercise2022).
This directory contains

+ A data file, sphere-pm.csv
+ 10000 images of distorted galaxies
+ A Python file Dataset.py defining a subclass of Dataset
to manage these data

The CSV file has the form

    index,filename,source,lens,chi,x,y,einsteinR,sigma,sigma2,theta,nterms
    "00001",image-00001.png,s,p,50,30,40,19,31,0,0,16

The interesting columns are the filename, which points to the
input image, and
the four output variables $x$, $y$, einsteinR, and $\sigma$.
The other columns are associated with more advanced problem instances
and should be ignored.
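As a minimal sketch of how the relevant columns can be picked out, the following reads the sample lines quoted above with the standard csv module. It is not the code in Dataset.py. The names `read_samples` and `TARGET_COLUMNS` are made up for this illustration, and it assumes that $\sigma$ corresponds to the column called sigma.

```python
import csv
import io

# The two sample lines quoted above; in practice, open sphere-pm.csv instead.
SAMPLE = '''index,filename,source,lens,chi,x,y,einsteinR,sigma,sigma2,theta,nterms
"00001",image-00001.png,s,p,50,30,40,19,31,0,0,16
'''

# Assumption: the four outputs are the columns named x, y, einsteinR and sigma.
TARGET_COLUMNS = ("x", "y", "einsteinR", "sigma")


def read_samples(handle):
    """Return a list of (filename, targets) pairs, ignoring all other columns."""
    samples = []
    for row in csv.DictReader(handle):
        targets = [float(row[name]) for name in TARGET_COLUMNS]
        samples.append((row["filename"], targets))
    return samples


samples = read_samples(io.StringIO(SAMPLE))
print(samples)
```

Each pair gives the image to load and the four numbers the network should learn to predict for it.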

# Exercise 3. Evaluation and Analysis

1.  Study the Dataset class Dataset.py.  How is the dataset managed?
2.  Use this class to test whether you can train a network to predict the
four outputs from an image.
1.  Copy the code out of the tutorials so that you can run it
in your regular environment (IDE, command line, etc.).
2.  Test one of the networks with different numbers of epochs.
Record the accuracy both on the training set and on the testing
set for each epoch.
3.  Plot the training and testing accuracy as a function of the number
of epochs.  What do you see?
4.  What is the ideal number of epochs?
5.  Calculate a confidence interval for the accuracy at the ideal
number of epochs.  What do you think of the quality of the network?
What do you think about the quality of the assessment of the network?
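For the confidence interval in the last item, one common choice is the normal approximation to the binomial distribution. The sketch below assumes that this is an acceptable method here and that the accuracy is the fraction of correctly classified test images. The function name `accuracy_interval` and the example counts are hypothetical.

```python
import math


def accuracy_interval(correct, total, z=1.96):
    """Approximate 95% confidence interval for a classification accuracy.

    Uses the normal approximation to the binomial distribution, which is
    reasonable when both correct and total - correct are fairly large.
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clip to [0, 1], since an accuracy cannot lie outside that range.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical numbers: 5400 correct out of 10000 test images.
low, high = accuracy_interval(5400, 10000)
print(f"95% confidence interval: [{low:.3f}, {high:.3f}]")
```

The width of the interval shrinks with the square root of the test set size, which is worth keeping in mind when you judge the quality of the assessment.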

# Exercise 4.  Own Data (Optional)
Note that you do not have to rerun the entire process for more epochs.
You can test the network after each epoch and continue training.
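The pattern can be sketched as a loop that evaluates after every epoch and keeps a history. This is only an outline under stated assumptions: `train_one_epoch` and `evaluate` are hypothetical callables that you would write around your own model, for instance by wrapping the training and testing functions from the tutorial. The dummy functions at the bottom exist only so that the sketch runs without a real network.

```python
def train_with_history(train_one_epoch, evaluate, epochs):
    """Train for `epochs` epochs, evaluating after every single epoch.

    train_one_epoch() runs one pass over the training set, and
    evaluate() returns (train_accuracy, test_accuracy). Both are
    hypothetical callables wrapping your own model and data loaders.
    """
    history = []
    for epoch in range(1, epochs + 1):
        train_one_epoch()  # continues from the state left by the previous epoch
        train_acc, test_acc = evaluate()
        history.append((epoch, train_acc, test_acc))
    return history


# Dummy stand-ins so that the sketch runs without a real network.
state = {"epoch": 0}


def fake_train():
    state["epoch"] += 1


def fake_evaluate():
    e = state["epoch"]
    return min(1.0, 0.5 + 0.1 * e), min(1.0, 0.5 + 0.05 * e)


history = train_with_history(fake_train, fake_evaluate, epochs=3)
# The epoch with the highest test accuracy in the recorded history.
best_epoch = max(history, key=lambda row: row[2])[0]
print(history, best_epoch)
```

A single run of many epochs then gives the whole accuracy curve, so there is no need to restart training for each epoch count.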

Making your own training data may be difficult, because of the
large quantity of data that you need, but if you want to give it
a try, I would recommend the following approach.

# Exercise 4.  Different networks (optional)

1.  Team up with everybody else who wants to do this problem.
2.  Each team member makes a set of hand-drawn digits, with at least
ten versions of each digit.
3.  Digitise the digits into image files.  Make sure that everybody
uses the same resolution.
+ The resolution must be large enough to make digits recognisable,
but small enough to save computational power.
+ Somewhere between $28\times28$ (same as the MNIST dataset)
and $64\times64$ may be considered.
4.  Gather the images into a dataset with their labels (digit value).
You may use Exercise 3 as an example.
5.  Merge and share the dataset.
6.  Train and test a neural network to read the digits.
7.  Try different networks, at least the ones you have used in previous
exercises.
Try changing the neural networks in Exercises 1 and 2.
Can you improve the performance?

# Exercise 5.  More examples (optional)

Other datasets may be found at