---
title: Neural Networks
categories: session
---

The material on this page is meant for 3-4 sessions.
One exercise per day is a good pace; some may
be able to complete two.

+ [Briefing 10 November](ANN)
+ [Briefing 11 November](CNN)

# Reading

+ [PyTorch Quickstart](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html)
+ [Learn the Basics](https://pytorch.org/tutorials/beginner/basics/intro.html)
    + see also the tutorial under Exercise 1 below
+ Szeliski 2022 Chapter 5


# Exercise 1. Basic tutorial.

I have added a couple of exercises to the official
[PyTorch Quickstart](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html).

+ Download and open the (augmented) [tutorial](ann-tutorial.ipynb).
+ Please reflect upon and discuss the questions.

# Exercise 2. Convolutional Neural Networks

1. Complete the [CIFAR-10 Tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html)
2. Compare the approach with that of Exercise 1.  What is different?
   What is the same?
   
# Exercise 3. Regression.

One in-house project is to classify images of remote galaxies,
which are often distorted due to the gravity of dark matter.
This effect is known as gravitational lensing.

A sample dataset can be found at
[GitHub](https://github.com/CosmoAI-AES/datasets2022/Exercise2022).
This directory contains

+ A data file, `sphere-pm.csv` 
+ 10000 images of distorted galaxies
+ A Python file `Dataset.py` defining a subclass of `Dataset`
  to manage these data

The CSV file has the form
```
index,filename,source,lens,chi,x,y,einsteinR,sigma,sigma2,theta,nterms
"00001",image-00001.png,s,p,50,30,40,19,31,0,0,16
```
The interesting columns are `filename`, which points to the
input image, and
the four output variables `x`, `y`, `einsteinR`, and `sigma`.
The other columns are associated with more advanced problem instances
and should be ignored.
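
To make the column selection concrete, here is a minimal sketch that reads a file in the format above using only the standard library and keeps the filename together with the four output variables. It is not the code in `Dataset.py`; the function name `read_targets` and the in-memory sample are assumptions made for illustration.

```python
import csv
import io

# One-row sample in the format shown above; in practice, open `sphere-pm.csv`.
SAMPLE = '''index,filename,source,lens,chi,x,y,einsteinR,sigma,sigma2,theta,nterms
"00001",image-00001.png,s,p,50,30,40,19,31,0,0,16
'''

# The four output variables; every other column is ignored.
TARGET_COLUMNS = ["x", "y", "einsteinR", "sigma"]


def read_targets(fileobj):
    """Return a list of (filename, [x, y, einsteinR, sigma]) pairs."""
    rows = []
    for record in csv.DictReader(fileobj):
        targets = [float(record[name]) for name in TARGET_COLUMNS]
        rows.append((record["filename"], targets))
    return rows


rows = read_targets(io.StringIO(SAMPLE))
print(rows)
```

To read the real file, one would pass `open("sphere-pm.csv", newline="")` instead of the `io.StringIO` object.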

1.  Study the Dataset class `Dataset.py`.  How is the dataset managed?
2.  Use this class to test if you can train a network to determine the
    four outputs for an image.
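
As a starting point for step 2, the sketch below shows one possible regression setup in PyTorch: a small convolutional network with four real-valued outputs, trained with mean squared error instead of the cross-entropy loss used for classification. Everything specific here is an assumption: the class name `RegressionCNN`, the layer sizes, the single-channel $64\times64$ input, and the random tensors that stand in for batches from the dataset class.

```python
import torch
from torch import nn


class RegressionCNN(nn.Module):
    """Small CNN mapping a one-channel image to four real-valued outputs."""

    def __init__(self, n_outputs=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=5), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),  # fixed 4x4 feature map for any input size
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 4 * 4, 64), nn.ReLU(),
            nn.Linear(64, n_outputs),  # no softmax: the outputs are real numbers
        )

    def forward(self, x):
        return self.head(self.features(x))


torch.manual_seed(0)
model = RegressionCNN()
loss_fn = nn.MSELoss()  # regression loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random stand-ins for one batch; real batches would come from a DataLoader.
images = torch.rand(8, 1, 64, 64)
targets = torch.rand(8, 4)

losses = []
for step in range(5):
    optimizer.zero_grad()
    pred = model(images)
    loss = loss_fn(pred, targets)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(pred.shape, losses)
```

Because the four targets are on different scales, rescaling them to a common range before training may help; whether that is necessary here is something to test.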

# Exercise 4.  Own Data (Optional)

Making your own training data can be difficult because of the
large quantity of data needed, but if you want to give it
a try, I would recommend the following approach.

1.  Team up with everybody else who wants to do this problem.
2.  Each team member makes a set of hand-drawn digits, with at least
    ten versions of each digit.
3.  Digitise the digits into image files.  Make sure that everybody
    uses the same resolution.  
    + The resolution must be large enough to make digits recognisable,
      but small enough to save computational power.
    + Somewhere between $28\times28$ (same as the MNIST dataset)
      and $64\times64$ may be considered.
4.  Gather the images into a dataset with their labels (digit value).
    You may use Exercise 3 as an example.
5.  Merge and share the dataset.
6.  Train and test a neural network to read the digits.
7.  Try different networks, at least the ones you have used in previous
    exercises.
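
Steps 4 and 5 become easier if everybody agrees on a file naming scheme from which the label can be read. The sketch below assumes a hypothetical convention, `digit-<label>-<author>-<serial>.png`, and builds a CSV index of filenames and labels similar in spirit to the one in Exercise 3. The pattern, the function names, and the example filenames are illustrative only.

```python
import csv
import io
import re

# Hypothetical naming convention: digit-<label>-<author>-<serial>.png
PATTERN = re.compile(r"digit-(\d)-([a-z]+)-(\d+)\.png$")


def build_index(filenames):
    """Return sorted (filename, label) rows for filenames matching PATTERN."""
    rows = []
    for name in sorted(filenames):
        match = PATTERN.match(name)
        if match:  # silently skip files that do not follow the convention
            rows.append((name, int(match.group(1))))
    return rows


def write_index(rows, fileobj):
    """Write the rows as CSV with a header line."""
    writer = csv.writer(fileobj)
    writer.writerow(["filename", "label"])
    writer.writerows(rows)


files = ["digit-7-alice-01.png", "digit-0-bob-03.png", "notes.txt"]
index = build_index(files)

buffer = io.StringIO()
write_index(index, buffer)
print(buffer.getvalue())
```

With real data, `files` could come from `os.listdir` on the shared image directory, and `buffer` would be replaced by a file opened with `newline=""`.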

# Exercise 5.  More Examples (Optional)

Other datasets can be found in
[this collection](https://paperswithcode.com/datasets?task=image-classification)
if you want to try other variants.