---
title: Neural Networks for Regression
categories: session
---

# Reading

+ Szeliski 2022 Chapter 5

# Exercise 3. Regression.

One in-house project is to classify images of remote galaxies,
which are often distorted due to the gravity of dark matter.
This effect is known as gravitational lensing.

A sample dataset can be found at
[GitHub](https://github.com/CosmoAI-AES/datasets2022/Exercise2022).
This directory contains

+ A data file, `sphere-pm.csv`
+ 10,000 images of distorted galaxies
+ A Python file, `Dataset.py`, defining a subclass of `Dataset`
  to manage these data

The CSV file has the form
```
index,filename,source,lens,chi,x,y,einsteinR,sigma,sigma2,theta,nterms
"00001",image-00001.png,s,p,50,30,40,19,31,0,0,16
```
The relevant columns are `filename`, which points to the
input image, and
the four output variables `x`, `y`, `einsteinR`, and `sigma` ($\sigma$).
The other columns are associated with more advanced problem instances
and should be ignored.
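
As a minimal sketch of how these columns can be extracted, the snippet below parses the sample row shown above using only the Python standard library. The function name `read_samples` and the constant `TARGETS` are illustrative assumptions; the provided `Dataset.py` may organise this differently.

```python
import csv
import io

# The sample row from the text, held in memory so the sketch runs without the dataset.
SAMPLE = '''index,filename,source,lens,chi,x,y,einsteinR,sigma,sigma2,theta,nterms
"00001",image-00001.png,s,p,50,30,40,19,31,0,0,16
'''

# Output columns named in the text; all other columns are ignored.
TARGETS = ["x", "y", "einsteinR", "sigma"]


def read_samples(fileobj):
    """Return a list of (filename, [x, y, einsteinR, sigma]) pairs."""
    samples = []
    for row in csv.DictReader(fileobj):
        targets = [float(row[name]) for name in TARGETS]
        samples.append((row["filename"], targets))
    return samples


samples = read_samples(io.StringIO(SAMPLE))
print(samples)
```

To read the real file instead, pass `open("sphere-pm.csv", newline="")` in place of the in-memory buffer.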

1.  Study the Dataset class in `Dataset.py`.  How is the dataset managed?
2.  Use this class to test whether you can train a network to predict the
    four outputs for an image.
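
Since the network predicts four real-valued outputs, a common choice of training loss is the mean squared error. This is an assumption for illustration, not a loss prescribed by the exercise. For a batch of $N$ images with target vectors $\mathbf{t}_i = (x_i, y_i, R_i, \sigma_i)$, where $R_i$ denotes `einsteinR`, and network outputs $\hat{\mathbf{t}}_i$:

$$
L = \frac{1}{4N} \sum_{i=1}^{N} \lVert \hat{\mathbf{t}}_i - \mathbf{t}_i \rVert^2
$$

Because the four outputs may have different ranges, it may be worth rescaling them to comparable magnitudes before training, so that no single output dominates the loss.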

# Exercise 4.  Own Data (Optional)

Making your own training data can be difficult because of the
large quantity of data that you need, but if you want to give it
a try, I would recommend the following approach.

1.  Team up with everybody else who wants to do this problem.
2.  Each team member makes a set of hand-drawn digits, with at least
    ten versions of each digit.
3.  Digitise the digits into image files.  Make sure that everybody
    uses the same resolution.  
    + The resolution must be high enough to make the digits recognisable,
      but low enough to save computational power.
    + Somewhere between $28\times28$ (same as the MNIST dataset)
      and $64\times64$ may be considered.
4.  Gather the images into a dataset with their labels (digit value).
    You may use Exercise 3 as an example.
5.  Merge and share the dataset.
6.  Train and test a neural network to read the digits.
7.  Try different networks, at least the ones you have used in previous
    exercises.
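
Step 4 can be sketched as follows, using only the Python standard library. The sketch assumes a hypothetical file naming convention, `<author>-<digit>-<version>.png`, which the team would have to agree on. The function names `build_index` and `write_index` and the CSV column names are likewise illustrative, loosely modelled on the CSV file in Exercise 3.

```python
import csv
import io
import re

# Hypothetical naming convention, e.g. "alice-7-03.png":
# author, digit value (0-9), and version number.
PATTERN = re.compile(r"^(?P<author>[a-z]+)-(?P<digit>[0-9])-(?P<version>[0-9]+)\.png$")


def build_index(filenames):
    """Return one dict per valid filename, with the digit label as an int."""
    rows = []
    for name in sorted(filenames):
        match = PATTERN.match(name)
        if match is None:
            continue  # skip files that do not follow the convention
        rows.append({"filename": name,
                     "author": match["author"],
                     "label": int(match["digit"])})
    return rows


def write_index(rows, fileobj):
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=["filename", "author", "label"])
    writer.writeheader()
    writer.writerows(rows)


rows = build_index(["bob-3-01.png", "alice-7-03.png", "notes.txt"])
buffer = io.StringIO()
write_index(rows, buffer)
print(buffer.getvalue())
```

Keeping the author in the index makes it possible to hold out one person's handwriting as a test set, which gives a more honest estimate of how well the network reads digits from an unseen writer.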