title: Machine Learning and Statistics
categories: session

# Reading

+ R&N Chapter 19.  (19.3-19.5 only cursory)

# Briefing

+ AI and Cybernetic Systems are often described as either model-driven
  or data-driven.
    - what is the difference?
    - examples?
+ Note that there is not necessarily a sharp boundary between the two.
  We may have partial models in a data-driven approach.

## Florence Nightingale

![The Lady with the Lamp](

+ Nurse in the Crimean War 1853-56
+ First female member of the Royal Statistical Society (1859)
+ A pioneer in using statistics to influence policy
+ Key to sanitary reforms in the British Army

> If doctors wash their hands more frequently, there are fewer deaths
  in their ward.

+ This is a simple quantitative problem.
    - count hand washing events per ward
    - count death events per ward
    - compare the numbers
+ A key contribution of Ms Nightingale's was the visualisation of
  these numbers to make the authorities understand their implication.
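
The counting step can be sketched in a few lines of Python. This is only an illustration: the event log, the ward names and the helper `count_per_ward` are invented for the example and are not historical data.

```python
from collections import Counter

# Hypothetical event log of (ward, event) pairs, invented for illustration.
events = [
    ("A", "hand_wash"), ("A", "hand_wash"), ("A", "hand_wash"), ("A", "death"),
    ("B", "hand_wash"), ("B", "death"), ("B", "death"), ("B", "death"),
]

def count_per_ward(events, kind):
    """Count how many events of the given kind were recorded in each ward."""
    return Counter(ward for ward, event in events if event == kind)

washes = count_per_ward(events, "hand_wash")
deaths = count_per_ward(events, "death")

# Compare the two counts ward by ward.
for ward in sorted(set(washes) | set(deaths)):
    print(f"ward {ward}: {washes[ward]} hand washes, {deaths[ward]} deaths")
```

With only two variables and a handful of wards, the same tally could just as well be done on paper.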

## Modelling it

Today we would say that Ms Nightingale's observation is obvious.
We have a simple *causal* model to say that doctors should wash their
hands:

1.  Viruses and bacteria in wounds cause infection and death.
2.  Viruses and bacteria are carried around by dirty hands.
3.  Hence, dirty hands cause death.
4.  Viruses and bacteria can (largely) be washed off.
5.  Clean hands carry fewer viruses and bacteria around than dirty hands.
6.  Hence, clean hands give less infection and death than dirty hands.

This causal model was not immediately available to Ms Nightingale, and
therefore she chose a data-driven model.  She observed that

1.  Some wards have few hand washes and many deaths.
2.  Some wards have many hand washes and few deaths.
3.  Wards with many hand washes and many deaths, or with few of both,
    are very rare.
4.  We infer that there is a decreasing function $y=f(x)$ giving
    the approximate number of deaths $y$ as a function of the number
    of hand washes $x$.

This is an example of regression analysis.
The resulting model is one of correlation;
certain events, such as dirty hands and deaths, tend to co-occur.
No causality is implied.
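
As a rough illustration of such a regression, the sketch below fits a straight line $y = a + bx$ by least squares and computes the Pearson correlation coefficient. The per-ward numbers are invented, chosen so that deaths fall as hand washes rise, as in the quoted observation. The straight-line form of $f$ and the names `fit_line` and `pearson` are assumptions made for this example.

```python
from statistics import mean

# Invented per-ward counts: hand washes (x) and deaths (y).
washes = [5, 10, 15, 20, 25]
deaths = [22, 18, 15, 9, 6]

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns the pair (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return a, b

def pearson(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

a, b = fit_line(washes, deaths)
r = pearson(washes, deaths)
print(f"deaths ~ {a:.2f} + ({b:.2f}) * washes, correlation r = {r:.3f}")
```

The sign of the slope `b` gives the direction of the relationship and `r` its strength. Neither number says anything about why the two variables move together.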

## Big Data

What is the difference between statistics and machine learning?

+ Basically, statistics can be calculated by hand.
    + Florence Nightingale only observed two variables in the given
      example.
+ In machine learning we study data sets too large for manual
  calculation.
+ Degrees of big data.  
    + Modern methods achieve results which were not possible ten years ago.  
    + What can we do ten years from now?

## Machine Learning

+ Essentially we solve the same problem as Florence Nightingale
    - observe some variables that we can control (hand washing)
    - observe some variables that we want to control (deaths)
    - predict how the former set influences the latter set
    - use this information to control what we want to control
+ Alternatively,
    - observe some variables that are easily observed
    - observe some variables that cannot always be observed
    - find the relationship between the two sets
    - use the observable information to predict the unobservable
+ **Attention** To build the model we need to observe the unobservable
    - unobservable may mean observable only in hindsight
    - historical data for training
    - in the future we need predictions before the observations become
      available
+ This is the case for many intelligent agent systems
    - we want to predict the payoff of potential actions
    - we can observe our own action, but the payoff is only
      observable after we have acted
    - however, we can observe both action and payoff for previous
      actions
+ Machine Learning is always a question of modelling the relationship
  between the observable and the unobservable
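
The agent case can be sketched as follows, under simplifying assumptions: the history of (action, payoff) pairs is invented, and the "model" is simply the mean payoff observed for each action. The names `train` and `best_action` are hypothetical helpers for this example, not part of any library.

```python
from collections import defaultdict

# Hypothetical history of (action, payoff) pairs, observed in hindsight.
history = [
    ("left", 1.0), ("left", 3.0),
    ("right", 4.0), ("right", 6.0),
    ("wait", 0.0),
]

def train(history):
    """Build a model mapping each action to its mean observed payoff."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for action, payoff in history:
        totals[action] += payoff
        counts[action] += 1
    return {action: totals[action] / counts[action] for action in totals}

def best_action(model, candidates):
    """Pick the candidate action with the highest predicted payoff.

    Actions never seen in the history are skipped, because the model has
    no basis for predicting their payoff. Returns None if no candidate
    is known.
    """
    known = [action for action in candidates if action in model]
    if not known:
        return None
    return max(known, key=lambda action: model[action])

model = train(history)
print(model)
print(best_action(model, ["left", "right", "wait"]))
```

Training uses payoffs that were only observable after acting. The prediction is made before the next action, when the payoff is still unobservable.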

## Types of Machine Learning

+ Two main problems
    - regression
    - classification
+ Three classes
    - supervised (with access to some ground truth)
    - unsupervised (without ground truth)
    - reinforcement learning
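
To make supervised classification concrete, here is a minimal sketch of a nearest-neighbour classifier. The method is chosen only because it is short; it is not one of the algorithms named in these notes. The training points and labels are invented. The labels play the role of the ground truth, and the output is a discrete class rather than a number as in regression.

```python
import math

# Invented labelled training data: (features, label) pairs.
training = [
    ((1.0, 1.0), "low risk"),
    ((1.5, 2.0), "low risk"),
    ((5.0, 5.0), "high risk"),
    ((6.0, 4.5), "high risk"),
]

def classify(point, training):
    """Return the label of the training example closest to point (1-NN)."""
    _, label = min(training, key=lambda example: math.dist(point, example[0]))
    return label

print(classify((1.2, 1.4), training))
print(classify((5.5, 5.0), training))
```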

## Machine Learning as an Optimisation Problem

### Regression
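
One common way to phrase regression as an optimisation problem, given here as an assumption rather than as the formulation intended for this session, is to choose the parameters that minimise the mean squared error. For a straight line $y = a + bx$ this is $\min_{a,b} \frac{1}{n}\sum_{i=1}^{n} (a + b x_i - y_i)^2$. The sketch below minimises it by plain gradient descent. The data, the step size `rate` and the number of `steps` are invented for the example.

```python
# Invented data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def mse(a, b):
    """Mean squared error of the line y = a + b*x on the data."""
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(steps=5000, rate=0.05):
    """Minimise mse(a, b) by repeatedly stepping against the gradient."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error w.r.t. a and b.
        grad_a = sum(2 * (a + b * x - y) for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a + b * x - y) * x for x, y in zip(xs, ys)) / n
        a -= rate * grad_a
        b -= rate * grad_b
    return a, b

a, b = gradient_descent()
print(f"a = {a:.4f}, b = {b:.4f}, error = {mse(a, b):.2e}")
```

For a straight line the minimum can also be found in closed form. Gradient descent is shown because the same idea carries over to models where no closed form exists.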

### Classification

## Algorithms

+ ANN - Artificial Neural Networks
+ SVM - Support Vector Machines
+ PCA - Principal Component Analysis