
Machine Learning and Statistics

Reading

  • R&N Chapter 19 (read Sections 19.3-19.5 only cursorily)

Briefing

  • AI and Cybernetic Systems are often described as either model-driven or data-driven.
    • what is the difference?
    • examples?
  • Note that there is not necessarily a sharp boundary between the two. We may have partial models in a data-driven approach.

Florence Nightingale

The Lady with the Lamp
  • Nurse in the Crimean War 1853-56
  • First female member of the Royal Statistical Society (1859)
  • A pioneer in using statistics to influence policy
  • Key to sanitary reforms in the British Army

If doctors wash their hands more frequently, there are fewer deaths in their ward.

  • This is a simple quantitative problem.
  • count hand washing events per ward
  • count death events per ward
  • compare the numbers
  • A key contribution of Ms Nightingale’s was the visualisation of these numbers to make the authorities understand their implication.
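The counting steps above can be sketched in a few lines of Python. This is only an illustration: the event log, the ward names, and the helper `count_per_ward` are hypothetical and invented for the example, not historical data.

```python
from collections import Counter

# Hypothetical event log of (ward, event type) pairs.
# The records are made up for illustration only.
events = [
    ("A", "hand_wash"), ("A", "hand_wash"), ("A", "hand_wash"), ("A", "death"),
    ("B", "hand_wash"), ("B", "death"), ("B", "death"), ("B", "death"),
]

def count_per_ward(events, event_type):
    """Count how many events of the given type occurred in each ward."""
    return Counter(ward for ward, kind in events if kind == event_type)

washes = count_per_ward(events, "hand_wash")
deaths = count_per_ward(events, "death")

# Compare the two counts ward by ward.
for ward in sorted(set(washes) | set(deaths)):
    print(f"ward {ward}: {washes[ward]} hand washes, {deaths[ward]} deaths")
```

A table of such counts is the raw material; the visualisation is what made it persuasive.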

Modelling it

Today we would say that Ms Nightingale’s observation is obvious. We have a simple causal model that explains why doctors should wash their hands.

  1. Viruses and bacteria in wounds cause infection and death.
  2. Viruses and bacteria are carried around by dirty hands.
  3. Hence, dirty hands cause death.
  4. Viruses and bacteria can (largely) be washed off.
  5. Clean hands carry fewer viruses and bacteria around than dirty hands.
  6. Hence, clean hands give less infection and death than dirty hands.

This causal model was not immediately available to Ms Nightingale, and therefore she chose a data-driven model. She observed that

  1. Some wards have few hand washes and many deaths.
  2. Some wards have many hand washes and few deaths.
  3. Wards with many hand washes and many deaths, or with few hand washes and few deaths, are very rare.
  4. We infer that there is a decreasing function \(y=f(x)\) giving the approximate number of deaths \(y\) as a function of the number of hand washes \(x\).

This is an example of regression analysis.
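As a minimal sketch of what such a regression might look like, the code below fits a straight line \(y = ax + b\) by ordinary least squares. The choice of a linear \(f\), the helper `fit_line`, and the per-ward numbers are all assumptions made for illustration; the numbers are invented, not historical data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical per-ward counts, invented for illustration only.
hand_washes = [10, 20, 30, 40, 50]
deaths = [24, 19, 16, 9, 7]

slope, intercept = fit_line(hand_washes, deaths)
print(f"deaths ~ {slope:.2f} * hand_washes + {intercept:.2f}")
```

With these invented numbers the fitted slope is negative, in line with the observation that more hand washing goes with fewer deaths. Note that the fit only describes the association in the data; it does not explain why the association holds, which is what the causal model adds.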

Big Data