Statistical Evaluation

Evaluating Machine Learning Models

Hans Georg Schaathun

NTNU, Noregs Teknisk-Naturvitskaplege Universitet

22 March 2023

How do you know if your machine learning model is good?

  • Suppose the task is classification
  • Each object is either
    • correctly classified
    • misclassified
  • Step 1 (Training)
    • Use training data $(\vec{x}_i,\vec{y}_i)$ to find a model $f_{\vec{c}}$.
    • Count errors: $N$ objects, $F$ misclassified, error rate $\rho=F/N$
  • Step 2 (Testing)
    • Independent dataset $(\vec{x'}_i,\vec{y'}_i)$ to estimate the error
    • Count errors: $N'$ objects, $F'$ misclassified, error rate $\rho'=F'/N'$
  • Step 3 (Validation)
    • Third dataset $(\vec{x''}_i,\vec{y''}_i)$
    • Count errors: $N''$ objects, $F''$ misclassified, error rate $\rho''=F''/N''$
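
The counting in each step is the same calculation applied to a different dataset. A minimal sketch in Python, assuming predictions and true labels are available as plain lists; the function name `error_rate` and the toy data are hypothetical and serve only as illustration:

```python
def error_rate(predictions, labels):
    """Return (N, F, rho): object count, misclassified count, error rate F/N."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one object to compute an error rate")
    # An object is misclassified when the predicted class differs from the label.
    f = sum(1 for p, y in zip(predictions, labels) if p != y)
    return n, f, f / n


# Hypothetical toy test set: 8 objects, two of them misclassified.
test_labels = [0, 1, 1, 0, 1, 0, 0, 1]
test_predictions = [0, 1, 0, 0, 1, 1, 0, 1]

n_test, f_test, rho_test = error_rate(test_predictions, test_labels)
print(f"N' = {n_test}, F' = {f_test}, rho' = {rho_test}")
```

The same function would be called once per dataset (training, testing, validation) to obtain $\rho$, $\rho'$ and $\rho''$.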

What do the error rates $\rho$, $\rho'$, $\rho''$ tell us?

  • Calculate error rate $\rho''$ from an independent sample (data set).
  • What does the error rate $\rho''$ tell us about the error probability $p_E$?

Absurd example

  1. Suppose you test on $N''=1$ items.
  2. You are lucky: the only item is correctly classified.
  3. $\rho''=\frac{0}{N''}=\frac{0}{1}=0$.
  4. Is the error probability zero?
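
The absurd example can be made concrete. If each item is misclassified independently with probability $p_E$, the chance of seeing no errors at all in $N''$ items is $(1-p_E)^{N''}$. The sketch below evaluates this; the independence assumption, the function name `prob_zero_errors` and the value $p_E=0.3$ are illustrative assumptions, not taken from the slides.

```python
def prob_zero_errors(p_error, n):
    """Probability of observing zero misclassifications in n independent items."""
    return (1 - p_error) ** n


# Hypothetical classifier that is wrong 30% of the time.
p_error = 0.3
for n in (1, 10, 100):
    print(f"N'' = {n:3d}: P(rho'' = 0) = {prob_zero_errors(p_error, n):.6f}")
```

With a single test item, even this poor classifier shows $\rho''=0$ in most experiments, so an observed error rate of zero says very little about $p_E$ when $N''$ is small.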

Statistics and Stochastic Variables

  • The error rate is a stochastic variable
  • Different value when you repeat the experiment
  • Standard deviation of the error rate, estimated from $\rho$ and $N$

$$\hat\sigma = \sqrt{\frac{\rho(1-\rho)}{N}}$$

  • About 2/3 of observations within $\pm1\sigma$
  • About $95\%$ of observations within $\pm2\sigma$
  • More than $99.5\%$ of observations within $\pm3\sigma$
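
A small simulation may help show that the error rate really does vary between experiments, and how the $\pm2\sigma$ rule plays out. This is a sketch under stated assumptions: each object is misclassified independently with a known probability, and the values $p_E=0.1$, $N=100$, the seed and the number of repetitions are all hypothetical. Here $\sigma$ is computed from the known $p_E$; in practice the observed $\rho$ is used instead, which gives $\hat\sigma$.

```python
import math
import random


def std_error(rho, n):
    """Standard deviation of an error rate rho measured on n objects."""
    return math.sqrt(rho * (1 - rho) / n)


def simulate_error_rate(p_error, n, rng):
    """One simulated test run: n objects, each misclassified with probability p_error."""
    f = sum(1 for _ in range(n) if rng.random() < p_error)
    return f / n


# Hypothetical setting: true error probability 0.1, test sets of 100 objects.
p_error = 0.1
n = 100
sigma = std_error(p_error, n)  # sqrt(0.1 * 0.9 / 100), i.e. about 0.03

# Repeat the experiment many times with a fixed seed for reproducibility.
rng = random.Random(2023)
runs = 2000
rates = [simulate_error_rate(p_error, n, rng) for _ in range(runs)]

# Fraction of simulated error rates that fall within two standard deviations.
within_2 = sum(1 for r in rates if abs(r - p_error) <= 2 * sigma) / runs

print(f"sigma = {sigma:.4f}")
print(f"smallest and largest observed error rate: {min(rates)}, {max(rates)}")
print(f"fraction within 2 sigma: {within_2:.3f}")
```

Because the number of errors is an integer, the simulated fraction only approximates the $95\%$ figure, which comes from the normal approximation.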