Bootstrap ensembles (bagging) repeatedly resample the training set, train a separate model on each resample, and average their predictions. This can reduce variance.
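The resample-train-average loop can be sketched as follows. The linear model fitted with `np.polyfit` and the toy data $y = 2x + \text{noise}$ are illustrative choices, not part of the original text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set (illustrative): y = 2x + Gaussian noise.
n = 200
x = rng.uniform(-1.0, 1.0, n)
y = 2.0 * x + rng.normal(0.0, 0.5, n)

# Bagging: resample with replacement, fit one model per resample,
# average the predictions at a query point.
B = 50
preds = []
for _ in range(B):
    idx = rng.integers(0, n, n)            # bootstrap resample of the indices
    coef = np.polyfit(x[idx], y[idx], 1)   # one (linear) model per resample
    preds.append(np.polyval(coef, 0.5))    # each model's prediction at x = 0.5
ensemble_pred = float(np.mean(preds))      # the ensemble averages them
```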

Bootstrap for Estimating Epistemic Uncertainty

Predictive uncertainty can be written as a sum of aleatoric and epistemic uncertainty:

$$\underbrace{\mathbb{E}\big[(y - \hat f_S(x))^2 \mid x\big]}_{\text{Predictive}(x)} \;=\; \underbrace{\mathbb{E}\big[(y - f_{\theta^*}(x))^2 \mid x\big]}_{\text{Aleatoric}(x)} \;+\; \underbrace{\big(f_{\theta^*}(x) - \hat f_S(x)\big)^2}_{\text{Epistemic}(x)},$$

where $\theta^*$ are the true parameters of the function $f_{\theta^*}$ we try to learn. It is hard to disentangle these two uncertainties of different nature:

  • We directly observe the predictive uncertainty.
  • But $f_{\theta^*}$ is unknown, so the aleatoric and epistemic parts cannot be read off separately.
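The decomposition can be verified numerically at a single test point. Below, the true value $f_{\theta^*}(x)$, a fixed trained prediction $\hat f_S(x)$, and the noise level are illustrative numbers, with $y = f_{\theta^*}(x) + \varepsilon$, $\varepsilon \sim \mathcal{N}(0, \sigma^2)$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed test point: true value f*(x), a fixed trained prediction fhat(x),
# and additive noise with std sigma -- all illustrative numbers.
f_star, f_hat, sigma = 2.0, 2.5, 1.0

y = f_star + rng.normal(0.0, sigma, 1_000_000)  # Monte Carlo draws of y | x

predictive = np.mean((y - f_hat) ** 2)  # E[(y - fhat)^2 | x]
aleatoric = sigma ** 2                  # E[(y - f*)^2 | x]
epistemic = (f_star - f_hat) ** 2       # (f*(x) - fhat(x))^2
```

With enough draws, `predictive` matches `aleatoric + epistemic` up to Monte Carlo error.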

Epistemic Uncertainty

  • The epistemic uncertainty is:

$$\text{Epistemic}(x) = \big(\hat f_S(x) - f_{\theta^*}(x)\big)^2$$

Note: Epistemic(x) is a function of the random sample $S$. Thus Epistemic(x) is itself a random variable.

  • Goal : Estimate the distribution of $\hat f_S(x) - f_{\theta^*}(x)$.
  • Assumption : Our model is unbiased, i.e. $\mathbb{E}_S\big[\hat f_S(x)\big] = f_{\theta^*}(x)$ (zero bias).
  • But $S \sim P$, and $P$ is an unknown distribution.

What if we knew $P$?

  • We could take a (very large) number $B$ of samples $S_1, \dots, S_B$, each of size $n$ and i.i.d. from $P$, and train the model on each.
  • Then $\hat f_{S_1}(x), \dots, \hat f_{S_B}(x)$ are i.i.d. samples from the distribution of $\hat f_S(x)$.
  • By our unbiasedness assumption, $\hat f_{S_b}(x) - \bar f(x)$, where $\bar f(x) = \frac{1}{B}\sum_{b=1}^{B} \hat f_{S_b}(x)$, are approximately i.i.d. samples from the distribution of $\hat f_S(x) - f_{\theta^*}(x)$.
  • Problem : We don’t have access to an unlimited number of samples from $P$; we only have a single sample, $S$.
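In this idealised setting, the distribution of $\hat f_S(x)$ can be simulated directly. The data distribution ($y = 2x + \text{noise}$) and the linear model below are illustrative stand-ins for $P$ and the training algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)

# Idealised setting: we *know* P, so we can draw many independent training
# sets, train a model on each, and look at the spread of fhat_S(x).
def sample_dataset(n):
    x = rng.uniform(-1.0, 1.0, n)
    y = 2.0 * x + rng.normal(0.0, 0.5, n)   # illustrative choice of P
    return x, y

B, n, x0 = 500, 50, 0.5
preds = np.array([np.polyval(np.polyfit(*sample_dataset(n), 1), x0)
                  for _ in range(B)])       # i.i.d. samples of fhat_S(x0)

f_bar = preds.mean()                        # ~ f*(x0) = 1.0 under unbiasedness
epistemic = np.mean((preds - f_bar) ** 2)   # spread of fhat_S(x0)
```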

Bootstrap

  • Idea : Given samples $S = \{z_1, \dots, z_n\}$, we can approximate the unknown probability distribution $P$ by the empirical distribution (here, $\delta_{z_i}$ denotes a Dirac delta centred on $z_i$):

$$\hat P = \frac{1}{n} \sum_{i=1}^{n} \delta_{z_i}$$

  • This can be made to work for continuous distributions using the probability density function:

$$\hat p(z) = \frac{1}{n} \sum_{i=1}^{n} \delta(z - z_i)$$

  • Now use $\hat P$ in place of the unknown $P$.
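Drawing from $\hat P$ is exactly resampling the observed points with replacement, which the following sketch (with a toy one-dimensional sample) illustrates:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sampling from the empirical distribution P-hat (mass 1/n on each observed
# point) is exactly resampling S with replacement.
S = np.array([1.0, 2.0, 5.0, 7.0])   # toy sample (illustrative)

draws = rng.choice(S, size=100_000, replace=True)   # draws from P-hat

# Every draw is one of the observed points, each with frequency ~ 1/n.
freq = np.mean(draws == 5.0)
```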

Practical Algorithm for Estimating Epistemic Uncertainty

Algorithm Bootstrap (input: sample $S = \{z_1, \dots, z_n\}$)

  • Define the empirical distribution: $\hat P = \frac{1}{n} \sum_{i=1}^{n} \delta_{z_i}$.
  • Generate $B$ bootstrap datasets $S^*_1, \dots, S^*_B$, each of size $n$ drawn i.i.d. from $\hat P$ (i.e. sampled from $S$ with replacement).
  • Train a model on each dataset: $\hat f_b = \mathcal{A}(S^*_b)$ for $b = 1, \dots, B$.
  • Return the epistemic uncertainty estimate:

$$\widehat{\text{Epistemic}}(x) = \frac{1}{B} \sum_{b=1}^{B} \big(\hat f_b(x) - \bar f(x)\big)^2,$$

where $\bar f(x) = \frac{1}{B} \sum_{b=1}^{B} \hat f_b(x)$.
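A minimal sketch of the algorithm, assuming a simple polynomial fit in place of a generic learner $\mathcal{A}$; the function name `bootstrap_epistemic`, the toy data, and all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

def bootstrap_epistemic(x_train, y_train, x_test, B=200, deg=1):
    """Bootstrap estimate of epistemic uncertainty at x_test.

    The training routine (polynomial fit of degree `deg`) is an
    illustrative stand-in for A(S); the algorithm works with any learner.
    """
    n = len(x_train)
    preds = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, n)                         # S*_b from P-hat
        coef = np.polyfit(x_train[idx], y_train[idx], deg)  # fhat_b = A(S*_b)
        preds[b] = np.polyval(coef, x_test)
    f_bar = preds.mean()
    return np.mean((preds - f_bar) ** 2)   # (1/B) sum_b (fhat_b - fbar)^2

# Toy usage: y = 2x + noise.
n = 100
x = rng.uniform(-1.0, 1.0, n)
y = 2.0 * x + rng.normal(0.0, 0.5, n)
unc_small = bootstrap_epistemic(x, y, x_test=0.5)
unc_big = bootstrap_epistemic(x[:10], y[:10], x_test=0.5)  # much less data
```

Refitting on less data should yield a larger estimate, reflecting that epistemic uncertainty shrinks as the training sample grows.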