Bootstrap ensembles repeatedly resample the training set, train a separate model on each resample, and average their predictions. This averaging can reduce variance.
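The resample-train-average loop can be sketched as follows. This is a minimal illustration, not the text's implementation; the toy base learner `fit_mean` (which just predicts the mean target of its training sample) is an assumption introduced here for demonstration.

```python
import random
import statistics

def fit_mean(sample):
    """Toy base learner: predicts the mean target of its training sample everywhere."""
    m = statistics.mean(y for _, y in sample)
    return lambda x: m

def bagged_predict(train, x, fit, B=25, seed=0):
    """Train one model per bootstrap resample of `train` and average the B predictions."""
    rng = random.Random(seed)
    n = len(train)
    preds = []
    for _ in range(B):
        # Resample n points with replacement from the training set
        boot = [train[rng.randrange(n)] for _ in range(n)]
        preds.append(fit(boot)(x))
    # Averaging over the ensemble reduces the variance of the prediction
    return sum(preds) / B

train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
pred = bagged_predict(train, x=1.0, fit=fit_mean)
```

Any base learner with the same `fit(sample) -> predictor` shape could be substituted for `fit_mean`.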
Bootstrap for Estimating Epistemic Uncertainty
Predictive uncertainty can be written as a sum of aleatoric and epistemic uncertainty:

$$\mathrm{Predictive}(x) = \mathrm{Aleatoric}(x) + \mathrm{Epistemic}(x),$$

where $f^*$ is the true function we try to learn. It is hard to disentangle these two uncertainties of different nature:

- We directly observe the predictive uncertainty.
- But $f^*$ is unknown, so the split into the two terms is not observable.
Epistemic Uncertainty
- The epistemic uncertainty is:

$$\mathrm{Epistemic}(x) = \big(\hat f_S(x) - f^*(x)\big)^2,$$

where $\hat f_S$ is the model trained on the sample $S$.
- Note: $\mathrm{Epistemic}(x)$ is a function of the random sample $S$; thus $\mathrm{Epistemic}(x)$ is itself a random variable.
- Goal: Estimate the distribution of $\mathrm{Epistemic}(x)$.
- Assumption: Our model is unbiased, i.e. $\mathbb{E}_S\big[\hat f_S(x)\big] = f^*(x)$ (we need to assume its bias is zero).
- But $S$ is drawn from $P$, and $P$ is an unknown distribution.
What if we knew $P$?
- We could take a very large number $B$ of samples $S_1, \dots, S_B$, each of size $n$ and drawn i.i.d. from $P$, and train the model on each.
- Then $\hat f_{S_1}, \dots, \hat f_{S_B}$ are i.i.d. samples from the distribution of $\hat f_S$.
- By our unbiasedness assumption, $\bar f(x) = \frac{1}{B}\sum_{b=1}^{B} \hat f_{S_b}(x) \approx f^*(x)$, so
- $\big(\hat f_{S_b}(x) - \bar f(x)\big)^2$, for $b = 1, \dots, B$, are approximately i.i.d. samples from the distribution of $\mathrm{Epistemic}(x)$.
- Problem: We don't have access to an unlimited number of samples from $P$; we only have a single sample, $S$.
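The known-$P$ thought experiment above can be simulated. In the sketch below, $P$ is chosen as a toy distribution (inputs uniform on $[0,1]$, targets $y = 2x$ plus Gaussian noise) and the model is a least-squares line through the origin; all of these choices are illustrative assumptions, not part of the text. Because $P$ and $f^*$ are known here, we can draw many datasets and observe samples of $\mathrm{Epistemic}(x)$ directly.

```python
import random
import statistics

rng = random.Random(0)

def true_f(x):
    # f*: the (here known) true function
    return 2.0 * x

def draw_dataset(n=50):
    # One dataset S of size n drawn i.i.d. from the (here known) distribution P
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys

def fit_slope(xs, ys):
    # Least-squares fit of a line through the origin: y ≈ slope * x
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs)
    return num / den

x0 = 0.8
# Each dataset S_b yields one sample of Epistemic(x0) = (f_S(x0) - f*(x0))^2
epistemic_samples = []
for _ in range(200):
    xs, ys = draw_dataset()
    slope = fit_slope(xs, ys)
    epistemic_samples.append((slope * x0 - true_f(x0)) ** 2)

mean_epistemic = statistics.mean(epistemic_samples)
```

With $P$ known, `epistemic_samples` approximates the whole distribution of $\mathrm{Epistemic}(x_0)$; the bootstrap's job is to recover something like this when only one dataset is available.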
Bootstrap
- Idea: Given samples $z_1, \dots, z_n$ from the unknown distribution $P$, we can approximate $P$ by the empirical distribution (here, $\delta_{z_i}$ denotes a Dirac delta centred on $z_i$):

$$\hat P = \frac{1}{n} \sum_{i=1}^{n} \delta_{z_i}.$$

- This can be made to work for continuous distributions using the probability density function:

$$\hat p(z) = \frac{1}{n} \sum_{i=1}^{n} \delta(z - z_i).$$

- Now use $\hat P$ in place of the unknown $P$.
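Drawing i.i.d. from the empirical distribution $\hat P$ is exactly sampling with replacement from the observed data, which the following minimal sketch makes concrete (the data values are arbitrary placeholders):

```python
import random

def sample_from_empirical(data, rng):
    """One draw from P-hat = (1/n) * sum of Dirac deltas at the observed points:
    each data point has probability 1/n."""
    return data[rng.randrange(len(data))]

rng = random.Random(0)
data = [1.7, 2.4, 3.1, 4.8]

# A bootstrap sample: n i.i.d. draws from P-hat, i.e. sampling with replacement
bootstrap_sample = [sample_from_empirical(data, rng) for _ in range(len(data))]

# Every element of the bootstrap sample is one of the original observations
assert all(z in data for z in bootstrap_sample)
```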
Practical Algorithm for Estimating Epistemic Uncertainty
Algorithm: Bootstrap (input: sample $S = \{z_1, \dots, z_n\}$)

- Define the empirical distribution: $\hat P = \frac{1}{n} \sum_{i=1}^{n} \delta_{z_i}$.
- Generate bootstrap datasets $S_1^*, \dots, S_B^*$, each of size $n$ drawn i.i.d. from $\hat P$ (i.e. sampled with replacement from $S$).
- Train a model on each dataset: $\hat f_{S_b^*}$ for $b = 1, \dots, B$.
- Return the epistemic uncertainty estimate:

$$\widehat{\mathrm{Epistemic}}(x) = \frac{1}{B} \sum_{b=1}^{B} \big(\hat f_{S_b^*}(x) - \bar f(x)\big)^2,$$

where $\bar f(x) = \frac{1}{B} \sum_{b=1}^{B} \hat f_{S_b^*}(x)$.
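The steps above can be sketched end to end. This is a minimal illustration under stated assumptions: the base learner `fit_mean` is a toy stand-in (it predicts the mean target of its training sample), and the dataset is a placeholder.

```python
import random
import statistics

def fit_mean(sample):
    """Toy model: predicts the sample mean of y everywhere (stand-in for a real learner)."""
    m = statistics.mean(y for _, y in sample)
    return lambda x: m

def bootstrap_epistemic(train, x, fit, B=200, seed=0):
    """Bootstrap estimate of Epistemic(x) following the algorithm above."""
    rng = random.Random(seed)
    n = len(train)
    preds = []
    for _ in range(B):
        # Draw S_b* of size n i.i.d. from P-hat: sample with replacement from S
        boot = [train[rng.randrange(n)] for _ in range(n)]
        # Train a model on S_b* and record its prediction at x
        preds.append(fit(boot)(x))
    # f-bar(x): the ensemble mean, standing in for the unknown f*(x)
    f_bar = sum(preds) / B
    # Mean squared deviation of the B predictions around the ensemble mean
    return sum((p - f_bar) ** 2 for p in preds) / B

train = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.5), (3.0, 4.0)]
est = bootstrap_epistemic(train, x=1.5, fit=fit_mean)
```

Swapping `fit_mean` for a real learner (with the same `fit(sample) -> predictor` shape) gives the practical algorithm; the spread of the `preds` list is what the estimate summarizes.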