Uncertainty refers to situations where the outcome or state of a system is not fully known. Common sources of uncertainty include:
- Data Uncertainty - Noisy or incomplete data.
- Model Uncertainty - Limited or incorrect model assumptions.
- Environmental Uncertainty - Dynamic, unpredictable environments.
Predictive uncertainty
Suppose we trained a model $f_{\hat{\theta}}$, where $\hat{\theta}$ denotes the parameters of the model obtained on the training set, and $(x, y)$ is a random test sample.
Predictive uncertainty refers to the distribution of the residual: $y - f_{\hat{\theta}}(x)$
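To make the residual concrete, the following is a minimal sketch rather than a definitive recipe. It assumes a hypothetical linear ground truth (`true_fn`), Gaussian noise with standard deviation 0.3, and a least-squares fit; none of these come from the text above. It fits the parameters on a training set, then summarises the empirical distribution of the residuals on fresh test samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Hypothetical ground-truth relationship, chosen only for illustration.
    return 2.0 * x + 1.0

# Training set: noisy observations of the ground truth.
x_train = rng.uniform(-1.0, 1.0, size=200)
y_train = true_fn(x_train) + rng.normal(0.0, 0.3, size=200)

# Fit the parameters (slope, intercept) by ordinary least squares.
X_train = np.column_stack([x_train, np.ones_like(x_train)])
theta_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Fresh test samples drawn from the same process.
x_test = rng.uniform(-1.0, 1.0, size=5000)
y_test = true_fn(x_test) + rng.normal(0.0, 0.3, size=5000)

# Residual = observed target minus model prediction.
residuals = y_test - (theta_hat[0] * x_test + theta_hat[1])

# Summaries of the empirical residual distribution.
residual_mean = float(residuals.mean())
residual_std = float(residuals.std(ddof=1))
q_low, q_high = np.quantile(residuals, [0.025, 0.975])

print(f"mean={residual_mean:.3f} std={residual_std:.3f} "
      f"95% range=[{q_low:.3f}, {q_high:.3f}]")
```

In this setup the spread of the residuals is dominated by the injected noise, because the fitted line is close to the true one.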
Some of the predictive uncertainty can be reduced, but some cannot. Based on this, we distinguish two main types of uncertainty.
Predictive uncertainty = Aleatoric Uncertainty + Epistemic Uncertainty
Aleatoric Uncertainty
Aleatoric uncertainty cannot be reduced, because it arises from intrinsic randomness in the data or process. Collecting more data does not remove it.
Epistemic Uncertainty
Epistemic uncertainty can be reduced, because it arises from a lack of knowledge. It shrinks as we collect more data or improve the model.
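The difference between the two types can be illustrated with a small simulation. This is a sketch under stated assumptions, not a standard definition: the data-generating process (`NOISE_STD`, the slope 3.0), the query point `x0`, and the use of a bootstrap ensemble as a proxy for epistemic uncertainty are all choices made here for illustration. The spread of predictions across refitted models stands in for epistemic uncertainty, and the residual noise level stands in for aleatoric uncertainty.

```python
import numpy as np

NOISE_STD = 0.5  # assumed intrinsic noise level of the process (aleatoric)

def fit_line(x, y):
    # Least-squares fit of y = slope * x + intercept.
    X = np.column_stack([x, np.ones_like(x)])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def uncertainty_estimates(n_train, n_models=200, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n_train)
    y = 3.0 * x + rng.normal(0.0, NOISE_STD, size=n_train)

    # Epistemic proxy: how much the prediction at x0 varies when the
    # model is refitted on bootstrap resamples of the training set.
    x0 = 0.5
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n_train, size=n_train)
        slope, intercept = fit_line(x[idx], y[idx])
        preds.append(slope * x0 + intercept)
    epistemic_std = float(np.std(preds, ddof=1))

    # Aleatoric proxy: residual noise left after fitting on all the data
    # (ddof=2 accounts for the two fitted parameters).
    slope, intercept = fit_line(x, y)
    aleatoric_std = float(np.std(y - (slope * x + intercept), ddof=2))
    return aleatoric_std, epistemic_std

small = uncertainty_estimates(n_train=50)
large = uncertainty_estimates(n_train=5000)
print("n=50:   aleatoric=%.3f epistemic=%.3f" % small)
print("n=5000: aleatoric=%.3f epistemic=%.3f" % large)
```

With more training data the epistemic estimate shrinks, while the aleatoric estimate stays near the assumed noise level.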
Bayesian vs Frequentist Frameworks
There are two primary approaches to inference: Bayesian and Frequentist. Each framework rests on a different philosophical interpretation of probability and modelling, which leads to different techniques and interpretations of results.
Bayesian Framework
Uncertainty is represented as a probability distribution. We start from a prior belief, which incorporates prior knowledge, and update this distribution as new data arrives.
Parameters are treated as random variables, and the observed data is treated as fixed (non-random).
Probability is interpreted as a degree of belief. We can quantify and update epistemic uncertainty.
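As an illustration of updating a belief with data, here is a minimal sketch using the conjugate Beta-Bernoulli model for an unknown success probability. The uniform Beta(1, 1) prior and the observation counts are assumptions chosen for the example; the helper names are hypothetical.

```python
def beta_update(alpha, beta, successes, failures):
    # Conjugate update: a Beta(alpha, beta) prior combined with Bernoulli
    # observations gives a Beta(alpha + successes, beta + failures) posterior.
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    return alpha / (alpha + beta)

def beta_variance(alpha, beta):
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))

prior = (1.0, 1.0)  # assumed uniform prior belief over the success probability

# Same observed success rate (70%), but very different amounts of data.
post_small = beta_update(*prior, successes=7, failures=3)
post_large = beta_update(*prior, successes=700, failures=300)

for name, params in [("prior", prior),
                     ("10 observations", post_small),
                     ("1000 observations", post_large)]:
    print(f"{name}: mean={beta_mean(*params):.4f} "
          f"variance={beta_variance(*params):.6f}")
```

The posterior variance shrinks as observations accumulate, which is how epistemic uncertainty about the parameter is quantified and reduced in this framework.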
Frequentist Framework
Uncertainty is based on the long-run frequencies of outcomes.
Data is treated as a random draw from some (unknown) distribution, and the model parameters are non-random. Intuitively, this means there is some fixed, optimal set of model parameters; we just don't know them.
We rely on observed data to estimate (fit) fixed model parameters.
Probability is interpreted as the long-run frequency of events. Uncertainty is captured by confidence intervals or p-values, not by degrees of belief.
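The long-run reading of a confidence level can be checked by simulation. The sketch below assumes a normal population with a fixed mean (`TRUE_MEAN`) and uses the normal-approximation interval with z = 1.96; both are illustrative choices, not something prescribed above. The parameter never changes; only the sampled data, and therefore the interval, varies from repetition to repetition.

```python
import numpy as np

TRUE_MEAN = 10.0  # fixed parameter, unknown to the analyst in practice
Z_95 = 1.96       # normal-approximation critical value for a 95% interval

def mean_confidence_interval(sample):
    # Interval = sample mean +/- z * standard error of the mean.
    center = sample.mean()
    half_width = Z_95 * sample.std(ddof=1) / np.sqrt(sample.size)
    return center - half_width, center + half_width

def coverage(n_repeats=2000, n=100, seed=0):
    # Fraction of repeated experiments whose interval contains TRUE_MEAN.
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_repeats):
        sample = rng.normal(TRUE_MEAN, 2.0, size=n)
        low, high = mean_confidence_interval(sample)
        hits += int(low <= TRUE_MEAN <= high)
    return hits / n_repeats

cov = coverage()
print(f"empirical coverage: {cov:.3f}")
```

The empirical coverage should land close to 0.95: the 95% describes how often the procedure captures the fixed parameter across repeated samples, not a belief about any single interval.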