Let D denote the data, and θ the parameters of our model.
- Bayesian methods represent uncertainty by treating θ as a random variable with a prior distribution. Parameters and data are often continuous-valued.
- Let p(θ) be the prior PDF (probability density function).
- In simple terms: once the data D is observed, we want to update our belief about θ given the new evidence. This updated belief is represented by the posterior distribution, denoted p(θ | D).
- Bayes' rule gives the posterior distribution of θ given the data D: p(θ | D) = p(D | θ) p(θ) / p(D).
- Bayesian estimation does not operate through optimisation to obtain a single best parameter value; it maintains the full posterior distribution over θ.
- p(D) = ∫ p(D | θ) p(θ) dθ is called the marginal likelihood, or evidence.
- Predictive distributions for a new example average the model's predictions over the posterior:
- in supervised learning: p(y* | x*, D) = ∫ p(y* | x*, θ) p(θ | D) dθ
- in unsupervised learning: p(x* | D) = ∫ p(x* | θ) p(θ | D) dθ
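The updating-not-optimising idea above can be sketched with a small, fully worked (and hypothetical, not from the source) conjugate example: a Bernoulli model with a Beta prior, where the posterior and the predictive distribution are available in closed form.

```python
# Minimal sketch, assuming a Bernoulli likelihood with a conjugate Beta prior.
# Prior: theta ~ Beta(a, b); data D: n binary observations with k ones.
# Posterior: theta | D ~ Beta(a + k, b + n - k) -- belief is updated, not optimised.

a, b = 1.0, 1.0                    # Beta(1, 1) prior: uniform over theta
data = [1, 1, 0, 1, 0, 1, 1, 1]    # hypothetical observations D
k, n = sum(data), len(data)

# Posterior parameters from conjugate updating.
a_post, b_post = a + k, b + (n - k)

# Predictive distribution for a new example x*:
# p(x* = 1 | D) = E[theta | D] = a_post / (a_post + b_post) for this model,
# i.e. the integral over the posterior collapses to the posterior mean.
p_next_one = a_post / (a_post + b_post)

print(f"posterior: Beta({a_post:.0f}, {b_post:.0f})")
print(f"p(x* = 1 | D) = {p_next_one:.3f}")
```

Note that no single "best" θ is ever chosen: the prediction `p_next_one` comes from averaging over the whole posterior, which here happens to equal the posterior mean.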