Let D denote the observed data and θ the parameters of our model.

  • Bayesians represent uncertainty by treating θ as a random variable with a prior distribution. Parameters and data are often continuous-valued.
  • Let p(θ) be the prior PDF (probability density function).
  • In simple terms: once the data is observed, we want to update our belief about θ given the new evidence. This updated belief is represented by the posterior distribution, denoted p(θ | D).
  • Bayes' rule returns the posterior distribution of θ given data D: p(θ | D) = p(D | θ) p(θ) / p(D).
  • Bayesian estimation does not operate through optimisation to obtain a single best parameter value; it maintains a full distribution over θ.
  • The denominator p(D) = ∫ p(D | θ) p(θ) dθ is called the marginal likelihood, or evidence.
  • Predictive distributions for a new example average the model over the posterior:
    • in supervised learning: p(y | x, D) = ∫ p(y | x, θ) p(θ | D) dθ
    • in unsupervised learning: p(x | D) = ∫ p(x | θ) p(θ | D) dθ
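
The ideas above can be made concrete with a minimal sketch in Python, using a Bernoulli likelihood with a conjugate Beta prior (an assumed example, not one from the notes): the posterior is again a Beta distribution, so the update, the evidence, and the posterior predictive all have closed forms.

```python
import math

# Sketch of Bayesian estimation for a Bernoulli parameter theta.
# Prior:      p(theta) = Beta(a, b)
# Likelihood: p(D | theta) = theta^k (1 - theta)^(n - k), with k ones in n observations
# Posterior:  p(theta | D) = Beta(a + k, b + n - k) -- a full distribution,
#             not a single optimised parameter value.

def posterior_params(a, b, data):
    """Update a Beta(a, b) prior with a list of 0/1 observations."""
    k = sum(data)
    n = len(data)
    return a + k, b + n - k

def posterior_predictive(a_post, b_post):
    """p(x_new = 1 | D) = integral of theta * p(theta | D) dtheta,
    which for a Beta posterior is its mean a / (a + b)."""
    return a_post / (a_post + b_post)

def log_evidence(a, b, data):
    """Marginal likelihood p(D) = B(a + k, b + n - k) / B(a, b), in log space,
    where B is the Beta function (computed via lgamma for stability)."""
    k, n = sum(data), len(data)
    log_beta = lambda x, y: math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)
    return log_beta(a + k, b + n - k) - log_beta(a, b)

data = [1, 1, 0, 1, 1, 0, 1, 1]                     # 6 ones out of 8 observations
a_post, b_post = posterior_params(1.0, 1.0, data)   # uniform Beta(1, 1) prior
print(a_post, b_post)                               # -> 7.0 3.0
print(posterior_predictive(a_post, b_post))         # -> 0.7
```

Conjugacy is what keeps this example closed-form; for non-conjugate models the evidence integral generally has no analytic solution and is approximated numerically (e.g. by sampling).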