- Confidence bounds are frequentist guarantees; these include both the Test Set Bound and the Train Set Bounds.
- However, we will take and adapt a Bayesian idea:
- Before seeing the training set $S$, we define a “prior” probability distribution $P(c)$ over the classifiers $c \in C$. This represents our “bet” on the candidate classifiers.
- Consider a simple learning problem where we have a countable set $C$ of classifiers for the learning algorithm to choose from based on a training sample $S \sim D^m$.
Theorem (Occam's Razor Bound)
For all “priors” $P(c)$ over a countable set of classifiers $C$, for all distributions $D$, for all $\delta \in (0, 1]$:
$$\Pr_{S \sim D^m}\Big( \forall c \in C:\ c_D \le \overline{\mathrm{Bin}}\big(m, \hat{c}_S, \delta P(c)\big) \Big) \ge 1 - \delta$$
where $c_D$ is the true error rate of $c$, $\hat{c}_S$ its empirical error rate on $S$, and $\overline{\mathrm{Bin}}(m, \hat{c}_S, \delta) = \max\{ p : \mathrm{Bin}(m, \hat{c}_S, p) \ge \delta \}$ is the binomial tail inversion.
Compare with the Test Set Bound: $\Pr_{S \sim D^m}\big( c_D \le \overline{\mathrm{Bin}}(m, \hat{c}_S, \delta) \big) \ge 1 - \delta$. The only change is that the confidence parameter $\delta$ is replaced by $\delta P(c)$, and the guarantee now holds simultaneously for all $c \in C$.

Corollary
For all “priors” $P(c)$, all distributions $D$, all $\delta \in (0, 1]$:
$$\Pr_{S \sim D^m}\left( \forall c \in C:\ c_D \le \hat{c}_S + \sqrt{\frac{\ln\frac{1}{P(c)} + \ln\frac{1}{\delta}}{2m}} \right) \ge 1 - \delta$$
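To make the theorem concrete, here is a minimal sketch (the function names and the example numbers are ours, not from the original) that computes the binomial tail inversion by bisection and compares the resulting Test Set Bound with the Occam's Razor Bound on the same data:

```python
import math


def binom_tail(m, k, p):
    """P[Bin(m, p) <= k]: probability of at most k errors in m trials
    when the true error rate is p."""
    return sum(math.comb(m, i) * p**i * (1 - p)**(m - i) for i in range(k + 1))


def binom_tail_inversion(m, k, delta, tol=1e-10):
    """max{p : P[Bin(m, p) <= k] >= delta}, found by bisection
    (binom_tail is decreasing in p for k < m)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_tail(m, k, mid) >= delta:
            lo = mid
        else:
            hi = mid
    return lo


m, k, delta = 100, 10, 0.05  # hypothetical: 10 errors observed on 100 examples
test_set_bound = binom_tail_inversion(m, k, delta)
# Occam's Razor Bound: the same inversion, but at confidence delta * P(c).
occam_bound = binom_tail_inversion(m, k, delta * 0.01)  # suppose P(c) = 0.01
```

Paying for the prior makes the Occam bound looser than the Test Set Bound on the same data, but in exchange the guarantee holds simultaneously for every classifier in $C$.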
Tightness depends on the self-information $\ln\frac{1}{P(c)}$ of the classifier $c$ returned by the learning algorithm, with respect to our prior bets: classifiers we bet more on receive tighter bounds.
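A quick numerical illustration of this dependence, using the corollary's square-root deviation term (the function name and numbers are hypothetical):

```python
import math


def occam_penalty(m, prior_c, delta):
    """Deviation term of the corollary: sqrt((ln(1/P(c)) + ln(1/delta)) / (2m))."""
    return math.sqrt((math.log(1 / prior_c) + math.log(1 / delta)) / (2 * m))


m, delta = 1000, 0.05
favored = occam_penalty(m, 0.5, delta)        # a classifier we bet heavily on
long_shot = occam_penalty(m, 2.0**-20, delta)  # self-information of 20 bits
```

With $m = 1000$ examples, the favored classifier's penalty stays under $0.05$, while the 20-bit long shot pays roughly twice that: the price of a bad bet grows with the self-information, but only under a square root.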
Proof (Occam's Razor Bound)
Idea: without training (i.e., before seeing $S$), apply the Test Set Bound to each $c \in C$ in turn with confidence parameter $\delta P(c)$:
$$\forall c \in C:\ \Pr_{S \sim D^m}\big( c_D \le \overline{\mathrm{Bin}}(m, \hat{c}_S, \delta P(c)) \big) \ge 1 - \delta P(c)$$
Negate to get the equivalent statement:
$$\forall c \in C:\ \Pr_{S \sim D^m}\big( c_D > \overline{\mathrm{Bin}}(m, \hat{c}_S, \delta P(c)) \big) \le \delta P(c)$$
Apply the union bound repeatedly:
$$\Pr_{S \sim D^m}\big( \exists c \in C:\ c_D > \overline{\mathrm{Bin}}(m, \hat{c}_S, \delta P(c)) \big) \le \sum_{c \in C} \delta P(c) \le \delta$$
Negating once more yields the theorem. $\square$
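The union-bound step can be checked by a small simulation. This sketch uses hypothetical true error rates and a hypothetical prior, and for simplicity samples each classifier's error count independently (the bound itself needs no independence assumption): the fraction of samples on which *any* classifier violates its bound should stay below $\delta$.

```python
import math
import random


def binom_tail(m, k, p):
    """P[Bin(m, p) <= k]."""
    return sum(math.comb(m, i) * p**i * (1 - p)**(m - i) for i in range(k + 1))


def binom_tail_inversion(m, k, delta, tol=1e-9):
    """max{p : P[Bin(m, p) <= k] >= delta}, by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_tail(m, k, mid) >= delta:
            lo = mid
        else:
            hi = mid
    return lo


random.seed(0)
m, delta, trials = 50, 0.1, 200
true_errors = [0.1, 0.2, 0.3, 0.4]  # hypothetical c_D for four classifiers
prior = [0.4, 0.3, 0.2, 0.1]        # our "bet" P(c); sums to 1

failures = 0
for _ in range(trials):
    # One draw of S ~ D^m; each classifier makes k ~ Bin(m, c_D) errors.
    simultaneous = all(
        c_D <= binom_tail_inversion(
            m, sum(random.random() < c_D for _ in range(m)), delta * P_c
        )
        for c_D, P_c in zip(true_errors, prior)
    )
    failures += not simultaneous
failure_rate = failures / trials  # should be at most delta, up to MC noise
```

Each classifier's individual failure budget is $\delta P(c)$; summing the budgets over the prior spends exactly $\delta$, which is what makes the simultaneous guarantee free of any dependence on $|C|$.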