So far we have a theory for finding the most likely model, but not for calculating how likely that model is. Bayes' theorem tells us how to calculate the probability of a model.
Bayes' theorem is about conditional probabilities, so I will explain these first. Probability is about sets of outcomes. We start by assuming that these outcomes are equally likely. Suppose we have a bag full of balls. Each ball is either Red or Blue, and each ball is also either Small or Big. Taking a ball from the bag is an outcome. The number of balls of each kind is,
         Red   Blue   Total
Small     20     40      60
Big       10     30      40
Total     30     70     100
The conditional probability of a ball taken from the bag being Red, if we already know it is Big, is 10/40. This is written,
$$P(\text{Red} \mid \text{Big}) = \frac{10}{40}$$
This is a conditional probability. $P(\text{Red} \mid \text{Big})$ means,
First I found that the ball was Big. What then is the probability of it being Red?
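The counting behind this conditional probability can be sketched in Python. The dictionary `counts` and the helper `conditional` are illustrative names that do not come from the text; the numbers are the ball counts given for the bag (20 Small Red, 40 Small Blue, 10 Big Red, 30 Big Blue).

```python
from fractions import Fraction

# Ball counts for the bag, keyed by (size, colour).
counts = {
    ("Small", "Red"): 20, ("Small", "Blue"): 40,
    ("Big", "Red"): 10, ("Big", "Blue"): 30,
}

def conditional(colour, size):
    """P(colour | size): keep only balls of the given size, then count."""
    # Domain: every ball of the given size.
    domain = sum(n for (s, _), n in counts.items() if s == size)
    # Success: balls of that size that also have the given colour.
    success = counts[(size, colour)]
    return Fraction(success, domain)

p_red_given_big = conditional("Red", "Big")
print(p_red_given_big)  # 1/4, which is 10/40 in lowest terms
```

Knowing the ball is Big shrinks the set we count over from 100 balls to 40, and only then do we ask how many of those are Red.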
The probability of a ball being Red is,
$$P(\text{Red}) = \frac{30}{100}$$
Note that $P(\text{Red})$ has no meaning by itself. Instead, a probability always involves two sets,
The set of events that register success.
The domain from which those events are taken.
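Note that when no condition is written, the domain is taken to be the whole bag, so $P(\text{Red})$ is shorthand for $P(\text{Red} \mid \text{Bag})$, where Bag stands for all 100 balls.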
The probability of a ball being Big is,
$$P(\text{Big}) = \frac{40}{100}$$
Now the probability of a ball being Red and Big is,
$$P(\text{Red} \wedge \text{Big}) = \frac{10}{100}$$
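As a rough check on these numbers, here is a sketch that treats every probability as a success set counted within a domain set. The names `balls`, `probability`, `is_red` and `is_big` are hypothetical, and the bag is rebuilt from the counts given earlier (20 Small Red, 40 Small Blue, 10 Big Red, 30 Big Blue).

```python
from fractions import Fraction

# Rebuild the bag as a list of (size, colour) balls.
balls = (
    [("Small", "Red")] * 20 + [("Small", "Blue")] * 40
    + [("Big", "Red")] * 10 + [("Big", "Blue")] * 30
)

def probability(success, domain=lambda ball: True):
    """Fraction of the balls in the domain that also register success."""
    in_domain = [b for b in balls if domain(b)]
    hits = [b for b in in_domain if success(b)]
    return Fraction(len(hits), len(in_domain))

def is_red(ball):
    return ball[1] == "Red"

def is_big(ball):
    return ball[0] == "Big"

# With no domain given, the domain is the whole bag of 100 balls.
p_red = probability(is_red)                                      # 30/100
p_big = probability(is_big)                                      # 40/100
p_red_and_big = probability(lambda b: is_red(b) and is_big(b))   # 10/100
# Restricting the domain to Big balls gives the conditional probability.
p_red_given_big = probability(is_red, domain=is_big)             # 10/40

print(p_red, p_big, p_red_and_big, p_red_given_big)
```

With these counts, `p_red_given_big * p_big` comes out equal to `p_red_and_big`, since 10/40 × 40/100 = 10/100.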