In finding P(A|B), we use P(B|A) and vice versa. How do we find either without knowing one already?



Bayes’ theorem provides a way of computing P(A|B) but uses P(B|A). How would we go about finding either without one being given initially?


To think about this question more clearly, we need to think about the intersection: P(A ∩ B). Recall that P(A ∩ B) is the probability that both A and B are true. We learned about P(A ∩ B) when A and B were independent but it also makes sense to think about the intersection in other cases.
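To make the independent case concrete, here is a small sketch in Python. The two dice events are hypothetical examples (not from the text above); the point is that for independent events, P(A ∩ B) equals P(A) · P(B), which we can verify by exact enumeration.

```python
from fractions import Fraction

# Enumerate every outcome of rolling two fair dice (a hypothetical example).
outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]

def prob(event):
    """Exact probability of an event, given as a predicate over outcomes."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

A = lambda o: o[0] % 2 == 0   # event A: the first die is even
B = lambda o: o[1] > 4        # event B: the second die shows 5 or 6

p_A = prob(A)                          # 1/2
p_B = prob(B)                          # 1/3
p_AB = prob(lambda o: A(o) and B(o))   # 6 of 36 outcomes -> 1/6

# Since the two dice don't influence each other, the product rule holds:
print(p_AB == p_A * p_B)  # -> True
```

Because the events live on different dice, knowing B tells us nothing about A, which is exactly why the simple product rule applies here and not in general.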

First, let’s rephrase the conditional probability in plain language. P(A|B) is asking:

What is the probability that A will happen if we already know that B happened?

Written like this, it's clear that we're asking a question about both A and B happening. So P(A|B) is related to P(A ∩ B) in some way, but how exactly? They're certainly not equal. This is where our knowledge that B already happened comes into play. Since B happened, we are no longer working over the whole sample space but only over the part where B occurs, so the probability P(A ∩ B) must be rescaled by P(B). Tying this all together, we can rewrite P(A|B) as follows:

P(A|B) = P(A ∩ B) / P(B)

This presents us with another way of computing conditional probabilities: we can compute the probability of the intersection.
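As a sketch of that computation in Python (the single-die events here are hypothetical examples chosen so that A and B are *not* independent), we can compute P(A|B) directly from the intersection:

```python
from fractions import Fraction

outcomes = list(range(1, 7))  # one fair six-sided die

def prob(event):
    """Exact probability of an event, given as a predicate over outcomes."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

A = lambda o: o > 4        # event A: the roll is 5 or 6
B = lambda o: o % 2 == 0   # event B: the roll is even

p_B = prob(B)                          # 1/2
p_AB = prob(lambda o: A(o) and B(o))   # only {6} qualifies -> 1/6

# P(A|B) = P(A ∩ B) / P(B)
p_A_given_B = p_AB / p_B
print(p_A_given_B)  # -> 1/3
```

Note that P(A) alone is 1/3 of the six outcomes... here it happens that P(A|B) = 1/3 too only because of how these particular events overlap; with different events the conditional and unconditional probabilities would differ, and the formula above is what captures that.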
