Thursday, June 5, 2008

Absence of evidence in the Bayesian

Here's a little something that maybe you didn't know about induction. Let's say I have evidence B. I can use this evidence B to argue inductively for claim A. Evidence B doesn't prove A, but it does make A more likely. So what happens if I instead have evidence not-B? That is, I've looked, and found that evidence B is absent. Does that make not-A more likely?

In other words, does absence of evidence amount to evidence of absence? Yes. And I can prove it mathematically. [I mention math and thus lose half my readers... Skip the proof section if you must.]

The Proof

Good to read first: Induction and the Bayesian

Bayes' theorem states the following:
P(A|B) = P(B|A)*P(A)/P(B)
P(A|B) is equal to the probability that claim A is true if we find evidence B. P(A) is the "prior" probability that A is true. P(B|A) is the probability of finding evidence B if we know A is true. P(B) is the "prior" probability of finding evidence B. If B is evidence for A, then P(A|B) > P(A). If not-B (written as ~B) is evidence for not-A, then P(~A|~B) > P(~A). Thus we seek to prove the following:

If P(A|B) > P(A), then P(~A|~B) > P(~A)

We will use, in addition to Bayes' theorem, the following identities:

~(~X) = X
P(~X) = 1 - P(X)
P(~X|Y) = 1 - P(X|Y)

This theorem assumes two things. First, none of the prior probabilities can be zero or one. For example, if P(B) = 0, Bayes' theorem doesn't even make sense, since it divides by zero; likewise, if P(B) = 1, then P(~B) = 0 and we cannot condition on ~B. Second, it assumes that probability is a good way to model knowledge.*

Proof:
  1. Start with Bayes' theorem: P(A|B) = P(B|A)*P(A)/P(B)
  2. We're given: P(A|B) > P(A)
  3. Combining 1 and 2: P(B|A)*P(A)/P(B) > P(A)
  4. Multiply by P(B): P(B|A)*P(A) > P(B)*P(A)
  5. Use the identities: [1-P(~B|A)]*P(A) > [1-P(~B)]*P(A)
  6. Some algebra: P(~B)*P(A) > P(~B|A)*P(A)
  7. Use Bayes' theorem: P(~B)*P(A) > P(A|~B)*P(~B)
  8. Use the identities: P(~B)*[1-P(~A)] > [1-P(~A|~B)]*P(~B)
  9. Some algebra: P(~B)*P(~A|~B) > P(~A)*P(~B)
  10. Divide out by P(~B): P(~A|~B) > P(~A)
QED
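
For readers who would rather check the result than follow the algebra, here is a minimal sketch in Python that tests the theorem on randomly generated probability models. The function names, the 1%-to-99% grid of probabilities, and the number of trials are all my own illustrative choices. It uses exact fractions so that rounding error cannot produce a spurious counterexample.

```python
import random
from fractions import Fraction


def check_theorem(p_a, p_b_given_a, p_b_given_not_a):
    """Return (B supports A, ~B supports ~A) for one probability model."""
    # Total probability of finding evidence B.
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    # Bayes' theorem: P(A|B) = P(B|A)*P(A)/P(B)
    p_a_given_b = p_b_given_a * p_a / p_b
    # Bayes' theorem again, this time for ~A given ~B.
    p_not_a_given_not_b = (1 - p_b_given_not_a) * (1 - p_a) / (1 - p_b)
    return p_a_given_b > p_a, p_not_a_given_not_b > 1 - p_a


def count_counterexamples(trials=10000, seed=0):
    """Count models where B supports A but ~B fails to support ~A."""
    rng = random.Random(seed)
    counterexamples = 0
    for _ in range(trials):
        # Probabilities from 1/100 to 99/100, so none is zero or one.
        p_a = Fraction(rng.randint(1, 99), 100)
        p_b_given_a = Fraction(rng.randint(1, 99), 100)
        p_b_given_not_a = Fraction(rng.randint(1, 99), 100)
        b_supports_a, not_b_supports_not_a = check_theorem(
            p_a, p_b_given_a, p_b_given_not_a
        )
        if b_supports_a and not not_b_supports_not_a:
            counterexamples += 1
    return counterexamples


print(count_counterexamples())
```

A count of zero is what the proof predicts. Of course, a random search is only a sanity check, not a substitute for the proof itself.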

*Note: Some people might consider this second assumption questionable. Under certain interpretations, probability is only reliable in analyzing repeatable phenomena, and the universe is not repeatable.

Discussion and Conclusion

What does this mean? It means that if the existence of some evidence supports a claim, then the non-existence of that evidence detracts from the claim. This is a logical necessity in induction. The only assumptions are that we can model our knowledge with probabilities, and that none of the prior probabilities are certain.

This directly contradicts the conventional wisdom that "Absence of evidence is not evidence of absence". So where did this conventional wisdom come from?

There are two justifications I can think of. First, "absence of evidence" might mean that we don't know whether or not there is evidence, because we haven't looked. In that case, the conventional wisdom is true. Second, though I proved that absence of evidence is evidence of absence, I did not prove that it's very good evidence of absence. For example, if I found bigfoot behind a tree, that would provide extremely good evidence for bigfoot, but if I didn't find him behind a tree, that would provide very weak evidence against bigfoot. But it's still evidence, mathematically speaking. I've previously explained this asymmetry as the basis for the concept of "burden of proof".
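
To put rough numbers on the bigfoot asymmetry, here is a small Python sketch. All three probabilities are made up purely for illustration: a 1% prior that bigfoot exists, a 0.1% chance of finding him behind this particular tree if he does, and a one-in-a-million chance of a false sighting if he doesn't.

```python
def posterior(prior, p_obs_given_a, p_obs_given_not_a):
    """P(A | observation), computed with Bayes' theorem."""
    p_obs = p_obs_given_a * prior + p_obs_given_not_a * (1 - prior)
    return p_obs_given_a * prior / p_obs


# Hypothetical numbers, chosen only to illustrate the asymmetry.
PRIOR = 0.01                # P(A): bigfoot exists
P_FIND_IF_EXISTS = 0.001    # P(B|A): he is behind this tree
P_FIND_IF_NOT = 0.000001    # P(B|~A): a hoax or false positive

# Evidence B: we looked behind the tree and found him.
after_finding = posterior(PRIOR, P_FIND_IF_EXISTS, P_FIND_IF_NOT)

# Evidence ~B: we looked behind the tree and found nothing.
after_not_finding = posterior(PRIOR, 1 - P_FIND_IF_EXISTS, 1 - P_FIND_IF_NOT)

print(f"prior:             {PRIOR:.6f}")
print(f"after finding:     {after_finding:.6f}")
print(f"after not finding: {after_not_finding:.6f}")
```

Under these assumed numbers, finding bigfoot moves the probability from 1% to roughly 91%, while not finding him lowers it only very slightly below 1%. Both observations count as evidence, but they are nowhere near equal in strength.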

So, I wasn't lying when I said the Bayesian gives us insight into the inner workings of reason! This is just one of the reasons that math is cool.

8 comments:

Anonymous said...

Yay for math!

I'm familiar with this notion, but I've sometimes heard it described by example. Say we want to confirm the statement that all swans are white (not actually true, but we'll pretend it is for now). Then seeing a white swan would be evidence of this claim. But seeing a black crow would also be evidence for the claim, since you've seen a non-white thing that turned out to be a non-swan.

miller said...

I think it's a little problematic to simply say that black crows are evidence that swans are white. We need to specify exactly what observation we are talking about. Let's imagine that you see a black thing off in the distance, perhaps a swan, perhaps not. The mere possibility that it is a swan can be evidence against our claim. If we go on to observe that it is a crow rather than a swan, we've acquired evidence for our claim.

Of course, all the evidence here is extremely weak. I do not believe in a sharp dichotomy between "weak" and "strong" evidence, but if there were one, this would definitely fall in the weak category!

Linda said...

For example, if I found bigfoot behind a tree, that would provide extremely good evidence for bigfoot, but if I didn't find him behind a tree, that would provide very weak evidence against bigfoot. But it's still evidence, mathematically speaking.

ummm... what happens if some of us saw bigfoot behind the tree and some of us didn't?

miller said...

When I spoke of bigfoot, I was thinking of a very mathematically idealized scenario. In reality, we need to consider the possibility of hoaxes and false positives. If multiple independent witnesses see undeniable and self-consistent evidence of bigfoot, that would suffice to overcome most contrary evidence.

Or did you mean that different people look in the same place, and only some of them see bigfoot? I'm not sure how that could happen, unless we're talking about pareidolia. Pareidolia is when pattern recognition misfires (i.e., seeing Jesus in a crepe or bigfoot on Mars).

Alan said...

This result and the credo that "absence of evidence is not evidence of absence" are not contradictory. I see the latter as a statement of logic. Essentially, if A -> B, it does not follow that ~A -> ~B.

Bayes' theorem deals in probabilistic reasoning, which isn't quite the same, as it involves degrees of certainty, not absolutes.

To illustrate the difference, imagine you are shown 100 upside-down cups and told that there may be a ball under one or more of them. If you turn over 99 cups and still haven't found a ball, you will probably assign a low probability to the existence of a ball, but you cannot say absolutely that one doesn't exist, because there is still one cup remaining.
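
The cup scenario can be worked through numerically with a short Python sketch. It simplifies the setup by assuming there is at most one ball, a 50% prior that the ball exists, and that the ball is equally likely to be under any cup; none of these numbers come from the scenario as stated.

```python
from fractions import Fraction


def p_ball_exists(prior, n_cups, n_empty_seen):
    """P(a ball exists | the first n_empty_seen cups were all empty).

    Assumes at most one ball, equally likely to be under any cup.
    """
    # If the ball exists, it must be under one of the unturned cups.
    p_empty_if_ball = Fraction(n_cups - n_empty_seen, n_cups)
    # If there is no ball, every cup is empty with certainty.
    p_empty_if_no_ball = 1
    p_empty = p_empty_if_ball * prior + p_empty_if_no_ball * (1 - prior)
    return p_empty_if_ball * prior / p_empty


# Hypothetical prior of 1/2, tracked as more cups turn up empty.
for turned in (0, 50, 99, 100):
    print(turned, p_ball_exists(Fraction(1, 2), 100, turned))
```

With these assumptions, each empty cup lowers the probability a little, and after 99 empty cups it has dropped from 1/2 to 1/101. It reaches exactly zero only once the last cup is turned over.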

miller said...

I already agree with you. I usually take "evidence" to mean the probabilistic kind rather than the absolute kind. But if by "evidence" we mean absolute proof, then of course, "absence of evidence is not evidence of absence" is correct.

Anonymous said...

Talking about Bigfoot behind a tree isn't too interesting an example because people have already looked behind virtually all "trees" on the planet. Each "tree" provided a very, very small amount of evidence of absence, but by the time almost all trees have been looked behind ...
To me, more interesting real-life examples would be the questions of whether there is life on other planets, or intelligent life. Not only is there no convincing evidence that alien intelligent life has visited Earth, but we haven't found radio transmissions yet.

miller said...

Lol, so it's not just me who thinks bigfoot is boring! I intended it to be a boring example, so as not to detract from the more general point. SETI is another topic all to itself!

As far as SETI is concerned, I don't think it has provided strong evidence that there is no intelligent life out there. Its search space is far too small. However, I think it does suggest that more of the same kind of searching is unlikely to turn up anything.