Thursday, November 11, 2004

Some Statistics on the Exit Poll Mystery

Steven Freeman of the University of Pennsylvania has written an interesting little study about the exit polls (a pdf file here). He asks and answers the following question: if the reported election results in Ohio, Pennsylvania and Florida were correct, how likely is it that exit-poll samples drawn from those true populations of votes would look like the ones we actually got in the same three states? This is how statisticians test hypotheses or theories. The idea is very simple: if it is extremely unlikely that the exit polls in those states reflect the same population of data as the reported election results, then our conclusion should be that they do not come from the same populations. In other words, either the exit polls were systematically off or the election results were.
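For the curious, here is a minimal sketch of that kind of test. The numbers are made up for illustration and are not Freeman's data, the function name is my own, and the whole thing assumes a simple random sample and the usual normal approximation.

```python
import math

def one_sided_p_value(poll_share, reported_share, sample_size):
    """Return (z, p): how surprising poll_share is if reported_share is true.

    Assumes a simple random sample and the normal approximation
    to the sampling distribution of a proportion.
    """
    # Standard error of a sample share when the true share is reported_share.
    std_error = math.sqrt(reported_share * (1 - reported_share) / sample_size)
    # Distance between the poll and the reported result, in standard errors.
    z = (poll_share - reported_share) / std_error
    # Upper-tail probability of the standard normal distribution at z.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical inputs for illustration only (not taken from the study).
z, p = one_sided_p_value(poll_share=0.52, reported_share=0.49, sample_size=2000)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

A small p means that a sample like the exit poll would rarely turn up by chance if the reported result were the truth.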

Freeman does the required calculations and finds that in each of the three states the test rejects the possibility that the exit polls describe the same universe as the final results (at the p = 0.01 level). He also writes:

The likelihood of any two of these statistical anomalies occurring together is on the order of one-in-a-million. The odds against all three occurring together are 250 million to one. As much as we can say in social science that something is impossible, it is impossible that the discrepancies between predicted and actual vote counts in the three critical battleground states of the 2004 election could have been due to chance or random error.
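The arithmetic behind odds like these is just multiplication, provided the sampling errors in the different states are independent of each other. A rough sketch follows, using made-up per-state probabilities of one in a thousand. They are not Freeman's figures; I picked them only because two of them multiplied together land on the "one-in-a-million" order of magnitude. The function name is hypothetical too.

```python
import math

def joint_probability(p_values):
    """Probability that all the events happen together, ASSUMING independence."""
    return math.prod(p_values)

# Hypothetical per-state probabilities, for illustration only.
per_state = [0.001, 0.001, 0.001]

any_two = joint_probability(per_state[:2])
all_three = joint_probability(per_state)

print(f"two together:   about 1 in {1 / any_two:,.0f}")
print(f"three together: about 1 in {1 / all_three:,.0f}")
```

If the errors are not independent (say, one polling method that misfires the same way everywhere), multiplying like this overstates how unlikely the combination is.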

So. I really enjoyed writing this, because statistics happens to be one of my many specialties! But if you didn't enjoy reading it as much, what it really says is that it's impossible for the exit polls to be this far off for the usual reasons that polls are off.

Instead, two other explanations need to be analyzed: either the exit polls were wrong for some reason that biased them all towards Kerry (such as rigging by Democrats to stop Republicans in the West from voting, or some odd refusal bias among Republican respondents), or the election results themselves are incorrect. Or both, I guess.

Added: After thinking about it, I believe that we can discount the refusal bias of Republicans as a possible explanation. Why would they refuse in only these states and not in others? It doesn't make sense. That leaves the theories that the Democrats rigged the exit polls or that the results themselves are wrong.

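I also don't know if Freeman's use of a random sample model is justifiable. The exit polls use some kind of cluster sampling (voters are interviewed at a sample of precincts rather than drawn one by one from the whole state), and clustering usually makes the margin of error wider than a simple random sample of the same size would give. But it's hard to see how the figures would change enough to change his conclusions. Though who knows.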
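One standard way to get a feel for this is the "design effect": the factor by which clustering inflates the sampling variance compared with a simple random sample. I don't know what the design effect of these exit polls was, so the values below are pure guesses, and the poll numbers are again made up rather than Freeman's. The sketch only shows how the test statistic shrinks as the design effect grows.

```python
import math

def clustered_z(poll_share, reported_share, sample_size, design_effect):
    """z statistic after inflating the variance by a design effect.

    design_effect = 1.0 reproduces the simple-random-sample case.
    """
    srs_error = math.sqrt(reported_share * (1 - reported_share) / sample_size)
    # The variance scales with the design effect, so the standard
    # error scales with its square root.
    clustered_error = srs_error * math.sqrt(design_effect)
    return (poll_share - reported_share) / clustered_error

# Hypothetical poll numbers and guessed design effects, for illustration only.
for deff in (1.0, 1.5, 2.0):
    z = clustered_z(0.52, 0.49, 2000, deff)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided normal tail probability
    print(f"design effect {deff:.1f}: z = {z:.2f}, one-sided p = {p:.4f}")
```

Whether a correction like this would matter for Freeman's actual figures depends on the real design effect and on how large his discrepancies are, which is exactly the part I can't check.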