Thursday, September 06, 2007

Beautiful People Revisited

This is yet another post on evolutionary psychology studies, this time on the subclass I call Evolutionary Psychology (EP), a political endeavor rather than a scientific one. Professor Satoshi Kanazawa is an ardent proponent of EP, with a large number of relevant studies under his belt. I earlier wrote a series of posts on his recent Psychology Today article (this link will take you to the last one, which links to the earlier ones).

Now Kanazawa (with Alan S. Miller) has come out with a new book about, among other things, why beautiful people tend to have more daughters than sons. The reasons are naturally to do with evolutionary psychology.

The snag is that Professor Kanazawa's studies have not actually shown that beautiful people have more daughters. Never mind, that is no hindrance to writing a book about the theory, I guess. But it should be a hindrance to arguing that empirical evidence supports his theory. It should also be a hindrance to the general popularization of Kanazawa's ideas as something supported by evidence. Not that any of these hindrances seem to have mattered much so far.

Professor Andrew Gelman has written an article on what is wrong with Kanazawa's empirical research into various EP topics. A shorthand way of understanding some of the problems can be gained from Professor Gelman's blog post on the "beautiful daughters" topic. The post was prompted by one of those popularizations, which supports the claim that Kanazawa has indeed found that beautiful parents have more daughters by simply listing some celebrities who are good-looking and also have daughters. Gelman's answer:

Actually, we looked up a few years of People Magazine's 50 most beautiful people, and they were as likely as anyone else to have boys:

One way to calibrate our thinking about Kanazawa's results is to collect more data. Every year, People magazine publishes a list of the fifty most beautiful people, and, because they are celebrities, it is not difficult to track down the sexes of their children, which we did for the years 1995–2000.

As of 2007, the 50 most beautiful people of 1995 had 32 girls and 24 boys, or 57.1% girls, which is 8.6 percentage points higher than the population frequency of 48.5%. This sounds like good news for the hypothesis. But the standard error is 0.5/sqrt(56) = 6.7%, so the discrepancy is not statistically significant. Let's get more data.

The 50 most beautiful people of 1996 had 45 girls and 35 boys: 56.2% girls, or 7.8% more than in the general population. Good news! Combining with 1995 yields 56.6% girls—8.1% more than expected—with a standard error of 4.3%, tantalizingly close to statistical significance. Let's continue to get some confirming evidence.

The 50 most beautiful people of 1997 had 24 girls and 35 boys—no, this goes in the wrong direction, let's keep going . . . For 1998, we have 21 girls and 25 boys, for 1999 we have 23 girls and 30 boys, and the class of 2000 has had 29 girls and 25 boys.

Putting all the years together and removing the duplicates, such as Brad Pitt, People's most beautiful people from 1995 to 2000 have had 157 girls out of 329 children, or 47.7% girls (with standard error 2.8%), a statistically insignificant 0.8% percentage points lower than the population frequency. So nothing much seems to be going on here. But if statistically insignificant effects with a standard error of 4.3% were considered acceptable, we could publish a paper every two years with the data from the latest "most beautiful people."
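The arithmetic in the quote is easy to check. Below is a minimal sketch in Python that recomputes the share of girls, the standard error, and a z-score as the years accumulate. The counts and the 48.5% population rate are the ones quoted above; the function name `summarize` and the z-score column are my own additions for illustration. Note that the per-year counts are raw, so the running totals still include people who appear on more than one list. Only the final line uses the deduplicated figure from the quote.

```python
import math

POPULATION_GIRL_RATE = 0.485  # population frequency of girls, as quoted

# (girls, boys) per list year, as quoted. These raw counts include
# people who appear on more than one year's list.
counts = {
    1995: (32, 24),
    1996: (45, 35),
    1997: (24, 35),
    1998: (21, 25),
    1999: (23, 30),
    2000: (29, 25),
}


def summarize(girls, total):
    """Return (share of girls, standard error, z-score vs. population rate)."""
    share = girls / total
    se = 0.5 / math.sqrt(total)  # the quote's approximation, sqrt(0.5 * 0.5 / n)
    z = (share - POPULATION_GIRL_RATE) / se
    return share, se, z


cum_girls = cum_total = 0
for year, (girls, boys) in counts.items():
    cum_girls += girls
    cum_total += girls + boys
    share, se, z = summarize(cum_girls, cum_total)
    print(f"1995-{year}: {100 * share:.1f}% girls, SE {100 * se:.1f}%, z = {z:+.2f}")

# Deduplicated total reported in the quote: 157 girls out of 329 children.
share, se, z = summarize(157, 329)
print(f"deduplicated: {100 * share:.1f}% girls, SE {100 * se:.1f}%, z = {z:+.2f}")
```

The 1995 sample alone gives a z-score of about 1.3, and 1995 and 1996 combined give about 1.9, just short of the conventional 1.96 cutoff. That is the "tantalizingly close" moment in the quote. Once all the years are in, the excess is gone.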

You might want to re-read that quote, because it's a very good example of why we are not supposed to pick data for studies by looking at it and selecting the bits that look good to us. Random sampling and large sample sizes are requirements which exist for a very good reason. In their absence it is very hard not to be guilty of data mining or data fishing, and once we start on that road we can "prove" an awfully large number of things.

A slightly different example might help in understanding some of these problems. Suppose that you want to prove how careful you are with money and how well you stay within your budget. You look at your old records for, say, ten years, and find that you have done much better during some years than other years. Wouldn't it be nifty if you could cut out some of those bad years from your study altogether? Yes, it probably would be nifty, but it would not be good statistics.
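The same trick of cutting out the bad years can be tried on the People magazine counts quoted earlier. The sketch below, again in Python, tries every possible subset of the six list years and keeps the one most favorable to the "more daughters" hypothesis. It uses the raw per-year counts from the quote (so duplicates across years are not removed), and the helper name `z_score` is mine. This is an illustration of cherry-picking, not something anyone is accused of having done.

```python
import math
from itertools import combinations

POPULATION_GIRL_RATE = 0.485  # population frequency of girls, as quoted

# (girls, boys) per list year, as quoted (duplicates across years not removed).
counts = {
    1995: (32, 24),
    1996: (45, 35),
    1997: (24, 35),
    1998: (21, 25),
    1999: (23, 30),
    2000: (29, 25),
}


def z_score(years):
    """z-score of the girls' share in the chosen years vs. the population rate."""
    girls = sum(counts[y][0] for y in years)
    total = sum(sum(counts[y]) for y in years)
    se = 0.5 / math.sqrt(total)  # same approximation as in the quote
    return (girls / total - POPULATION_GIRL_RATE) / se


all_years = tuple(sorted(counts))

# Every non-empty subset of years: this is the cherry-picker's menu.
subsets = [
    chosen
    for size in range(1, len(all_years) + 1)
    for chosen in combinations(all_years, size)
]
best = max(subsets, key=z_score)

print(f"all six years:  z = {z_score(all_years):+.2f}")
print(f"cherry-picked {best}: z = {z_score(best):+.2f}")
```

Keeping only 1995, 1996 and 2000 pushes the z-score just past 1.96, so the result would pass as "statistically significant" even though the full data show nothing of the kind. That is exactly why you don't get to choose your years after looking at them.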

Now, Professor Gelman does not argue that anyone is doing this sort of stuff. His point is that a weak statistical analysis should make people stop and think before generalizing the results to wider populations.

Gelman's pdf article, well worth reading even if you are not statistically trained, mentions several other statistical problems in the "speculative studies" Professor Kanazawa has carried out. A snippet from the end of the piece should whet your appetite (or wet it):

Why does this matter? Why are we wasting our time on a series of papers with statistical errors that happen not to have been noticed by reviewers for a fairly obscure journal? We have two reasons: first, as discussed in the next section, the statistical difficulties arise more generally with findings that are suggestive but not statistically significant. Second, as we discuss presently, the structure of scientific publication and media attention seem to have a biasing effect on social science research.

Before reaching Psychology Today and book publication, Kanazawa's findings received broad attention in the news media. For example, the popular Freakonomics blog (Dubner 2006) reported,

"a new study by Satoshi Kanazawa, an evolutionary psychologist at the London School of Economics, suggests...there are more beautiful women in the world than there are handsome men. Why? Kanazawa argues its because good-looking parents are 36% more likely to have a baby daughter as their first child than a baby son - which suggests, evolutionarily speaking, that beauty is a trait more valuable for women than for men. The study was conducted with data from 3,000 Americans, derived from the National Longitudinal Study of Adolescent Health, and was published in the Journal of Theoretical Biology."

ruling that such research is dangerous and out of bounds, an attitude deplored by Pinker (2007). The aforementioned Freakonomics article concluded, "It is good that Kanazawa is only a researcher and not, say, the president of Harvard. If he were, that last finding about scientists may have gotten him fired." It should be possible to criticize large unproven claims in biology and social science without dismissing the entire enterprise.

Gelman then points out that the "36% more likely" figure mentioned here isn't correct even if correctness is defined by the faulty findings of Kanazawa's actual study. But that's the figure the popularizations eagerly accepted.

Why am I writing about this particular topic again? Consider the facts: A new book by Kanazawa has just come out, a book with a title all about why beautiful people have more daughters. Yet Kanazawa's own research cannot even support the claim in that title. What's more, the discussions about the book are likely to just start with the assumption that Kanazawa must have the empirical support on his side. After all, anonymous reviewers approved his papers for publication! Science cannot err! And so on.

Well, anonymous reviewers are human beings, and anonymous reviewers for a journal that may itself lean towards EP may share the same underlying desire to see certain theories confirmed. Anonymous reviewers may also not be experts in statistics. More importantly (and as Professor Gelman also notes), no journal really wants to publish an article with the title "Beautiful People No More Likely To Have Daughters". I believe that the academic publishing process has an in-built bias against studies which appear to find no difference.

What it should have is an in-built bias against publishing iffy research.
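The bias against "no difference" findings can be illustrated with a small simulation. The sketch below is a toy model under assumptions of my own, not a description of any real journal: every study samples children from a world where the girl rate is 48.5% for everyone (so there is no beauty effect at all), and only studies that come out "significantly" in favor of more daughters get published. The study size of 136 children matches the combined 1995 and 1996 sample from the quote; the number of studies and the random seed are arbitrary.

```python
import math
import random

TRUE_GIRL_RATE = 0.485    # same rate for everyone: no real effect exists
CHILDREN_PER_STUDY = 136  # assumed; size of the combined 1995-96 sample
N_STUDIES = 2000          # assumed number of attempted studies

# Standard error using the same 0.5 / sqrt(n) approximation as the quote.
SE = 0.5 / math.sqrt(CHILDREN_PER_STUDY)


def run_study(rng):
    """Simulate one study; return its estimated excess of girls over 48.5%."""
    girls = sum(rng.random() < TRUE_GIRL_RATE for _ in range(CHILDREN_PER_STUDY))
    return girls / CHILDREN_PER_STUDY - TRUE_GIRL_RATE


rng = random.Random(0)  # fixed seed so the run is repeatable
excesses = [run_study(rng) for _ in range(N_STUDIES)]

# Toy "journal": publish only results significantly in favor of more girls.
published = [e for e in excesses if e / SE > 1.96]

mean_all = sum(excesses) / len(excesses)
print(f"studies run: {len(excesses)}, published: {len(published)}")
print(f"mean excess across all studies: {100 * mean_all:+.1f} points")
if published:
    mean_published = sum(published) / len(published)
    print(f"mean excess in published studies: {100 * mean_published:+.1f} points")
```

Averaged over everything that was run, the excess of girls is essentially zero, as it must be. But every study that clears the 1.96 bar shows an excess of more than eight percentage points, so a reader who sees only the published ones comes away convinced of an effect that does not exist.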