Maybe we should put rats in charge of foreign aid research

Laboratory experiments show that rats outperform humans in interpreting data, which is why we have the US aid agency known as the Millennium Challenge Corporation today. Wait, I am getting ahead of myself; let me explain.

The amazing finding on rats is described in an equally amazing book by Leonard Mlodinow. The experiment consists of drawing green and red balls at random, with the probabilities rigged so that green comes up 75 percent of the time. The subject watches for a while and is then asked to predict whether the next ball will be green or red. The rats followed the optimal strategy of always predicting green (I am a little unclear how the rats communicated their predictions, but never mind). The human subjects, however, did not always predict green; they tried to do better and guess when red would come up too, engaging in reasoning like “after three straight greens, we are due for a red.” As Mlodinow says, “humans usually try to guess the pattern, and in the process we allow ourselves to be outperformed by a rat.”
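If you want to see why always guessing green wins, here is a minimal simulation sketch. The 75 percent figure comes from the experiment; everything else, including the “probability matching” model of the human guesser, is my own simplification:

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 100_000
p_green = 0.75

# True sequence of draws: True = green, False = red
draws = rng.random(n_draws) < p_green

# "Rat" strategy: always predict green
rat_accuracy = draws.mean()

# Stylized "human" strategy: probability matching --
# predict green about 75% of the time, red the rest
human_guesses = rng.random(n_draws) < p_green
human_accuracy = (human_guesses == draws).mean()

print(f"rat accuracy:   {rat_accuracy:.3f}")    # about 0.75
print(f"human accuracy: {human_accuracy:.3f}")  # about 0.75*0.75 + 0.25*0.25 = 0.625
```

The arithmetic is the whole point: trying to call the reds only pays off if you can actually predict them, and when the sequence is random you cannot.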

Unfortunately, spurious patterns show up in some important real-world settings, like research on the effect of foreign aid on growth. Without going into unnecessary technical detail, such research looks for an association between economic growth and some measure of foreign aid, controlling for other likely determinants of growth. Of course, since there is some random variation in both growth and aid, there is always the possibility that an association appears by pure chance. The usual statistical procedures are designed to keep this possibility small. The convention is that we believe a result if there is only a 1 in 20 chance that it could have arisen at random. So if a researcher does a study that finds a positive effect of aid on growth and it passes this “1 in 20” test (referred to as a “statistically significant” result), we are fine, right?
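For the non-economists, here is a rough sketch of what that convention amounts to in practice, using made-up data rather than any real aid or growth figures (and a simple two-variable regression instead of the full battery of control variables):

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(1)
n_countries = 60

# Hypothetical data: growth is pure noise, unrelated to aid by construction
aid_gdp = rng.normal(size=n_countries)
growth = rng.normal(size=n_countries)

result = linregress(aid_gdp, growth)

# The "1 in 20" convention: call the association significant if p < 0.05
print(f"slope = {result.slope:.3f}, p-value = {result.pvalue:.3f}")
print("significant at 5%" if result.pvalue < 0.05 else "not significant")
```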

Alas, not so fast. A researcher is very eager to find a result, and such eagerness usually involves running many statistical exercises (known as “regressions”). But the 1 in 20 safeguard only applies if you did ONE regression. What if you did 20 regressions? Even if there is no relationship between growth and aid whatsoever, on average you will get one “significant result” out of 20 by pure chance. Suppose you report only the one significant result and don’t mention the other 19 unsuccessful attempts. You can easily run twenty different regressions by varying the definition of aid, the time periods, and the control variables. In aid research, the aid variable has been tried, among other ways, as aid per capita, logarithm of aid per capita, aid/GDP, logarithm of aid/GDP, aid/GDP squared, [log(aid/GDP) - aid loan repayments], aid/GDP*[average of indexes of budget deficit/GDP, inflation, and free trade], aid/GDP squared*[average of indexes of budget deficit/GDP, inflation, and free trade], aid/GDP*[quality of institutions], etc. Time periods have varied from averages over 24 years to 12 years to 8 years to 4 years. The list of possible control variables is endless. One of the most exotic I ever saw was: the probability that two individuals in a country belonged to different ethnic groups TIMES the number of political assassinations in that country. So it’s not so hard to run many different aid and growth regressions and report only the one that is “significant.”
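The “1 in 20” arithmetic is easy to see in a simulation. The sketch below is mine, not anyone’s actual aid study: it tries 20 hypothetical “aid” variables that are pure noise against a “growth” variable that is also pure noise, and on average about one of the 20 specifications will still clear the 5 percent bar:

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(2)
n_countries, n_specifications = 60, 20

growth = rng.normal(size=n_countries)

significant = []
for spec in range(n_specifications):
    # Each "specification" is a different (here: randomly generated) aid measure,
    # standing in for aid per capita, log aid/GDP, aid*policy, and so on
    aid_measure = rng.normal(size=n_countries)
    p = linregress(aid_measure, growth).pvalue
    if p < 0.05:
        significant.append((spec, round(p, 3)))

# Reporting only the winners and forgetting the losers is exactly the problem
print(f"'significant' specifications out of {n_specifications}: {significant}")
```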

This practice is known as “data mining.” It is NOT acceptable practice, but it is very hard to police, since nobody is watching when a researcher runs multiple regressions. It is seldom intentional dishonesty by the researcher. Because of our non-rat-like propensity to see patterns everywhere, it is easy for researchers to convince themselves that the failed exercises were just done incorrectly, and that they finally found the “real result” when they get the “significant” one. Even more insidious, the 20 regressions could be spread across 20 different researchers. Each of them obediently runs only one pre-specified regression; the 19 who find nothing never publish a paper, while the 20th publishes the spuriously “significant” finding (this is known as “publication bias”).

But don’t give up on all damned lies and statistics just yet: there ARE ways to catch data mining. A “significant result” that is really spurious will only hold in the original data sample, with the original time periods, and with the original specification. If new data become available as time passes, you can test the result on the new data, where it will vanish if it was spurious “data mining.” You can also try different time periods, or slightly different but equally plausible definitions of aid and the control variables.
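Here is a minimal sketch of the “new data” test, continuing the made-up example above: dredge the original sample for a specification that looks significant, then re-run that same specification on fresh, independent data and watch the result evaporate.

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(3)
n_countries, n_specs = 60, 20

# Original sample: growth and all candidate aid measures are pure noise
growth_old = rng.normal(size=n_countries)
aid_old = rng.normal(size=(n_specs, n_countries))

# Step 1: "data mine" the original sample for significant specifications
mined = [s for s in range(n_specs)
         if linregress(aid_old[s], growth_old).pvalue < 0.05]

# Step 2: re-test the mined specification(s) on new, independent data
growth_new = rng.normal(size=n_countries)
aid_new = rng.normal(size=(n_specs, n_countries))

if not mined:
    print("no specification cleared 5% in the original sample this time")
for s in mined:
    p_new = linregress(aid_new[s], growth_new).pvalue
    print(f"spec {s}: 'significant' in old data, p-value in new data = {p_new:.3f}")
```

A genuine relationship should survive this re-test; a mined one will, on average, look just like noise again.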

So a few years ago, some World Bank research found that “aid works [raises economic growth] in a good policy environment.” This study got published in a premier journal, got huge publicity, and eventually led President George W. Bush (in his only known use of econometric research) to create the Millennium Challenge Corporation, set up precisely to direct aid to countries with “good policy environments.”

Unfortunately, this result later turned out to fail the data mining tests. Subsequent published studies found that it failed the “new data” test, the different time periods test, and the slightly different specifications test.

The original result that “aid works in a good policy environment” was a spurious association. Of course, the MCC is still operating; it may be good or bad for other reasons.

Moral of the story: beware of these kinds of statistical “results” being used to determine aid policy! Unfortunately, the media and policy community don’t really get this, and they take the original studies at face value (not only on aid and growth, but also in work on determinants of civil war, fixing failed states, peacekeeping, democracy, etc., etc.). At the very least, make sure the finding is replicated by other researchers and passes the “data mining” tests.

In other news, anti-gay topless Christian Miss California could be a candidate for a new STD prevention campaign telling all right-wing values advocates: “abstain, or the left-wing media will catch you.”