Art Works Blog

Taking Note: The Third Kind of Lie

Among the many information requests our office fields in a typical week, people occasionally ask us to reconcile sets of research findings that, at first glance, may seem incompatible.

This happened again quite recently; only this time, one of the data points in question was ours. The NEA's Survey of Public Participation in the Arts (SPPA) reported last year that the share of the U.S. population who visited an art museum or gallery declined from 22.7 percent in 2008 to 21.0 percent in 2012. The second finding came from the Annual Condition of Museums and the Economy (ACME) survey. Conducted by the American Alliance of Museums, the study concluded that "American museums served more visitors in 2012 than the year before."

This discrepancy reminded me of a quote commonly attributed to Mark Twain (which in turn he attributes to Benjamin Disraeli): "There are three kinds of lies: lies, damned lies, and statistics." How can it be the case that fewer Americans are visiting art museums or galleries and that American museums are serving more visitors? Is someone lying?

While it is certainly true that folks sometimes prevaricate with numbers, here the answer is much simpler, and more benign. Even better, the answer provides us with an opportunity to talk about some of the key cautions that you, as a consumer of research, can keep in mind when reading, writing, or talking about that research. And so, reflecting on the kinds of research-related inquiries we commonly address at the NEA, I'd like to flag a couple of things to bear in mind when using research results.

1. Read the fine print

I know, I know, you're too busy to spend a lot of time looking at the details of how the research was done. But the unfortunate reality is that, if you really want to put research results in perspective, you've got to read the fine print. We've all seen examples of how the fine print on a graph can make a huge difference in the interpretation of the underlying data. For example, one of the headline numbers we presented in last year's report on the 2012 SPPA came in the form of a graph showing overall participation in the "benchmark" arts across the years. Here is the graphic we used in the publication:

[Graph: Benchmark arts participation rates, 1982 to 2012, plotted on a 0 to 50 percent scale]

The numbers in this chart suggest a slow but steady decline in these forms of arts participation over the past 20 years. For the moment, I'm putting aside the nuances we explore in the report (details, for example, about the types of arts participation that have increased for demographic subgroups) in favor of focusing on the way we presented these numbers. If you take a close look, you'll notice that the scale of the graphic goes from 0 percent to 50 percent participation. What message would we have sent if we had chosen a different scale?

 

[Figure: The same participation data plotted on two different scales, 32 to 42 percent on the left and 0 to 100 percent on the right]

On the left, you see what the graph would have looked like if we had limited the scale to a range of 32 to 42 percent participation. Suddenly, what was a slow but steady decline begins to look more like a collapse. Alternatively, if we had chosen the scale on the right, running from zero to 100 percent participation, the decline is minimized almost to the point of becoming imperceptible. And remember, this graphic manipulation doesn't even begin to take into account the discussion of changes in the ways that Americans participate in the arts, much less the change in the U.S. population between 1982 and 2012.
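To make the scale point concrete, here is a minimal sketch in Python (using matplotlib) that plots a single invented participation series on the three scales discussed above. The figures are placeholders for illustration only, not the actual SPPA estimates.

```python
# Illustrative only: the same made-up participation series on three y-axis scales.
# These figures are placeholders, not actual SPPA estimates.
import matplotlib.pyplot as plt

years = [1982, 1992, 2002, 2008, 2012]
participation = [39.0, 41.0, 39.4, 36.3, 33.4]  # hypothetical percentages

scales = [(0, 50), (32, 42), (0, 100)]
titles = ["0-50% scale (as published)",
          "32-42% scale (looks like a collapse)",
          "0-100% scale (looks nearly flat)"]

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))
for ax, (ymin, ymax), title in zip(axes, scales, titles):
    ax.plot(years, participation, marker="o")
    ax.set_ylim(ymin, ymax)          # only the axis limits change between panels
    ax.set_title(title)
    ax.set_ylabel("Participation (%)")

plt.tight_layout()
plt.show()
```

The data never change; only the axis limits do, yet the visual impression shifts from collapse to near-stability.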

Just as it's important to read the fine print when confronted with graphical presentations, so it is with narrative. To return to the question that motivated this discussion, several key differences between the SPPA and the ACME survey make it clear that although they cover the same topic, it would not be appropriate to make a straightforward comparison between findings from the two.

For example, the SPPA provides data about the behavior of individuals, whereas the ACME survey provides data about attendance at institutions. It is entirely possible that a smaller share of the population attended an event in the past year while, at the same time, overall attendance at institutions increased. To take another example, the SPPA asks about attendance at art museums or galleries, while the ACME includes data on everything from art museums to military museums to zoos. You don't have to be an expert in statistics to be an informed user of research; you do, however, have to read the fine print.
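Here is one hypothetical way the first point can play out (setting aside the other differences in coverage): the share of people who attend can fall while total visits rise, if those who do attend go more often. All of the numbers below are invented for illustration.

```python
# Invented numbers only: fewer attendees, yet more total visits,
# because the remaining attendees visit more frequently.
population = 240_000_000  # illustrative adult population

# Year 1: 22.7% of the population attends, averaging 2.0 visits each
attendees_y1 = 0.227 * population
visits_y1 = attendees_y1 * 2.0

# Year 2: only 21.0% attends, but they average 2.3 visits each
attendees_y2 = 0.210 * population
visits_y2 = attendees_y2 * 2.3

print(f"Attendees: {attendees_y1:,.0f} -> {attendees_y2:,.0f}")  # share of people falls
print(f"Visits:    {visits_y1:,.0f} -> {visits_y2:,.0f}")        # total visits still rise
```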

2. Correlation and causation

If you've made it this far, you have likely heard some version of the critique/disclaimer that correlation does not imply causation. In fact, you've probably heard it enough to have stopped thinking about what it really means.

Too often, I think, we resort to one of two extremes when it comes to research findings that use correlation to suggest a causal relationship between variables. When we don't like the researcher's findings, we may bludgeon the results with our handy "Correlation does not imply causation!" club. When we do like the researcher's findings, we may casually write the warning off as a formality, a mantra that all good researchers must recite. In either case, we then go on our merry way, confident that there is no need to update our beliefs about the subject under discussion.

In reality, there is plenty to be learned from and asked about correlations between variables, and we should use "Correlation does not imply causation" as a reminder to pause and think about what we can learn. We might start by asking exactly what the potential pitfalls are in interpreting the relationship a researcher presents to us. Two critiques or disclaimers in particular are worth understanding in any research finding about correlation: reverse causality and endogeneity.

Reverse causality refers to the idea that the causal relationship between two variables is the reverse of what a researcher claims. For example, we might reasonably expect to observe a positive relationship between income and ownership of expensive homes (a positive relationship means that higher values of one variable tend to be associated with higher values of the other). We might suspect reverse causality if someone told us they had concluded from this relationship that buying a more expensive home causes one's income to go up.
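One reason the data alone cannot settle the direction of the arrow is that correlation is symmetric: income-versus-home-value and home-value-versus-income produce exactly the same coefficient. A small simulated sketch, with invented income and home-value figures:

```python
# Simulated data only: correlation cannot distinguish "income drives home value"
# from "home value drives income," because the coefficient is symmetric.
import numpy as np

rng = np.random.default_rng(0)
income = rng.normal(60_000, 15_000, size=1_000)               # invented incomes
home_value = 4 * income + rng.normal(0, 60_000, size=1_000)   # invented home values

print(f"corr(income, home value): {np.corrcoef(income, home_value)[0, 1]:.3f}")
print(f"corr(home value, income): {np.corrcoef(home_value, income)[0, 1]:.3f}")
# The two numbers are identical; the direction of causation is invisible here.
```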

Endogeneity, on the other hand, refers to the idea that there is likely to be a third, unobserved variable that is the root cause of both the observed variables. For example, we might observe that there is a positive relationship between the number of firefighters who respond to an alarm and the damage done by the fire that leads to the alarm. If we were then told that we should take this finding as evidence that firefighters habitually cause property damage, we might respond by noting that it is more likely that larger fires cause both increased property damage and the presence of larger numbers of firefighters.
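To see that intuition in miniature, here is a simulation with invented numbers: a hidden "fire size" variable drives both the number of firefighters and the amount of damage, producing a strong raw correlation between the two that largely disappears once fire size is controlled for.

```python
# Simulated example of endogeneity: fire size (the unobserved third variable)
# causes both the firefighter count and the damage. All numbers are invented.
import numpy as np

rng = np.random.default_rng(1)
fire_size = rng.exponential(scale=5.0, size=2_000)             # the hidden cause
firefighters = 2 * fire_size + rng.normal(0, 1, size=2_000)    # bigger fires, more crews
damage = 10 * fire_size + rng.normal(0, 5, size=2_000)         # bigger fires, more damage

print(f"raw corr(firefighters, damage): {np.corrcoef(firefighters, damage)[0, 1]:.2f}")

# Control for fire size by correlating the residuals after regressing each
# variable on fire size; what remains is close to zero.
resid_ff = firefighters - np.polyval(np.polyfit(fire_size, firefighters, 1), fire_size)
resid_dmg = damage - np.polyval(np.polyfit(fire_size, damage, 1), fire_size)
print(f"corr after controlling for fire size: {np.corrcoef(resid_ff, resid_dmg)[0, 1]:.2f}")
```

The raw correlation is strong, but once the common cause is held fixed, the apparent firefighter-damage link evaporates.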

My point is that we should use such critiques or warnings wisely rather than dismissively. As the comic strip below suggests, strong correlations often mean there is something there worth paying attention to, even if we can't be completely certain about its meaning. In my experience, most researchers are at pains to account as well as possible for these kinds of threats to inference about the relationship between variables. If you can think of a reverse causality or endogeneity argument against a finding, chances are that the researcher making the case has also thought about that argument and done as much as possible to address it. A simple, reflexive resort to "Correlation does not imply causation" is all too easy.

[Comic strip from xkcd.com]

Correlation does not imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing "look over there." This comic originally appeared on xkcd.com and is used under a Creative Commons license.

 

I could go on, and I think I will. Future installments of this blog series may include musings on a variety of questions and issues you may wish to consider when encountering research findings for the first time. Missing data. The ecological fallacy. Representativeness of samples. Clustering. Indeed, there are countless factors that a good research project should take into account. And, the Twain/Disraeli quote notwithstanding, these issues are not limited to projects that involve numbers. Reading the fine print and thinking about what is behind correlations is just as important in a careful reading of history as it is in a regression. It isn't always easy, but it is usually rewarding.
