conjugateprior

On the use and abuse of weasels in science journalism


Dean Burnett writes a column in the Guardian, sometimes about science but more entertainingly on pseudo-, wannabe-, and not-actually- science. Most of the time this is good BS-shovelling fun and I recommend it. Unfortunately today we get some ill-considered overreach under the guise of shovelling.

The subject is a silly equation purporting to define how depressing any day of the year is, and thereby to identify the most depressing one. It is sufficiently silly that it doesn’t deserve a link, has no redeeming features and if you’ve not read it yet you’re just lucky. He’s right. It is nonsense.

The arguments are more interesting, if a bit alarming. To put it bluntly: if they were sound they would sink all regression models about any social issue of any interest to anybody ever.

Let’s get this out in the open first: We’re talking about regression equations - the ones that tell you how the average level of something changes in different circumstances, if you happen to believe them. Here’s Burnett:

The equation itself is farcical. It includes variables like “time since Christmas”, “weather”, “debt level”, “motivational levels”, “time since failure to keep new year’s resolution” and numerous other things that aren’t part of the metric system. Even if most of these weren’t nonsensical measurements (how do you determine the motivation of everyone in the population?), they’re not compatible. How do you quantifiably combine “time since Christmas” with “weather”? You can’t.

There then follows a joke about a weasel. I like weasels. And stoats too, though I find them hard to tell apart. (Qualified biologists assured my dad that “weasels are weasely recognised, but stoats are stoatally different”. I pass on this information as a public service.) But that’s about as relevant as the joke is to the quote above. Let’s instead consider these claims as if they were serious criticisms, starting with the easy ones.

‘Time since Christmas’ seems like an eminently reasonable variable. ‘Time since X’ and ‘Time to X’ are building blocks of medical risk analysis and I’ve seen ‘time since the last election’ applied to public opinion data about governments. Not metric, I suppose, but so what? Not every quantity gets to have a reference example archived in Paris.

‘Debt level’ is similarly straightforward. The Guardian quite often writes about levels of personal and household debt and their social consequences. It might be surprised to hear it was reporting about something definitionally nonsensical.

OK, how about ‘motivational levels […] (how do you determine the motivation of everyone in the population)’. Glad you asked. You don’t. That’s because you’re interested in average motivational level and for that you only need a representative sample. Fortunately the basic recipe for making such a thing and measuring this variable turns up shortly; paragraph 7, meet paragraphs 10, 11 and 12. But that’s probably enough about ‘nonsensical measurements’. How about their combination?
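For the sceptical, here is a minimal sketch of the sampling point. Everything in it is invented for illustration: the population, its 0 to 100 'motivation' scale, and the sample size are assumptions of mine, not anything from the article or from Burnett. It only shows that a simple random sample gets you close to the population average, and tells you roughly how close.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: one made-up motivation score (0-100) per person.
population = [min(100.0, max(0.0, random.gauss(55, 15))) for _ in range(100_000)]

# A simple random sample: every person has the same chance of being picked.
sample = random.sample(population, 1_000)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

# Standard error of the sample mean: roughly how far off we expect to be.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f} (standard error {standard_error:.2f})")
```

In real life you never see the population mean, which is the whole point: the sample mean and its standard error are what you get to report.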

Apparently these variables are ‘not compatible’. What could that possibly mean? Burnett would perhaps agree that I (yes me, personally) have a level of motivation and a level of debt, and that there is a time since I failed some new year’s resolution, that there’s weather of some kind outside, etc. Fine.

All such equations assert is that folk who share a particular combination of attributes have, on average, the same level of happiness. So are these variables ‘compatible’? To the extent they pick out a bunch of people from the population, sure. What’s to go wrong? So far we’re just imagining possible categories of people.

Consequently, now is a good time to “quantifiably combine ‘time since Christmas’ with ‘weather’”. How about this: For every combination of times since Christmas and types of weather there’s an average level of happiness in the population. For combinations that don’t have anybody in them there’s a level of happiness that people would have had if it had been further from or closer to Christmas, and there had been different weather. If the equation describes these levels, it’s right. If it doesn’t, it isn’t. Even when it is, we are never quite sure of the levels because we get our information from a sample. Welcome to multivariate regression analysis. Hot fast sandy weasels as far as the eye can see.
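To make the combination concrete, here is a rough sketch of such a regression on simulated data. The 'true' coefficients, the happiness scale, the rainy/not-rainy coding of weather, and the helper `fit_ols` are all assumptions made up for this example; nothing here comes from the equation Burnett is criticising. The fit is ordinary least squares via the normal equations, written out by hand so that it needs nothing beyond the standard library.

```python
import random

def fit_ols(rows, ys):
    """Ordinary least squares: solve the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]  # augmented matrix
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        rest = sum(aug[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (aug[i][k] - rest) / aug[i][i]
    return beta

random.seed(2)
# Invented "true" relationship, used only to simulate a sample.
TRUE_BASE, TRUE_PER_DAY, TRUE_RAIN = 5.0, 0.02, -1.0

rows, happiness = [], []
for _ in range(2000):
    days = random.randint(1, 60)                # days since Christmas
    rainy = 1 if random.random() < 0.4 else 0   # weather as a 0/1 indicator
    noise = random.gauss(0, 1)                  # everything not in the equation
    rows.append([1.0, float(days), float(rainy)])
    happiness.append(TRUE_BASE + TRUE_PER_DAY * days + TRUE_RAIN * rainy + noise)

base, per_day, rain = fit_ols(rows, happiness)

# Average happiness for a combination nobody in the sample was in:
# 90 days after Christmas, in the rain.
predicted = base + per_day * 90 + rain * 1
print(f"base={base:.2f} per_day={per_day:.3f} rain={rain:.2f} predicted={predicted:.2f}")
```

The estimates land near the invented values but not exactly on them, which is the 'never quite sure' part. Days and raininess are combined simply by each getting its own coefficient; no unit conversion is required.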

But what about all the other things that affect mood that aren’t in the equation? Burnett assumes, very reasonably, that every year people get their mood pushed up or down by a lot of different things. Moreover

[…] The next year people would experience a whole different set of variables that affect mood. So if you’re looking for a persistent depressing day, you’d have to repeat this procedure for the next 30 years or so (at a guess). Of course, using the same subjects would help, but they’d grow older (and many would die) over this time scale, so you’d have to find some way of introducing new subjects while keeping the data consistent.

Indeed. That ‘some way’ is called longitudinal and panel data analysis, or sometimes time-series cross-sectional if you do political science. It’s the sort of thing that keeps econometricians awake at nights. Fortunately, just across the water from Burnett’s home university is the Bristol University Centre for Multilevel Modelling which specialises in this sort of thing, in the event of a local shortage. In short, social scientists build these sorts of model all the time. Would you like your hot sandy weasels with fixed or random effects?
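For a taste of the fixed effects option, here is a toy sketch on a simulated panel. The variables ('debt', 'mood', an unobserved 'temperament'), the numbers, and the helper `slope` are all hypothetical. It compares a pooled regression, which ignores who is who, with the within (fixed effects) estimator, which subtracts each person's own averages before fitting. This is an illustration of one standard technique, not a claim about how anybody should model depressing days.

```python
import random
import statistics

def slope(xs, ys):
    """Slope of a one-variable least squares fit of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

random.seed(3)
TRUE_SLOPE = 0.5  # invented effect of debt on mood

# Hypothetical panel: 200 people, each observed in 10 waves.
panel = {}
for person in range(200):
    temperament = random.gauss(0, 2)  # stable, unobserved, affects both variables
    waves = []
    for _ in range(10):
        debt = temperament + random.gauss(0, 1)
        mood = 1.0 + TRUE_SLOPE * debt + temperament + random.gauss(0, 0.5)
        waves.append((debt, mood))
    panel[person] = waves

# Pooled fit: throw all person-waves together and ignore who they belong to.
all_debt = [d for waves in panel.values() for d, _ in waves]
all_mood = [m for waves in panel.values() for _, m in waves]
pooled = slope(all_debt, all_mood)

# Within (fixed effects) fit: subtract each person's own means first, which
# removes anything about them that stays constant across waves.
debt_within, mood_within = [], []
for waves in panel.values():
    mean_debt = statistics.fmean([d for d, _ in waves])
    mean_mood = statistics.fmean([m for _, m in waves])
    for d, m in waves:
        debt_within.append(d - mean_debt)
        mood_within.append(m - mean_mood)
within = slope(debt_within, mood_within)

print(f"true={TRUE_SLOPE} pooled={pooled:.2f} within={within:.2f}")
```

Because temperament drives both variables here, the pooled slope comes out far too large, while the within slope sits close to the invented value. A random effects model would instead treat the personal differences as draws from a distribution, which is only safe when they are unrelated to the predictors.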

Of all this we should ask two questions.

First, how could this strategy have seemed so obviously ridiculous to an otherwise clued-up science blogger? Well, the initial trigger was an article that was very ridiculous, so maybe some of the ridiculous leaked. This is my best guess. Another reason might be this sort of thing, where the answer to the question ‘how hard could this be’ is always ‘provably impossible’ or ‘trivially easy’ (and sometimes both) when looked at from the natural sciences.

Second, and more importantly, is it actually a problem? Perhaps a little critical overreach only serves to more thoroughly debunk the nonsense.

Maybe. But I doubt it.

If it’s not clear to somebody why the conclusions from one cross-sectional study on, say, abortion and breast cancer, or on autism and MMR are different from a stack of longitudinal ones on the same topic that contradict it, it’s just not enough to say ‘scientists (now) believe’. Some such studies just have more problems than others. Hint: it’s not because the variables “aren’t compatible”.

Worse, “scientists (now) believe” sounds just like an argument from authority. Because it is. So people count the studies or they pick the ones they like.

Many of us are trying to make science more accessible to the general public by making journal articles easily accessible and data available for download. If we succeed then that public is going to need a clear idea of how to evaluate the things it gets hold of. Actually, I suspect they’re already pretty good; I’ll wager most people recognise the article Burnett is criticising for the fluff that it is. But they don’t necessarily know why. That’s a good task for a science writer.



License

This page has an open-source license (Creative Commons BY-NC-ND)
