There is so much coronavirus research coming out every day that it’s hard to keep track of even a tiny bit of it, never mind everything. The same is true for COVID-19 news — there’s just so much new information that it’s hard to stay up to date with it all, no matter how hard you might try.
It’s a bit like fighting a raging wildfire using your SuperSoaker 300. It’s never going to work, and you’ll eventually get burned.
One problem that keeps coming up is something that every scientist knows, but that can be very counter-intuitive: the ecological fallacy. It’s present in arguments both for and against masks, it has undermined much of the discussion about vitamin D and coronavirus, and it’s just generally a problem for many of the points made in the media about COVID-19.
So what is the ecological fallacy, and why is it a problem? Let’s dig in.
While it may sound like biology, the ecological fallacy is actually about large populations rather than the biophysical environment. It is so simple, and yet such an easy trap, that pretty much everyone has fallen into it at one point or another.
The basic idea of the fallacy is this: you cannot directly infer the properties of individuals from the average of a group. Sounds complicated, but what that means is that if you measure something about lots of people — say, height — you can’t take the average measurement as an indication of any particular person’s status.
There’s a really simple example of this to do with means, or averages. Imagine you’ve got two groups of ten people, A and B. Group A has an average height of 170cm, and group B has an average height of 168cm. If you randomly select one person from each group, who is more likely to be taller, someone from group A or B?
The intuitive reaction is to say that someone from A is going to be taller than someone from B, because the mean height is higher. However, this is not necessarily true. You can have a mean height of 170cm caused by two 200cm giants and eight 162.5cm people, and a mean of 168cm with six 170cm people and four 165cm people. In this case, 80% of group A is shorter than everyone in group B, which means that four times out of five the person you pick from group B will be the taller one.
In other words, the average of a group isn’t always representative of the individuals.
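The two groups described above can be checked directly. The snippet below is a minimal sketch in Python that uses only the heights given in the example; the variable and function names are made up for illustration. It computes each group’s mean and then the exact chance that a randomly chosen member of one group is taller than a randomly chosen member of the other.

```python
from fractions import Fraction
from itertools import product
from statistics import mean

# Heights in cm, taken straight from the example in the text.
group_a = [200.0] * 2 + [162.5] * 8   # two giants, eight shorter people
group_b = [170.0] * 6 + [165.0] * 4

def prob_first_taller(first, second):
    """Exact probability that a random member of `first` is taller
    than a random member of `second` (every pairing equally likely)."""
    pairs = list(product(first, second))
    wins = sum(1 for x, y in pairs if x > y)
    return Fraction(wins, len(pairs))

mean_a = mean(group_a)
mean_b = mean(group_b)
p_a_taller = prob_first_taller(group_a, group_b)
p_b_taller = prob_first_taller(group_b, group_a)

print(f"mean A = {mean_a} cm, mean B = {mean_b} cm")
print(f"P(A taller) = {p_a_taller}, P(B taller) = {p_b_taller}")
```

Group A has the higher mean, yet the person drawn from group B comes out taller in 80 of the 100 possible pairings. The mean is pulled up by two people and says little about the other eight.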
That’s the ecological fallacy in a nutshell. There are dozens of examples, many of them to do with countries and states. It commonly pops up in nutritional epidemiology — if we do a study and find that people who eat vegetarian diets are more likely to be depressed, it actually tells us very little about an individual vegetarian and their risk of depression. Similarly, even though people who eat more red meat tend to be less healthy, we can’t necessarily say that at an individual level eating more red meat is a good or bad thing.
Fallacies and COVID-19
So how much does this affect coronavirus evidence in the news? It turns out, quite a lot. While headlines abound about countries with more vitamin D doing better, we still have very little idea about whether vitamin D actually does anything for people with the disease. Similarly, the number of stories about why you should wear masks that come down to a study comparing one region where people wore masks and one where they didn’t is almost endless at this point. If it were as easy as comparing two places and drawing a conclusion about a single policy, we would’ve had an answer months ago.
Instead, we are still seeing daily debates about masks in the news, because this stuff is pretty complex.
And the issues don’t stop there. Have you heard that lockdowns are good because death rates are lower in one place than another? Have you heard that they are bad with almost the same argument used? Both of these are examples of the ecological fallacy, where people are aggregating both deaths and pandemic responses at a country level, without acknowledging that individual areas within nations have very different pandemic experiences and this might be important in interpreting their overall result.
A very common example of this was the use of deaths per million at the start of the pandemic. This was very misleading, because it always made big countries look good: they had lots of coronavirus but, due to the dynamics of exponential growth, not many infections or deaths per capita yet. Now that we’ve progressed to a later stage of COVID-19, it’s becoming clear that those places that looked really good on the deaths/million scale may actually have been doing badly all this time. Aggregating at the population level made it seem like some places weren’t in trouble, even though they had a worrying explosion of cases.
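To see why deaths per million can flatter a big country early on, here is a rough sketch in Python. Everything in it is a made-up assumption for illustration, not real data: two hypothetical countries whose outbreaks start on the same day and double at the same rate, so they have the same number of deaths, but with very different populations. The doubling time, the populations and the function names are all invented.

```python
def cumulative_deaths(day, seed_deaths=1.0, doubling_days=3.0):
    """Hypothetical early-epidemic curve: deaths double every
    `doubling_days` days, regardless of how big the country is."""
    return seed_deaths * 2 ** (day / doubling_days)

def deaths_per_million(deaths, population):
    return deaths / population * 1_000_000

# Made-up populations for a small and a large country.
small_pop = 5_000_000
large_pop = 300_000_000

day = 30
deaths = cumulative_deaths(day)  # identical outbreak in both countries

small_rate = deaths_per_million(deaths, small_pop)
large_rate = deaths_per_million(deaths, large_pop)

print(f"day {day}: {deaths:.0f} deaths in each country")
print(f"small country: {small_rate:.1f} deaths per million")
print(f"large country: {large_rate:.1f} deaths per million")
```

Under these assumptions the two outbreaks are identical, but the large country’s rate per million is 60 times lower simply because the same number is being divided by a bigger population. The per-capita figure says nothing about how fast either outbreak is growing.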
The point here is not that we cannot draw any conclusions about COVID-19 from country-level data; it’s that these comparisons are complex and can be misleading. It may seem intuitive to say that, if masks appear to reduce an area’s increase in COVID-19 cases by 40%, they will do the same for your own risk of the disease, but that’s simply not true.
Ultimately, drawing individual conclusions from these broad strokes is incredibly difficult, and people spend decades learning how to do it. The ecological fallacy is just one issue in our interpretation of coronavirus evidence, but it does crop up often.
Next time you read a headline, before you upend your life, have a think about whether it actually has much meaning to you.
It might just be another example of the ecological fallacy.