Lies, damn lies and statistics

The world seems to be full of daft theories based on poor statistical analysis at the moment. In a chaotic world, humans have evolved to recognise patterns. Patterns and order rarely appear by chance, and we’ve become really good at spotting order amongst disorder and deducing correlations that help us make sense of the world around us.


Most of the world is “confined to home” due to COVID-19 at the moment and regular news sources are proving to be unreliable. They are increasingly driven by their own agendas, and clickbait and sensationalist headlines compete for the attention of an increasingly sceptical audience. As a result, people are looking further afield for information. They are spending more time than normal browsing Facebook and other social media and, in this environment, disinformation spreads.

Add to this the fact that different groups are playing the blame game, each keen to shift the blame elsewhere. The WHO hasn’t come out of this well and faces allegations that it may have colluded with China to downplay the severity of COVID-19 in January. Brexiteers are keen to highlight the EU’s inability to force nation states to work together in a crisis. Remainers are keen to highlight any failing of the government and use it to delay Brexit. The US is blaming China and in the meantime China is trying to draw the world’s attention away from its wet markets, biowarfare research and death figures.

Every faction is using every statistic that they can to “prove” that they are correct. In doing so, they are demonstrating confirmation bias by latching onto every theory or piece of information that appears to support their theory and discounting contradictory information.

[Figure: Venn diagram showing confirmation bias. We undervalue what the facts say and overvalue anything that confirms our beliefs.]

All of the above is causing ridiculous theories to circulate unchecked online. All that is needed is an audience that “wants/needs” to believe, a medium to spread information and the predisposition that we all have to seek patterns.

The other day I was perusing Twitter (no doubt seeking to confirm my own biases) when I came across this interesting thread:

The UK population is about 66.5 million
The number of people who work in the NHS is about 1.5 million.
Therefore the number of people who don’t is about 65 million.
The number of people who have died with COVID-19 is currently 10,612. 37 of those people worked in the NHS.

Therefore 10,575 people who do not work in the NHS have died. 10,575 of the 65 million who do not work for the NHS = 0.02%
37 of the 1.5 million who do work in the NHS = 0.002%
So what’s the bogus conclusion we can draw here?


That’s right. If you fail to consider any variables or wider context, it looks to be that not working for the NHS is 10x more dangerous than working for the NHS. Of course, this is complete nonsense. Fact is the demographics most likely to die are unlikely to work for the NHS.

RockboltG (via Twitter)
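
For anyone who wants to check the sums in that thread, here is a minimal sketch in Python. It uses only the figures quoted above; the variable names are my own. Worked through without rounding, the gap is closer to 6.6x than 10x (the 10x only appears once both percentages are rounded to one significant figure), and either way these are crude rates that say nothing about age or underlying health.

```python
# Figures as quoted in the thread above; variable names are illustrative.
uk_population = 66_500_000
nhs_staff = 1_500_000
total_deaths = 10_612
nhs_deaths = 37

# Split the population and the deaths into NHS and non-NHS groups.
non_nhs_population = uk_population - nhs_staff
non_nhs_deaths = total_deaths - nhs_deaths

# Crude death rates: deaths divided by group size, no adjustment for
# age, health or anything else.
non_nhs_rate = non_nhs_deaths / non_nhs_population
nhs_rate = nhs_deaths / nhs_staff
ratio = non_nhs_rate / nhs_rate

print(f"Non-NHS crude death rate: {non_nhs_rate:.4%}")
print(f"NHS crude death rate:     {nhs_rate:.4%}")
print(f"Ratio (non-NHS / NHS):    {ratio:.1f}x")
```

The calculation is arithmetically fine; the bogus part is treating the ratio as a measure of risk when the two groups differ in far more than their employer.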

It’s a humorous look at how statistics can be warped to suit whichever position you want to adopt in an argument. Similar claims are made when comparing mortality rates in dissimilar countries. Variations in demographics, urban vs rural population, relative age of population and population density all have an effect.

For those of you who are desperate to leap to some “interesting” conclusions drawn from statistics, I’d like to recommend Tyler Vigen’s “spurious correlations” website.

For example, who’d have guessed that per capita cheese consumption in America appears to correlate with the number of people who die annually through “bed sheet entanglement”?
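
To see how easily two unrelated numbers can appear linked, here is a minimal sketch in Python. The two series are made up for illustration (they are not real cheese or bed sheet figures and are not taken from Tyler Vigen’s site), and the names `pearson`, `series_a` and `series_b` are my own. All the two series have in common is that both drift upwards over the same ten years.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Synthetic data: two independent upward trends plus a little noise.
# Neither series is computed from the other.
rng = random.Random(0)  # fixed seed so the run is repeatable
years = range(10)
series_a = [30.0 + 0.5 * t + rng.uniform(-0.2, 0.2) for t in years]
series_b = [300.0 + 40.0 * t + rng.uniform(-15.0, 15.0) for t in years]

r = pearson(series_a, series_b)
print(f"Correlation between two unrelated series: {r:.2f}")
```

The coefficient comes out close to 1 even though neither series has anything to do with the other; the shared upward trend does all the work.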

In the meantime, enjoy your spurious correlations and conspiracy theories and remember: keep fact-checking.