A few weeks ago, Mother Jones did a timeline of mass shootings in response to the spate of summer shootings. They defined their criteria, listed 61 incidents and pointed out, correctly, that most of them were committed with legal firearms.
The highlight is a map of mass shootings over the last thirty years. The map has some resemblance to Radley Balko’s famous map of botched law enforcement raids. But the use of a map and dots is where the resemblance ends. Balko was very clear that his list of incidents was not, in any way, definitive. And he did not try to parse his incomplete data to draw sketchy conclusions.
Mother Jones felt under no such compulsion.
This week, they’ve published an “analysis” of their data and drawn the conclusion that our society has more guns than ever and, perhaps related, more mass shootings. Below, I’ll detail why I think their “analysis” — and yes, I will keep using quotation marks for this — is useless, uninformative and flat-out wrong.
Before I look at the data in detail, I’ll just go through why this analysis should be dismissed out of hand: raw numbers. It is a truism of science that the more narrowly you define your sample and the more you shrink the number of data points, the less reliable your conclusions will be. If you were to analyze all gun shootings and violence over the last thirty years, you’d have hundreds of thousands of data points to base your conclusions on. You could, as I like to say, achieve Victory Through Sheer Data Volume. But when you start parsing the data down further and further, you become more prone to random variation and even bias.
Even if we take Mother Jones’ data at face value, we can see we’re dealing with fewer than 120 victims every year and frequently fewer than 20. That’s an awfully small number to be drawing conclusions from. To illustrate why, take the Virginia Tech killings. Fifty-six people were killed or wounded. That is more than all but five entire years in their database. Something like that is simply going to swamp the statistics.
But we shouldn’t even take Mother Jones’ data at face value because it is highly suspect. First, it seems to be based on media coverage, which is not exactly an objective source and almost certainly leaves shootings out (Balko, by contrast, acknowledged this bias in his botched raid map). Throughout, they make arbitrary cuts that exclude murders that may not fit their conclusions. They limit the sample to lone shooters, but make exceptions for Columbine and Westside. They exclude gang activity and other crimes but include the Fort Hood shootings, which were an act of terrorism. They require the killings be in public, thus excluding men who murder their families. They require at least four deaths, therefore excluding killings that may have been shortened by intervention. They arbitrarily throw in a few spree killings.
This is simply not a representative sample. It’s cherry-picked to fit a definition, but leaves huge gaping biases all over the place. Mother Jones doesn’t even acknowledge this.
All this would be fine if you wanted to create an illustrative or representative sample. This is even fine if you want to draw some broad and overwhelming conclusions such as that most spree killers get their guns legally. But the low numbers and the biases blow up in your face when you try to do a more rigorous analysis.
And that’s precisely what happens with the “study”. They’ve narrowed the sample so far down that they are essentially looking at noise. They’ve narrowed it down so far that their criteria can critically affect their conclusions, that one incident can throw off the numbers very quickly. For example, they included the Fort Hood shooting in their sample. But that was committed by a military man. There’s no gun control provision on Earth that would deprive the military of guns. If you drop that, then 2009 suddenly becomes one of the least violent years in the sample.
And that’s a perfect illustration of the problem. One of the ways you test data samples is to take random subsamples and see if you get the same conclusion. But that’s practically impossible with only 61 data points and frequently zero for any particular year.
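To make the subsample test concrete, here is a minimal sketch of the simplest version of it: refit the trend with each data point dropped in turn and see how far the slope moves. The numbers below are made up for illustration (a flat series with one large spike); they are not the Mother Jones data, and the function names are my own.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def leave_one_out_slopes(xs, ys):
    """Refit the trend once per point, with that point removed."""
    return [
        slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        for i in range(len(xs))
    ]


# Hypothetical series: 10 victims every year, plus one big incident in year 8.
years = list(range(10))
victims = [10] * 10
victims[8] = 70

full_slope = slope(years, victims)
loo_slopes = leave_one_out_slopes(years, victims)

print(f"slope with all points:    {full_slope:.2f} victims per year")
print(f"slope without the spike:  {loo_slopes[8]:.2f} victims per year")
print(f"range across subsamples:  {min(loo_slopes):.2f} to {max(loo_slopes):.2f}")
```

In this toy series the apparent upward trend comes entirely from one year. With a robust sample, dropping any single point would barely change the answer.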
When you do start taking random subsamples, you immediately find that their analysis is dominated by a handful of data points: the 1984 McDonald’s shooting, the Luby’s Massacre, Columbine, Virginia Tech, Fort Hood and Aurora. Their “discovery” is that three of those six happened in the last five years. That’s … not something you can draw a conclusion from.
The thing is, even their tiny, biased sample does not support their conclusions. I plugged their data into a fitting program and found that, even with their highly biased and useless sample, the rate of injury from mass shooting is increasing at a rate of about 1.2 victims per year over the entire 30 years. Dropping Aurora reduces that to 0.7 injuries or deaths per year (I analyzed combined deaths and injuries so as not to confound the effect of improved trauma care). And the scatter in the fit is 27 victims. If you assume that there is no increase in mass shootings, you only increase the scatter to 29 victims. That’s … not something you can draw a conclusion from.
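For anyone who wants to try this sort of check themselves, here is a rough sketch of the comparison: fit a straight line, measure the scatter around it, then measure the scatter around a flat average. I am assuming "scatter" means the root-mean-square of the residuals, which may not match every fitting program's definition, and the yearly counts below are invented placeholders rather than the actual dataset.

```python
import math


def linear_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def rms(values):
    """Root-mean-square of a list of residuals."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def compare_scatter(xs, ys):
    """Return (slope, scatter about the trend line, scatter about the flat mean)."""
    a, b = linear_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    trend_scatter = rms([y - (a + b * x) for x, y in zip(xs, ys)])
    flat_scatter = rms([y - mean_y for y in ys])
    return b, trend_scatter, flat_scatter


# Placeholder yearly victim counts, purely illustrative.
years = list(range(12))
victims = [5, 40, 0, 12, 3, 60, 8, 0, 25, 7, 90, 4]

b, trend_scatter, flat_scatter = compare_scatter(years, victims)
print(f"fitted slope: {b:.2f} victims per year")
print(f"scatter about trend: {trend_scatter:.1f}")
print(f"scatter about flat mean: {flat_scatter:.1f}")
```

The trend line always fits at least as well as the flat average, since it has an extra free parameter. The question is whether it fits much better. If the two scatters are nearly the same, the slope is not telling you anything.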
Even if you take the trend as statistically significant — which it isn’t — it would mean that, if this trend continues uninterrupted for an entire century, we will have 70-120 extra gun victims per year. That’s 2-4 times the 30 per year average of the last thirty years. That would be horrible but … it’s still a rounding error in our crime statistics and … not something you can draw a conclusion from.
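The arithmetic behind that extrapolation is simple enough to check in a few lines. This sketch just multiplies the two fitted slopes quoted above (0.7 and 1.2 victims per year) by a hundred years and compares the result to the 30 per year average; the function name is my own.

```python
def extrapolate(slope_per_year, years, baseline_per_year):
    """Return (extra victims per year after `years`, ratio to the baseline)."""
    extra = slope_per_year * years
    return extra, extra / baseline_per_year


# Slopes quoted in the text: 0.7 without Aurora, 1.2 with it.
for fitted_slope in (0.7, 1.2):
    extra, ratio = extrapolate(fitted_slope, years=100, baseline_per_year=30)
    print(f"slope {fitted_slope}: {extra:.0f} extra per year, {ratio:.1f}x the baseline")
```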
(Aside: Mother Jones says that weapons in the hands of civilians rarely stop these mass shootings. This is partly, however, a result of the way they look at the data. They only include mass shootings that had four or more fatalities. By definition, if a shooting was stopped early, it is excluded from their database. So things like the Pearl High School shooting are excluded because only two people were killed thanks to the intervention of an armed principal. This also ignores that mass shootings typically take place in locations, such as schools, where weapons are less likely to be present. To cite one concrete example of how this skews results: Suzanne Hupp, whose parents were murdered in the Luby’s shooting, had a gun but had left it in her car because Texas law forbade people from bringing guns into restaurants (her interview in Penn and Teller’s show on Gun Control is a must-watch). MJ also cites citizens who were killed trying to stop a gunman. I’m not sure why that means anything since they might have been shot anyway. Even more bizarrely, they cite the recent Empire State Building incident, in which police officers wounded nine bystanders while shooting a gunman a couple of yards away, as an illustration of why civilians cannot defend themselves from mass shooters.)
One of the things we’ve learned here at Mathematical Malpractice Watch is that these incidents are very rarely a result of accident or ignorance. They are usually a result of someone trying to massage the data to reach a conclusion that it cannot support when analyzed objectively. The simple fact is that crime and gun violence are down, way down. I’m not completely sold on the “more guns, less crime” hypothesis but it is very difficult to argue, when you take all the data, that our gun laws are creating massacres. When you look at all the data, you have a thousand times the raw number of deaths and injuries Mother Jones is analyzing.
That is something you can draw conclusions from.
But Mother Jones massaged the data until they got the conclusion they wanted. That’s not just standard political bullshit. That’s an organized and deliberate deception.
Update: Two notes sent to me by readers. First, Nidal Hasan had a civilian weapon from a gun store. I don’t see that this affects things much. Second, CJ Carmella tipped me to this shooting. It’s not clear that the shooting was stopped by gun ownership; he may have killed everyone he intended. But it is clear that this might have muddied MJ’s study had they included it.