A Fishy Story

Clearing out some old posts.

A while ago, I encountered a story on Amy Alkon’s site about a man fooled into fathering a child:

Here’s how it happened, according to Houston Press. Joe Pressil began dating his girlfriend, Anetria, in 2005. They broke up in 2007 and, three months later, she told him she was pregnant with his child. Pressil was confused, since the couple had used birth control, but a paternity test proved that he was indeed the father. So Pressil let Anetria and the boys stay at his home and he agreed to pay child support.
Fast forward to February of this year, when 36-year-old Pressil found a receipt – from a Houston sperm bank called Omni-Med Laboratories – for “cryopreservation of a sperm sample” (Pressil was listed as the patient although he had never been there). He called Omni-Med, which passed him along to its affiliated clinic Advanced Fertility. The clinic told Pressil that his “wife” had come into the clinic with his semen and they performed IVF with it, which is how Anetria got pregnant.

The big question, of course, is how exactly did Anetria obtain Pressil’s sperm without him knowing about it? Simple. She apparently saved their used condoms. Gag. (Anetria denies these claims.)

“I couldn’t believe it could be done. I was very, very devastated. I couldn’t believe that this fertility clinic could actually do this without my consent, or without my even being there,” Pressil said, adding that artificial insemination is against his religious beliefs. “That’s a violation of myself, to what I believe in, to my religion, and just to my manhood,” Pressil said.

I’ve now seen this story show up on a couple of other sites. The only links in Google are for the original claim and her denial. I can’t find out how it was resolved. But I suspect his claim was dismissed. The reason I suspect this is because his story is total bullshit.

Here’s a conversation that has never happened:

Patient: “Hi, I have this condom full of sperm. God knows how I got it or who it belongs to. Can you harvest my eggs and inject this into them?”

Doctor: “No problem!”

I’ve been through IVF (Ben was conceived naturally after two failed cycles). It is a very involved process. We had to have interviews, then get tests for venereal diseases and genetic conditions. I then had to show up and make my donation either on site or in a nearby hotel. And no, I was not allowed to bring in a condom. Condoms contain spermicides and lubricants that murder sperm, and latex is not sperm’s friend. Even in a sterile container, sperm cells don’t last very long unless they are placed in a special refrigerator. Freezing sperm is a slow process that takes place in a solution that keeps the cells from shattering from ice crystal formation.

And that’s only the technical side of the story. There’s also the legal issue: no clinic is going to expose itself to a potential multi-million dollar lawsuit by using the sperm of a man it doesn’t have a consent form from.

So, no, you can’t just have a man fill a condom, throw it in your freezer and get it injected into your eggs. It doesn’t work that way. This is why I believe the woman’s lawyer, who claims Pressil agreed to IVF and signed consent forms.

I’ve seen the frozen sperm canard come up on TV shows and movies from time to time. It annoys me. This is something conjured up by people who haven’t done their research.

Mathematical Malpractice Watch: Non-Citizen Voters

Hmmm:

How many non-citizens participate in U.S. elections? More than 14 percent of non-citizens in both the 2008 and 2010 samples indicated that they were registered to vote. Furthermore, some of these non-citizens voted. Our best guess, based upon extrapolations from the portion of the sample with a verified vote, is that 6.4 percent of non-citizens voted in 2008 and 2.2 percent of non-citizens voted in 2010.

The authors go on to speculate that non-citizen voting could have been common enough to swing Al Franken’s 2008 election and possibly even North Carolina for Obama in 2008. Non-citizens vote overwhelmingly Democrat.

I do think there is a point here which is that non-citizens may be voting in our elections, which they are not supposed to do. Interestingly, photo ID — the current policy favored by Republicans — would do little to address this as most of the illegal voters had ID. The real solution … to all our voting problems … would be to create a national voter registration database that states could easily consult to verify someone’s identity, citizenship, residence and eligibility status. But this would be expensive, might not work and would very likely require a national ID card, which many people vehemently oppose.

However …

The sample is very small: 21 non-citizens voting in 2008 and 8 in 2010. This is intriguing but hardly indicative. It could be a minor statistical blip. And there have been critiques that have pointed out that this is based on a … wait for it … web survey. So the results are highly suspect. It’s likely that a fair number of these non-citizen voters are, in fact, non-correctly-filling-out-a-web-survey voters.
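A rough sketch of the small-number problem, assuming the counts behave like Poisson counts (a standard approximation, not something the study itself states). The function name is made up for illustration; the only inputs are the two counts quoted above.

```python
import math

def relative_poisson_error(count):
    """Fractional 1-sigma uncertainty on a Poisson count: sqrt(N) / N."""
    if count <= 0:
        raise ValueError("count must be positive")
    return math.sqrt(count) / count

# Counts quoted above: 21 non-citizen voters in 2008, 8 in 2010.
for year, count in [(2008, 21), (2010, 8)]:
    error = relative_poisson_error(count)
    print(f"{year}: {count} voters, +/- {error:.0%} from sampling noise alone")
```

That works out to roughly 22% and 35%. It covers only sampling noise; it says nothing about respondents who ticked the wrong box on a web form, which is a separate source of error.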

To their credit, the authors acknowledge this and say that while it is possible non-citizens swung the Franken election (only 0.65% would have had to vote), speculating on other races is … well, speculation.

So far, so good.

The problem is how the blogosphere is reacting to it. Conservative sites are naturally jumping on this while liberals are talking about the small number statistics. But those liberal sites are happy to tout small numbers when it’s, say, a supposed rise in mass shootings.

In general, I lean toward the conservatives on this. While I don’t think voter fraud is occurring on the massive scale they presume, I do think it’s more common than the single-digit or double-digit numbers liberals like to hawk. Those numbers are themselves based on small studies in environments where voter ID is not required. We know how many people have been caught. But assuming that represents the limit of the problem is like assuming the number of speeders on a highway is equal to the number of tickets that are given out. One of the oft-cited studies is from the President’s Commission on Election Administration, which was mostly concerned with expanding access, not tracking down fraud.

Here’s the thing. While I’m convinced the number of fraudulent votes is low, I note that, every time we discuss this, that number goes up. It used to be a handful. Now it’s a few dozen. This study hints it could be hundreds, possibly thousands. There are 11 million non-citizens living in this country (including my wife). What these researchers are indicating is that, nationally, their study could mean many thousands of extra votes for Democrats. Again, their study is very small and likely subject to significant error (as all web surveys are). It’s also likely the errors bias high. But even if they have overestimated the non-citizen voting by a factor of a hundred, that still means a few thousand incidents of voter fraud. That’s getting to the point where this may be a concern, no?
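The arithmetic behind that "factor of a hundred" remark, as a minimal sketch. It uses only the figures already quoted in this post (11 million non-citizens, 6.4% and 2.2% voting rates); the function and constant names are illustrative.

```python
def implied_votes(population, voting_rate, overestimate_factor=1.0):
    """Votes implied by a rate, optionally discounted for suspected overestimation."""
    return population * voting_rate / overestimate_factor

NON_CITIZENS = 11_000_000   # population figure used in this post
RATE_2008 = 0.064           # study's best guess for 2008
RATE_2010 = 0.022           # study's best guess for 2010

for label, rate in [("2008", RATE_2008), ("2010", RATE_2010)]:
    face_value = implied_votes(NON_CITIZENS, rate)
    discounted = implied_votes(NON_CITIZENS, rate, overestimate_factor=100)
    print(f"{label}: {face_value:,.0f} at face value, {discounted:,.0f} if overstated 100x")
```

At face value the 2008 rate implies about 704,000 votes; knocked down a hundredfold it is still about 7,000.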

Do I think this justifies policy change? I don’t think a web-survey of a few hundred people justifies anything. I do think this indicates the issue should be studied properly and not just dismissed out of hand because only a few dozen fake voters have actually been caught.

The Latest Plagiarism Kerfuffles

Over the last few years, a number of reporters and writers have turned out to be serial plagiarists. Oh, they don’t admit this. They’ll say they forgot to put in quotation marks or that rules don’t apply to them. But if you and I did that, we’d be kicked out of school. Or maybe not.

The most recent accusation is against CJ Werleman. His excuse has crumbled now that researchers have dug up over a dozen liftings of text from other people. But I want to focus on his excuse because it is illustrative:

The Harris zombies now accuse me of plagiarism. From 5 books & 100+ op-eds, they cite 2 common cliches and two summaries of cited studies

This sounds reasonable. After all, there are only so many ways you can state the same facts. And when you look at the quotes, if you were of a generous disposition, you might accept this response. The quotes aren’t completely verbatim. Maybe he did just happen to phrase things the same way other writers did.

But I find this excuse unlikely. In a great post on plagiarism, McArdle writes the following:

A while back, Terry Teachout, the Wall Street Journal’s drama critic, pointed out something fascinating to me: If you type even a small fragment of your own work into Google, as few as seven words, with quotation marks around the fragment to force Google to only search on those words in that order, then you are likely to find that you are the only person on the Internet who has ever produced that exact combination of words. Obviously this doesn’t work with boilerplate like “GE rose four and a quarter points on stronger earnings”, or “I love dogs,” but in general, it’s surprisingly true.

I’ve tested this and it is true. With a lot of my posts, if I type in a non-generic line, the only site that comes up is mine. In fact, verbatim Google searches are a good way to find content scrapers and plagiarists.

Whenever I cite anyone on the internet, I will link them, usually quote them and then, if necessary, summarize or rephrase the other points they are making. I try hard to avoid simply rewriting what they said like I’m a fifth grader turning in a book report. So when you hear someone using the excuse that similar ideas require similar phrasing, it’s largely baloney. If two passages of text are nearly identical, it’s very likely that one was copied from the other.
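The seven-word test is easy to mimic locally. Here is a minimal sketch that lists every run of seven consecutive words two texts have in common. The function names are made up, the "suspect" sentence is invented for illustration, and real plagiarism checkers are far more sophisticated than this.

```python
import re

def word_ngrams(text, n=7):
    """Set of all runs of n consecutive words, lowercased, punctuation ignored."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_phrases(text_a, text_b, n=7):
    """Sorted list of n-word runs that appear in both texts."""
    return sorted(word_ngrams(text_a, n) & word_ngrams(text_b, n))

original = "Words are the currency of writers and, for many, how they make their living."
# Invented example of a lightly reworked copy:
suspect = "As everyone knows, words are the currency of writers and for many it pays the bills."
unrelated = "The telescope was pointed at a variable star for most of the night."

print(shared_phrases(original, suspect))    # overlapping seven-word runs
print(shared_phrases(original, unrelated))  # nothing in common
```

Even the reworded copy shares several seven-word runs with the original, while an unrelated sentence shares none.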

I’ve become more sensitive to plagiarism since I’ve become a victim over the last ten years. I’ve had content scraped, I’ve had my ideas presented as though they were someone else’s and I’ve had outright word-for-word copying (on a now defunct story site). It’s difficult to describe just how dirty being plagiarized makes you feel. I even shied away … at first … from making accusations because I was so embarrassed. Here’s what I wrote the first time it happened:

Plagiarism is not just stealing someone’s words. It is stealing their mind. It is a cruel violation. The hard work and original thought of one person is stolen by a second. The people who have lost their careers because of plagiarism have deserved everything they’ve gotten and I am now determined, more than ever, to make sure I quote people properly and always give credit where it’s due.

Plagiarists need to be called out. Words are the currency of writers and, for many, how they make their living. Plagiarizing someone is no different than stealing their car or cleaning out their bank account. In fact, I would argue that it’s a lot worse.

Mother Jones Revisited

A couple of years ago, Mother Jones did a study of mass shootings which attempted to characterize these awful events. Some of their conclusions were robust — such as the finding that most mass shooters acquire their guns legally. However, their big finding — that mass shootings are on the rise — was highly suspect.

Recently, they doubled down on this, proclaiming that Harvard researchers have confirmed their analysis [1]. The researchers use an interval analysis to look at the time differences between mass shootings and claim that the recent run of short intervals proves that mass shootings have tripled since 2011 [2].

Fundamentally, there’s nothing wrong with the article. But practically, there is: they have applied a sophisticated technique to suspect data. This technique does not remove the problems of the original dataset. If anything, it exacerbates them.

As I noted before, the principal problem with Mother Jones’ claim that mass shootings were increasing was the database. It had a small number of incidents and was built from media reports rather than by taking a complete data set and paring it down to a consistent sample. Incidents were left out or included based on arbitrary criteria. As a result, there may be mass shootings missing from the data, especially in the pre-internet era. This would bias the results.

And that’s why the interval analysis is problematic. Interval analysis itself is useful. I’ve used it myself on variable stars. But there is one fundamental requirement: you have to have consistent data and you have to account for potential gaps in the data.

Let’s say, for example, that I use interval analysis on my car-manufacturing company to see if we’re slowing down in our production of cars. That’s a good way of figuring out any problems. But I have to account for the days when the plant is closed and no cars are being made. Another example: let’s say I’m measuring the intervals between brightness peaks of a variable star. It will work well … if I account for those times when the telescope isn’t pointed at the star.

Their interval analysis assumes that the data are complete. But I find that suspect given the way the data were collected and the huge gaps and massive dispersion of the early intervals. The early data are all over the place, with gaps as long as 500-800 days. Are we to believe that between 1984 and 1987, a time when violent crime was surging, there was only one mass shooting? The more recent data are far more consistent with no gap greater than 200 days (and note how the data get really consistent when Mother Jones began tracking these events as they happened, rather than relying on archived media reports).
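To make the completeness problem concrete, here is a toy sketch. Everything in it is hypothetical: events occur at a perfectly constant rate (one every 60 days), but before an arbitrary cutover day only every third event makes it into the record. None of these numbers come from the Mother Jones data.

```python
from statistics import mean

STEP = 60        # hypothetical: one event every 60 days, the rate never changes
CUTOVER = 1200   # hypothetical day when record-keeping becomes complete

true_event_days = list(range(0, 2400, STEP))

def is_recorded(day):
    """Hypothetical coverage: before CUTOVER only every third event is recorded."""
    if day < CUTOVER:
        return (day // STEP) % 3 == 0
    return True

recorded_days = [day for day in true_event_days if is_recorded(day)]

# (start_day, gap) for each pair of consecutive recorded events
gaps = [(a, b - a) for a, b in zip(recorded_days, recorded_days[1:])]

early_gaps = [gap for start, gap in gaps if start + gap < CUTOVER]
late_gaps = [gap for start, gap in gaps if start >= CUTOVER]

apparent_speedup = mean(early_gaps) / mean(late_gaps)
print(f"early mean gap: {mean(early_gaps)} days, late mean gap: {mean(late_gaps)} days")
print(f"apparent increase in rate: {apparent_speedup:.1f}x (true increase: none)")
```

The recorded intervals shrink from 180 days to 60, which looks like a tripling even though the underlying rate never moved. An interval analysis sees that change but has no way of telling it was caused by missing records.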

Note that they also compare this to the average of 172 days. This is the basis of their claim that the rate of mass shootings has “tripled”. But the distribution of gaps is very skewed with a long tail of long intervals. The median gap is 94 days. Using the median would reduce their slew of 14 straight below-average points to 11 below-median points. It would also mean that mass shootings have increased by only 50%. Since 1999, the median is 60 days (and the average 130). Using that would reduce their slew of 14 straight short intervals to four and mean that mass shootings have been basically flat.
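A small sketch of why the choice of baseline matters for a skewed distribution. The gap values below are invented for illustration and are not the Mother Jones intervals; the helper name is also made up. It counts how many of the most recent intervals in a row fall below a given baseline.

```python
from statistics import mean, median

# Hypothetical gaps (days) in chronological order, NOT the Mother Jones data.
gaps = [620, 800, 45, 510, 130, 300, 210, 95, 160, 80, 120, 60, 110, 40, 70, 30]

def trailing_run_below(values, baseline):
    """Number of consecutive values at the end of the series that fall below baseline."""
    run = 0
    for value in reversed(values):
        if value >= baseline:
            break
        run += 1
    return run

mean_gap = mean(gaps)
median_gap = median(gaps)

print(f"mean gap: {mean_gap} days, median gap: {median_gap} days")
print(f"recent run below the mean:   {trailing_run_below(gaps, mean_gap)}")
print(f"recent run below the median: {trailing_run_below(gaps, median_gap)}")
```

A handful of very long early gaps drags the mean far above the median, so the same recent stretch looks like a much longer "streak" when measured against the mean.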

The analysis I did two years ago was very simplistic — I looked at victims per year. That approach has its flaws but it has one big strength — it is less likely to be fooled by gaps in the data. Huge awful shootings dominate the number of victims and those are unlikely to have been missed in Mother Jones’ sample.

Here is what you should do if you want to do this study properly. Start with a uniform database of shootings such as those provided by law enforcement agencies. Then go through the incidents, one by one, to see which ones meet your criteria.

In Jesse Walker’s response to Mother Jones, in which he graciously quotes me at length, he notes that a study like this has been done:

The best alternative measurement that I’m aware of comes from Grant Duwe, a criminologist at the Minnesota Department of Corrections. His definition of mass public shootings does not make the various one-time exceptions and other jerry-riggings that Siegel criticizes in the Mother Jones list; he simply keeps track of mass shootings that took place in public and were not a byproduct of some other crime, such as a robbery. And rather than beginning with a search of news accounts, with all the gaps and distortions that entails, he starts with the FBI’s Supplementary Homicide Reports to find out when and where mass killings happened, then looks for news reports to fill in the details. According to Duwe, the annual number of mass public shootings declined from 1999 to 2011, spiked in 2012, then regressed to the mean.

(Walker’s article is one of those “you really should read the whole thing” things.)

This doesn’t really change anything I said two years ago. In 2012, we had an awful spate of mass shootings. But you can’t draw the kind of conclusions Mother Jones wants to from rare and awful incidents. And it really doesn’t matter what analysis technique you use.


1. That these researchers are from Harvard is apparently a big deal to Mother Jones. As one of my colleagues used to say, “Well, if Harvard says it, it must be true.”

2. This is less alarming than it sounds. Even if we take their analysis at face value, we’re talking about six incidents a year instead of two for a total of about 30 extra deaths or about 0.2% of this country’s murder victims or about the same number of people that are crushed to death by their furniture. We’re also talking about two years of data and a dozen total incidents.