Archive for the ‘Science and Edumacation’ Category

A Fishy Story

Thursday, October 30th, 2014

Clearing out some old posts.

A while ago, I encountered a story on Amy Alkon’s site about a man fooled into fathering a child:

Here’s how it happened, according to Houston Press. Joe Pressil began dating his girlfriend, Anetria, in 2005. They broke up in 2007 and, three months later, she told him she was pregnant with his child. Pressil was confused, since the couple had used birth control, but a paternity test proved that he was indeed the father. So Pressil let Anetria and the boys stay at his home and he agreed to pay child support.
Fast forward to February of this year, when 36-year-old Pressil found a receipt – from a Houston sperm bank called Omni-Med Laboratories – for “cryopreservation of a sperm sample” (Pressil was listed as the patient although he had never been there). He called Omni-Med, which passed him along to its affiliated clinic Advanced Fertility. The clinic told Pressil that his “wife” had come into the clinic with his semen and they performed IVF with it, which is how Anetria got pregnant.

The big question, of course, is how exactly did Anetria obtain Pressil’s sperm without him knowing about it? Simple. She apparently saved their used condoms. Gag. (Anetria denies these claims.)

“I couldn’t believe it could be done. I was very, very devastated. I couldn’t believe that this fertility clinic could actually do this without my consent, or without my even being there,” Pressil said, adding that artificial insemination is against his religious beliefs. “That’s a violation of myself, to what I believe in, to my religion, and just to my manhood,” Pressil said.

I’ve now seen this story show up on a couple of other sites. The only links in Google are for the original claim and her denial. I can’t find out how it was resolved. But I suspect his claim was dismissed. The reason I suspect this is because his story is total bullshit.

Here’s a conversation that has never happened:

Patient: “Hi, I have this condom full of sperm. God knows how I got it or who it belongs to. Can you harvest my eggs and inject this into them?”

Doctor: “No problem!”

I’ve been through IVF (Ben was conceived naturally after two failed cycles). It is a very involved process. We had to have interviews, then get tests for venereal diseases and genetic conditions. I then had to show up and make my donation either on site or in a nearby hotel. And no, I was not allowed to bring in a condom. Condoms contain spermicides and lubricants that murder sperm, and latex is not sperm’s friend. Even in a sterile container, sperm cells don’t last very long unless they are placed in a special refrigerator. Freezing sperm is a slow process that takes place in a solution that keeps the cells from shattering from ice crystal formation.

And that’s only the technical side of the story. There’s also the legal issue that no clinic is going to expose themselves to a potential multi-million dollar lawsuit by using the sperm of a man they don’t have a consent form from.

So, no, you can’t just have a man fill a condom, throw it in your freezer and get it injected into your eggs. It doesn’t work that way. This is why I believe the woman’s lawyer, who claims Pressil agreed to IVF and signed consent forms.

I’ve seen the frozen sperm canard come up on TV shows and movies from time to time. It annoys me. This is something conjured up by people who haven’t done their research.

Mother Jones Revisited

Saturday, October 18th, 2014

A couple of years ago, Mother Jones did a study of mass shootings which attempted to characterize these awful events. Some of their conclusions were robust — such as the finding that most mass shooters acquire their guns legally. However, their big finding — that mass shootings are on the rise — was highly suspect.

Recently, they doubled down on this, proclaiming that Harvard researchers have confirmed their analysis1. The researchers use an interval analysis to look at the time differences between mass shootings and claim that the recent run of short intervals proves that the mass shootings have tripled since 2011.2

Fundamentally, there’s nothing wrong with the article. But practically, there is: they have applied a sophisticated technique to suspect data. This technique does not remove the problems of the original dataset. If anything, it exacerbates them.

As I noted before, the principal problem with Mother Jones’ claim that mass shootings were increasing was the database. It had a small number of incidents and was based on media reports, not on a complete data set pared down to a consistent sample. Incidents were left out or included based on arbitrary criteria. As a result, there may be mass shootings missing from the data, especially in the pre-internet era. This would bias the results.

And that’s why the interval analysis is problematic. Interval analysis itself is useful. I’ve used it myself on variable stars. But there is one fundamental requirement: you have to have consistent data and you have to account for potential gaps in the data.

Let’s say, for example, that I use interval analysis on my car-manufacturing company to see if we’re slowing down in our production of cars. That’s a good way of figuring out any problems. But I have to account for the days when the plant is closed and no cars are being made. Another example: let’s say I’m measuring the intervals between brightness peaks of a variable star. It will work well … if I account for those times when the telescope isn’t pointed at the star.
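
To make the gap problem concrete, here's a minimal sketch (my own construction with synthetic dates, not the Mother Jones data) of what patchy early coverage does to measured intervals:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "true" record: one event roughly every 60 days for a few decades.
true_dates = np.cumsum(rng.exponential(scale=60, size=180))

# Pretend the early record is spotty: only half of the first 90 events were ever
# written down, while every recent event is captured (as happens when you start
# tracking events in real time instead of digging through old media archives).
early, late = true_dates[:90], true_dates[90:]
recorded = np.sort(np.concatenate([rng.choice(early, size=45, replace=False), late]))

print("true mean interval:                  %.0f days" % np.diff(true_dates).mean())
print("observed mean interval, early half:  %.0f days" % np.diff(recorded[:45]).mean())
print("observed mean interval, recent half: %.0f days" % np.diff(recorded[45:]).mean())
# The underlying rate never changed, but the patchy early record makes the early
# intervals look long and the recent ones look short: a spurious "acceleration".
```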

Their interval analysis assumes that the data are complete. But I find that suspect given the way the data were collected and the huge gaps and massive dispersion of the early intervals. The early data are all over the place, with gaps as long as 500-800 days. Are we to believe that between 1984 and 1987, a time when violent crime was surging, there was only one mass shooting? The more recent data are far more consistent with no gap greater than 200 days (and note how the data get really consistent when Mother Jones began tracking these events as they happened, rather than relying on archived media reports).

Note that they also compare this to the average of 172 days. This is the basis of their claim that the rate of mass shootings has “tripled”. But the distribution of gaps is very skewed with a long tail of long intervals. The median gap is 94 days. Using the median would reduce their slew of 14 straight below-average points to 11 below-median points. It would also mean that mass shootings have increased by only 50%. Since 1999, the median is 60 days (and the average 130). Using that would reduce their slew of 14 straight short intervals to four and mean that mass shootings have been basically flat.
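
The mean-versus-median point is easy to see with a toy calculation (illustrative numbers, not the actual Mother Jones gaps):

```python
import numpy as np

# A skewed set of gaps between incidents (days): mostly short, a few very long.
gaps = np.array([20, 35, 40, 55, 60, 70, 80, 90, 100, 120, 150, 400, 600, 800])
mean_gap = gaps.mean()        # dragged upward by the long tail
median_gap = np.median(gaps)  # barely notices the tail

recent = np.array([30, 45, 50, 60, 70, 90, 110, 150])  # a recent run of gaps

print("mean gap: %.0f days, median gap: %.0f days" % (mean_gap, median_gap))
print("recent gaps below the mean:   %d of %d" % ((recent < mean_gap).sum(), recent.size))
print("recent gaps below the median: %d of %d" % ((recent < median_gap).sum(), recent.size))
# Against the mean, nearly every recent gap looks "short"; against the median, far
# fewer do. With a long-tailed distribution, "below average" is a very weak standard.
```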

The analysis I did two years ago was very simplistic — I looked at victims per year. That approach has its flaws but it has one big strength — it is less likely to be fooled by gaps in the data. Huge awful shootings dominate the number of victims and those are unlikely to have been missed in Mother Jones’ sample.

Here is what you should do if you want to do this study properly. Start with a uniform database of shootings such as those provided by law enforcement agencies. Then go through the incidents, one by one, to see which ones meet your criteria.

In Jesse Walker’s response to Mother Jones, in which he graciously quotes me at length, he notes that a study like this has been done:

The best alternative measurement that I’m aware of comes from Grant Duwe, a criminologist at the Minnesota Department of Corrections. His definition of mass public shootings does not make the various one-time exceptions and other jerry-riggings that Siegel criticizes in the Mother Jones list; he simply keeps track of mass shootings that took place in public and were not a byproduct of some other crime, such as a robbery. And rather than beginning with a search of news accounts, with all the gaps and distortions that entails, he starts with the FBI’s Supplementary Homicide Reports to find out when and where mass killings happened, then looks for news reports to fill in the details. According to Duwe, the annual number of mass public shootings declined from 1999 to 2011, spiked in 2012, then regressed to the mean.

(Walker’s article is one of those “you really should read the whole thing” things.)

This doesn’t really change anything I said two years ago. In 2012, we had an awful spate of mass shootings. But you can’t draw the kind of conclusions Mother Jones wants to from rare and awful incidents. And it really doesn’t matter what analysis technique you use.


1. That these researchers are from Harvard is apparently a big deal to Mother Jones. As one of my colleagues used to say, “Well, if Harvard says it, it must be true.”

2. This is less alarming than it sounds. Even if we take their analysis at face value, we’re talking about six incidents a year instead of two, for a total of about 30 extra deaths, or about 0.2% of this country’s murder victims, or about the same number of people who are crushed to death by their furniture. We’re also talking about two years of data and a dozen total incidents.

Now You See the Bias Inherent in the System

Thursday, September 11th, 2014

When I was a graduate student, one of the big fields of study was the temperature of the cosmic microwave background. The studies were converging on a value of 2.7 degrees with increasing precision. In fact, they were converging a little too well, according to one scientist I worked with.

If you measure something like the temperature of the cosmos, you will never get precisely the right answer. There is always some uncertainty (2.7, give or take a tenth of a degree) and some bias (2.9, give or take a tenth of a degree). So the results should span a range of values consistent with what we know about the limitations of the method and the technology. This scientist claimed that the range was too small. As he said, “You get the answer. And if it’s not the answer you wanted, you smack your grad student and tell him to do it right next time.”

It’s not that people were faking the data or tilting their analysis. It’s that knowing the answer in advance can cause subtle confirmation biases. Any scientific analysis is going to have a bias — an analytical or instrumentation effect that throws off the answer. A huge amount of work is invested in ferreting out and correcting for these biases. But there is a danger when a scientist thinks he knows the answer in advance. If they are off from the consensus, they might pore through their data looking for some effect that biased the results. But if they are close, they won’t look as carefully.
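
You can see how that kind of asymmetric scrutiny biases results without anyone faking anything. Here's a toy simulation, entirely my own construction and not anything from the CMB literature, in which the lab remeasures whenever the answer strays from expectation and stops as soon as it looks right:

```python
import numpy as np

rng = np.random.default_rng(0)

TRUE_VALUE = 2.80   # what nature actually does (unknown to the experimenters)
EXPECTED = 2.70     # the consensus value everyone "knows"
NOISE = 0.10        # genuine measurement scatter

published = []
for _ in range(1000):
    value = rng.normal(TRUE_VALUE, NOISE)
    # If the result lands far from the expected answer, the team goes hunting
    # for a "bug" and remeasures; if it lands close, it goes out the door.
    while abs(value - EXPECTED) > 0.10:
        value = rng.normal(TRUE_VALUE, NOISE)
    published.append(value)

print("true value:           %.2f" % TRUE_VALUE)
print("mean published value: %.2f" % np.mean(published))
print("scatter of published: %.3f" % np.std(published))
# The published values cluster near the expected answer with suspiciously small
# scatter: "converging a little too well" without anyone faking anything.
```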

Megan McArdle flags two separate instances of this in the social sciences. The first is the long-standing claim that conservatives are authoritarian while liberals are not:

Jonathan Haidt, one of my favorite social scientists, studies morality by presenting people with scenarios and asking whether what happened was wrong. Conservatives and liberals give strikingly different answers, with extreme liberals claiming to place virtually no value at all on things like group loyalty or sexual purity.

In the ultra-liberal enclave I grew up in, the liberals were at least as fiercely tribal as any small-town Republican, though to be sure, the targets were different. Many of them knew no more about the nuts and bolts of evolution and other hot-button issues than your average creationist; they believed it on authority. And when it threatened to conflict with some sacred value, such as their beliefs about gender differences, many found evolutionary principles as easy to ignore as those creationists did. It is clearly true that liberals profess a moral code that excludes concerns about loyalty, honor, purity and obedience — but over the millennia, man has professed many ideals that are mostly honored in the breach.

[Jeremy] Frimer is a researcher at the University of Winnipeg, and he decided to investigate. What he found is that liberals are actually very comfortable with authority and obedience — as long as the authorities are liberals (“should you obey an environmentalist?”). And that conservatives then became much less willing to go along with “the man in charge.”

Frimer argues that conservatives tend to support authority because they think authority is conservative; liberals tend to oppose it for the same reason. Liberal or conservative, it seems, we’re all still human under the skin.

Exactly. The deference to authority for conservatives and liberals depends on who is wielding said authority. If it’s a cop or a religious figure, conservatives tend to trust them and liberals are skeptical. If it’s a scientist or a professor, liberals tend to trust them and conservatives are rebellious.

Let me give an example. Liberals love to cite the claim that 97% of climate scientists agree that global warming is real. In fact, this week they are having 97 hours of consensus where they have 97 quotes from scientists about global warming. But what is this but an appeal to authority? I don’t care if 100% of scientists agree on global warming: they still might be wrong. If there is something wrong with the temperature data (I don’t think there is) then they are all wrong.

The thing is, that appeal to authority does get at something useful. You should accept that global warming is very likely real. But not because 97% of scientists agree. The “consensus” supporting global warming is about as interesting as a “consensus” opposing germ theory. It’s the data supporting global warming that are convincing. And when scientists fall back on the data, not their authority, I become more convinced.

If I told liberals that we should ignore Ferguson because 97% of cops think the shooting justified, they wouldn’t say, “Oh, well that settles it.” If I said that 97% of priests agreed that God exists, they wouldn’t say, “Oh, well that settles it.” Hell, this applies even to things that aren’t terribly controversial. Liberals are more than happy to ignore the “consensus” on the unemployment effects of minimum wage hikes or the safety of GMO crops.

I’m drifting from the point. The point is that the studies showing that conservatives are more “authoritarian” were biased. They only asked about certain authority figures, not all of them. And since this was what the mostly liberal social scientists expected, they didn’t question it. McArdle gets into this in her second article, which takes on the claim that conservative views come from “low-effort thought” based on two small studies.

In both studies, we’re talking about differences between groups of 18 to 19 students, and again, no mention of whether the issue might be disinhibition — “I’m too busy to give my professor the ‘right’ answer, rather than the one I actually believe” — rather than “low-effort thought.”

I am reluctant to make sweeping generalizations about a very large group of people based on a single study. But I am reluctant indeed when it turns out those generalizations are based on 85 drunk people and 75 psychology students.

I do not have a scientific study to back me up, but I hope that you’ll permit me a small observation anyway: We are all of us fond of low-effort thought. Just look at what people share on Facebook and Twitter. We like studies and facts that confirm what we already believe, especially when what we believe is that we are nicer, smarter and more rational than other people. We especially like to hear that when we are engaged in some sort of bruising contest with those wicked troglodytes — say, for political and cultural control of the country we both inhabit. When we are presented with what seems to be evidence for these propositions, we don’t tend to investigate it too closely. The temptation is common to all political persuasions, and it requires a constant mustering of will to resist it.

One of these studies found that drunk students were more likely to express conservative views than sober ones and concluded that this was because it was easier to think conservatively when alcohol is inhibiting your thought process. The bias there is simply staggering. They didn’t test the students before they started drinking (heavy drinkers might skew conservative). They didn’t consider social disinhibition — which I have mentioned in connection with studies claiming that hungry or “stupid” men like bigger breasts. This was a study designed with its conclusion in mind.

All sciences are in danger of confirmation bias. My advisor was very good about side-stepping it. When we got the answer we expected, he would say, “something is wrong here” and make us go over the data again. But the social sciences seem more subject to confirmation bias for various reasons: the answers in the social sciences are more nebulous, the biases are more subtle, the “observer effect” is more real and, frankly, some social scientists lack the statistical acumen to parse data properly (see the Hurricane study discussed earlier this year). But I also think there is an increased danger because of the immediacy of the issues. No one has a personal stake in the time-resolved behavior of an active galactic nucleus. But people have very personal stakes in politics, economics and sexism.

Megan also touches on what I’ve dubbed the Scientific Peter Principle: that a study garnering enormous amounts of attention is likely erroneous. The reason is that when you do something wrong in a study, it will usually manifest as a false result, not a null result. Null results are usually the result of doing your research right, not doing it wrong. Take the sexist hurricane study earlier this year. Had the scientists done their research correctly (limiting their data to post-1978 or doing a K-S test), they would have found no connection between the femininity of hurricane names and their deadliness. As a result, we would never have heard about it. In fact, other scientists may have already done that analysis and either not bothered to publish it or published it quietly.

But because they did their analysis wrong — assigning an index to the names, only sub-sampling the data in ways that supported the hypothesis — they got a result. And because they had a surprising result, they got publicity.

This happens quite a bit. The CDC got lots of headlines when they exaggerated the number of obesity deaths by a factor of 14. Scottish researchers got attention when they erroneously claimed that smoking bans were saving lives. The EPA got headlines when they deliberately biased their analysis to claim that second-hand smoke was killing thousands.

Cognitive bias, in combination with the Scientific Peter Principle, is incredibly dangerous.

Mathematical Malpractice Watch: Torturing the Data

Thursday, August 28th, 2014

There’s been a kerfuffle recently about a supposed CDC whistleblower who has revealed malfeasance in the primary CDC study that refuted the connection between vaccines and autism. Let’s put aside that the now-retracted Lancet study the anti-vaxxers tout as the smoking gun was a complete fraud. Let’s put aside that other studies have reached the same conclusion. Let’s just address the allegations at hand, which include a supposed cover up. These allegations are in a published paper (now under further review) and a truly revolting video from Andrew Wakefield — the disgraced author of the fraudulent Lancet study that set off this mess — that compares this “cover-up” to the Tuskegee experiments.

According to the whistle-blower, his analysis shows that while most children do not have an increased risk of autism (which, incidentally, discredits Wakefield’s study), black males vaccinated before 36 months show a 240% increased risk (not 340, as has been claimed). You can catch the latest from Orac. Here’s the most important part:

So is Hooker’s result valid? Was there really a 3.36-fold increased risk for autism in African-American males who received MMR vaccination before the age of 36 months in this dataset? Who knows? Hooker analyzed a dataset collected to be analyzed by a case-control method using a cohort design. Then he did multiple subset analyses, which, of course, are prone to false positives. As we also say, if you slice and dice the evidence more and more finely, eventually you will find apparent correlations that might or might not be real.

In other words, what he did was slice and dice the sample to see if one of those slices would show a correlation. But by pure chance, one of those slices would show a correlation, even if there wasn’t one. As best illustrated in this cartoon, if you run twenty tests for something that has no correlation, odds are that at least one of them will show a spurious correlation at the 95% confidence level. This is one of the reasons many scientists, especially geneticists, are turning to Bayesian analysis, which can account for this.
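
To see how reliably slicing and dicing manufactures "findings", here's a generic simulation (not a re-analysis of the CDC data). Every comparison below is between two groups drawn from the same distribution, so any "significant" difference is pure chance:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_tests = 100      # every way you can slice a cohort: race x sex x age x dose window ...
n_per_group = 50

hits = 0
for _ in range(n_tests):
    # Both groups are drawn from the same distribution: no real effect exists.
    group_a = rng.normal(0.0, 1.0, n_per_group)
    group_b = rng.normal(0.0, 1.0, n_per_group)
    _, p_value = stats.ttest_ind(group_a, group_b)
    if p_value < 0.05:
        hits += 1

print("subgroup comparisons run:", n_tests)
print("spurious 'significant' findings:", hits)
# Roughly 5 in 100 comparisons come up "significant" at the 95% level by chance
# alone. A lone positive slice pulled out of many negative ones is exactly what
# chance predicts, not evidence of a hidden effect.
```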

If you did a study of just a few African-American boys and found a connection between vaccination and autism, it would be the sort of preliminary shaky result you would use to justify looking at a larger sample … such as the full CDC study that the crackpot’s own analysis shows refutes such a connection. To take a large comprehensive study, narrow it down to a small sample and then claim that the results of this small sample override those of the large one is ridiculous. It’s the opposite of how epidemiology works (and there is no suggestion that there is something about African American males that makes them more susceptible to vaccine-induced autism).

This sort of ridiculous cherry-picking happens a lot, mostly in political contexts. Education reformers will pore over test results until they find that fifth graders slightly improved their reading scores and claim their reform is working. When the scores revert back the next year, they ignore it. Drug warriors will pore over drug stats and claim that a small drop in heroin use among people in their 20s indicates that the War on Drugs is finally working. When it reverts back to normal, they ignore it.

You can’t pick and choose little bits of data to support your theory. You have to be able to account for all of it. And you have to be aware of how often spurious results pop up even in the most objective and well-designed studies, especially when you parse the data finer and finer.

But the anti-vaxxers don’t care about that. What they care about is proving that evil vaccines and Big Pharma are poisoning us. And however they have to torture the data to get there, that’s what they’ll do.

Mathematical Malpractice Watch: Hurricanes

Monday, June 2nd, 2014

There’s a new paper out that claims that hurricanes with female names tend to be deadlier than ones with male names, based on hurricane data going back to 1950. They attribute this to gender bias, the idea that people don’t take hurricanes with female names seriously.

No, this is not The Onion.

I immediately suspected a bias. For one thing, even with their database, we’re talking about 92 events, many of which killed zero people. More important, all hurricanes had female names until 1979. What else was true before 1979? We had a lot less advanced warning of hurricanes. In fact, if you look up the deadliest hurricanes in history, they are all either from times before we named them or when hurricanes all had female names. In other words, they may just be measuring the decline in hurricane deadliness.

Now it’s possible that the authors use some sophisticated model that also accounts for hurricane strength. If so, that might mitigate my objection. But I’m dubious. I downloaded their spreadsheet, which is available from the journal website. Here is what I found:

Hurricanes before 1979 averaged 27 people killed.

Hurricanes since 1979 average 16 people killed.

Hurricanes since 1979 with male names average … 16 people killed.

Hurricanes since 1979 with female names averaged … 16 people killed.
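
For anyone who wants to reproduce that kind of check, it amounts to a few group averages. Here's a sketch, assuming a table with one row per storm and columns I've named year, name_gender and deaths; the paper's actual spreadsheet is laid out differently:

```python
import pandas as pd

# Assumed layout: one row per storm, with columns "year", "name_gender", "deaths".
storms = pd.read_csv("hurricanes.csv")

pre = storms[storms.year < 1979]
post = storms[storms.year >= 1979]

print("pre-1979 mean deaths:               %.0f" % pre.deaths.mean())
print("post-1979 mean deaths:              %.0f" % post.deaths.mean())
print("post-1979 male-named mean deaths:   %.0f" % post[post.name_gender == "M"].deaths.mean())
print("post-1979 female-named mean deaths: %.0f" % post[post.name_gender == "F"].deaths.mean())
# If the pre/post-1979 split explains the difference and the male/female split
# does not, the "femininity effect" is mostly a before-and-after-1979 effect.
```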

Maybe I’m missing something. How did this get past a referee?

Update: Ed Yong raises similar points here. The authors say that cutting the sample at 1979 made the numbers too small, and so they used an index of how feminine or masculine the names were instead. I find that dubious when a plain and simple average will give you an answer. Moreover, they offer this qualifier in the comments:

What’s more, looking only at severe hurricanes that hit in 1979 and afterwards (those above $1.65B median damage), 16 male-named hurricane each caused 23 deaths on average whereas 14 female-named hurricanes each caused 29 deaths on average. This is looking at male/female as a simple binary category in the years since the names started alternating. So even in that shorter time window since 1979, severe female-named storms killed more people than did severe male-named storms.

You be the judge. I average 54 post-1978 storms totaling 1,200 deaths and get even numbers. They narrow it to 30 storms totaling 800 deaths and claim a bias based on 84 excess deaths. That really crosses as stretching to make a point.

Update: My friend Peter Yoachim did a K-S test of the data and found a 97% chance that the male- and female-named hurricanes were drawn from the same distribution. This is a standard test of the null hypothesis and wasn’t done at all. Ridiculous.
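
The test itself is one line in any statistics package. A sketch, using the same hypothetical column layout as above:

```python
import pandas as pd
from scipy import stats

storms = pd.read_csv("hurricanes.csv")
post = storms[storms.year >= 1979]

male_deaths = post[post.name_gender == "M"].deaths
female_deaths = post[post.name_gender == "F"].deaths

# Two-sample Kolmogorov-Smirnov test: are the two sets of death counts consistent
# with having been drawn from the same parent distribution?
stat, p_value = stats.ks_2samp(male_deaths, female_deaths)
print("K-S statistic: %.3f, p-value: %.3f" % (stat, p_value))
# A large p-value means the data give no reason to think male- and female-named
# storms differ in deadliness; it does not prove the distributions are identical.
```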

Absolutely Nothing Happened in Sector 83 by 9 by 12 Today

Wednesday, May 28th, 2014

Last night, the science social media sphere exploded with the news of a potential … something … in our nearest cosmic neighbor, M31. The Swift mission, which I am privileged to work for, reported the discovery of a potential bright X-ray transient in M31, a sign of a high-energy event. For a while, we had very little to go on — Goddard had an unfortunately timed power outage. Some thought (and some blogs actually reported) that we’d seen a truly extraordinary event — perhaps even a nearby gamma-ray burst. But it turned out to be something more mundane. My friend and colleague Phil Evans has a great explanation:

It started with the Burst Alert Telescope, or BAT, on board Swift. This is designed to look for GRBs, but will ‘trigger’ on any burst of high-energy radiation that comes from an area of the sky not known to emit such rays. But working out if you’ve had such a burst is not straightforward, because of noise in the detector, background radiation etc. So Swift normally only triggers if it’s really sure the burst of radiation is real; for the statisticians among you, we have a 6.5-σ threshold. Even then, we occasionally get false alarms. But we also have a program to try to spot faint GRBs in nearby galaxies. For this we accept lower significance triggers from BAT if they are near a known, nearby galaxy. But these lower significance triggers are much more likely to be spurious. Normally, we can tell that they are spurious because GRBs (almost always) have a glow of X-rays detectable for some time after the initial burst, an ‘afterglow’. The spurious triggers don’t have this, of course.

In this case, it was a bit more complicated. There was an X-ray source consistent with the BAT position. The image to the right shows the early X-ray data. The yellow circle shows the BAT error box – that is, the BAT told us it thought it had seen something in that circle. The orange box shows what the XRT could see at the time, and the grey dots are detected X-rays. The little red circle marks where the X-ray source is.

Just because the X-ray object was already known about, and was not something likely to go GRB doesn’t mean it’s boring. If the X-ray object was much brighter than normal, then it is almost certainly what triggered the BAT and is scientifically interesting. Any energetic outburst near to Earth is well worth studying. Normally when the Swift X-ray telescope observes a new source, we get various limited data products sent straight to Earth, and normally some software (written by me!) analyses those data. In this case, there was a problem analysing those data products, specifically the product from which we normally estimate the brightness. So the scientists who were online at the time were forced to use rougher data, and from those it looked like the X-ray object was much brighter than normal. And so, of course, that was announced.

The event occurred at about 6:15 EDT last night. I was feeding kids and putting them to bed but got to work on it after a couple of hours. At about 9:30, my wife asked what I was up to and I told her about a potential event in M31, but was cautious. I said something like: “This might be nothing; but if it is real, it would be huge.” I wish I could say I had some prescience about what the later analysis would show, but this was more my natural pessimism. That skeptical part of my mind kept going on about how unlikely a truly amazing event was (see here).

My role would turn out to be a small one. It turned out that Swift had observed the region before. And while Goddard and its HEASARC data archive were down, friend and fellow UVOT team member Caryl Gronwall reminded me that the MAST archive was not. We had not observed the suspect region of M31 in the same filters that Swift uses for its initial observations. But we knew there was a globular cluster near the position of the event and, by coincidence, I had just finished a proposal on M31’s globular clusters. I could see that the archival measures and the new measure were consistent with a typical globular cluster. Then we got a report from the GTC. Their spectrum only showed the globular cluster.

This didn’t disprove the idea of a transient, of course. Many X-ray transients don’t show a signature in the optical and it might not have been the globular cluster anyway. But it did rule out some of the more exotic explanations. Then the other shoe dropped this morning when the XRT team raced to their computers, probably still in their bathrobes. Their more detailed analysis showed that the bright X-ray source was a known source and had not brightened. So … no gamma-ray burst. No explosive event.

Phil again:

I imagine that, from the outside, this looks rather chaotic and disorganised. And the fact that this got publicity across the web and Twitter certainly adds to that! But in fact this highlights the challenges facing professional astronomers. Transient events are, by their nature, well, transient. Some are long lived, but others not. Indeed, this is why Swift exists, to enable us to respond very quickly to the detection of a GRB and gather X-ray, UV and optical data within minutes of the trigger. And Swift is programmed to send what it can of that data straight to the ground (limited bandwidth stops us from sending everything), and to alert the people on duty immediately. The whole reason for this is to allow us to quickly make some statements about the object in question so people can decide whether to observe it with other facilities. This ability has led to many fascinating discoveries, such as the fact that short GRBs are caused by two neutron stars merging, the detection of a supernova shock breaking out of a star and the most distant star ever seen by humans, to name just 3. But it’s tough. We have limited data, limited time and need to say something quick, while the object is still bright. People with access to large telescopes need to make a rapid decision, do they sink some of their limited observing time into this object? This is the challenge that we, as time-domain astronomers, face on a daily basis. Most of this is normally hidden from the world at large because of course we only publish and announce the final results from the cases where the correct decisions were made. In this case, thanks to the power of social media, one of those cases where what proved to be the wrong decision has been brought into the public eye. You’ve been given a brief insight into the decisions and challenges we have to face daily. So while it’s a bit embarrassing to have to show you one of the times where we got it wrong, it’s also good to show you the reality of science. For every exciting news-worthy discovery, there’s a lot of hard graft, effort, false alarms, mistakes, excitement and disappointment. It’s what we live off. It’s science.

Bingo.

People sometimes ask me why I get so passionate about issues like global warming or vaccination or evolution. While the political aspects of these issues are debatable, I get aggravated when people slag the science, especially when it is laced with dark implications of “follow the money” or claims that scientists are putting out “theories” without supporting evidence. Skeptics claim, for example, that scientists only support global warming theory or vaccinations because they would not get grant money for claiming otherwise.

It is true: scientists like to get paid, just like everyone else. We don’t do this for free (mostly). But money won’t drag you out of bed at 4 in the morning to discover a monster gamma-ray burst. Money doesn’t keep you up until the wee hours pounding on a keyboard to figure out what you’ve just seen. Money didn’t bounce my Leicester colleagues out of bed at the crack of dawn to figure out what we were seeing. Money doesn’t sustain you through the years of grad school and the many years of soft-money itinerancy. Hell, most scientists could make more money if they left science. One of the best comments I ever read on this was on an old slash-dot forum: “Doing science for the money is like having sex for the exercise.”

What really motivates scientists is the answer. What really motivates them is finding out something that wasn’t known before. I have been fortunate in my life to have experienced that joy of discovery a few times. There have been moments when I realized that I was literally the only person on Earth to know something, even if that something was comparatively trivial, like the properties of a new dwarf galaxy. That’s the thrill. And despite last night’s excitement being in vain, it was still thrilling to hope that we’d seen something amazing. And hell, finding out it was not an amazing event was still thrilling. It’s amazing to watch the corrective mechanisms of the scientific method in action, especially over the time span of a few hours.

Last night, science was asked a question: did something strange happen in M31? By this morning, we had the answer: no. That’s not a bad day for science. That’s a great one.

One final thought: one day, something amazing is going to happen in the Local Universe. Some star will explode, some neutron stars will collide or something we haven’t even imagined will happen. It is inevitable. The question is not whether it will happen. The question is: will we still be looking?

Low Class Cleavage

Monday, March 31st, 2014

It’s the end of the month, so time to put up a few posts I’ve been tinkering with.

No, just give the Great Unwashed a pair of oversized breasts and a happy ending, and they’ll oink for more every time.

– Charles Montgomery Burns

A few months ago, this study was brought to my attention:

It has been suggested human female breast size may act as signal of fat reserves, which in turn indicates access to resources. Based on this perspective, two studies were conducted to test the hypothesis that men experiencing relative resource insecurity should perceive larger breast size as more physically attractive than men experiencing resource security. In Study 1, 266 men from three sites in Malaysia varying in relative socioeconomic status (high to low) rated a series of animated figures varying in breast size for physical attractiveness. Results showed that men from the low socioeconomic context rated larger breasts as more attractive than did men from the medium socioeconomic context, who in turn perceived larger breasts as more attractive than men from a high socioeconomic context. Study 2 compared the breast size judgements of 66 hungry versus 58 satiated men within the same environmental context in Britain. Results showed that hungry men rated larger breasts as significantly more attractive than satiated men. Taken together, these studies provide evidence that resource security impacts upon men’s attractiveness ratings based on women’s breast size.

Sigh. It seems I am condemned to writing endlessly about mammary glands. I don’t have an objection to the subject but I do wish someone else would approach these “studies” with any degree of skepticism.

This is yet another iteration of the breast size study I lambasted last year and it runs into the same problems: the use of CG figures instead of real women, the underlying inbuilt assumptions and, most importantly, ignoring the role that social convention plays in this kind of analysis. To put it simply: men may feel a social pressure to choose less busty CG images, a point I’ll get to in a moment. I don’t see that this study sheds any new light on the subject. Men of low socioeconomic status might still feel less pressure to conform to social expectations, something this study does not seem to address at all. Like most studies of human sexuality, it makes the fundamental mistake of assuming that what people say is necessarily reflective of what they think or do and not what is expected of them.

The authors think that men’s preference for bustier women when they are hungry supports their thesis that the breast fetish is connected to feeding young (even though there is zero evidence that large breasts nurse better than small ones). I actually think their result has no bearing on their assumption. Why would hungrier men want fatter women? Because they want to eat them? To nurse off them? I can think of good reasons why hungry men would feel less bound by social convention, invest a little less thought in a silly social experiment and just press the button for the biggest boobs. I think that hungry men are more likely to give you an honest opinion and not care that preferring the bustier woman is frowned upon. Hunger is known to significantly alter people’s behavior in many subtle ways but these authors narrow it to one dimension, a dimension that may not even exist.

And why not run a parallel test on women? If bigger breasts somehow provoke a primal hunger response, might that preference be built into anyone who nursed in the first few years of life?

No, this is another garbage study that amounts to saying that “low-class” men like big boobs while “high-class” men are more immune to the lure of the decolletage and so … something. I don’t find that to be useful or insightful or meaningful. I find that it simply reinforces an existing preconception.

There is a cultural bias in some of the upper echelons of society against large breasts and men’s attraction to them. That may sound crazy in a society that made Pamela Anderson a star. But large breasts and the breast fetish are often seen, by elites, as a “low class” thing. Busty women in high-end professions sometimes have problems being taken seriously. Many busty women, including my wife, wear minimizer bras so they’ll be taken more seriously (or look less matronly). I’ve noticed that in the teen shows my daughter sometimes watches, girls with curves are either ditzy or femme fatales. In adult comedies, busty women are frequently portrayed as ditzy airheads. Men who are attracted to buxom women are often depicted as low-class, unintelligent and uneducated. Think Al Bundy.

This is, of course, a subset of a mentality that sees physical attraction itself as a low-class animalistic thing. Being attracted to a woman because she’s a Ph.D. is obviously more cultured, sophisticated and enlightened than being attracted to a woman because she’s a DD. I don’t think attraction is monopolar like that. As I noted before, a man’s attraction to a woman is affected by many factors — her personality, her intelligence, her looks. Breast size is just one slider on the circuit board that is men’s sexuality and probably not even the most important. But it’s absurd to pretend the slider doesn’t exist or that it is somehow less legitimate than the others. We are animals, whatever our pretensions.

Last year, a story exploded on the blogosphere about a naive physics professor who was duped into becoming a drug mule by the promise that he would marry Denise Milani, an extremely buxom non-nude model. What stunned me in reading about the story was the complete lack of any sympathy for him. Granted, he is an arrogant man who isn’t particularly sympathetic. But a huge amount of abuse was heaped on him, much of it focusing on his fascination with a model and particularly a model with extremely large and likely artificial breasts. The tone was that there must be something idiotic and crude about the man to fall for such a ruse and for such a woman.

The reaction to the story not only illuminated a cultural bias but how that bias can become particularly potent when the breasts in question are implants. The expression “big fake boobs” is a pejorative that men and women love to hurl at women they consider low class or inferior. Take Jenny McCarthy. There are very good reasons to criticize McCarthy for her advocacy of anti-vaccine hysteria (although I think the McCarthy criticism is a bit overblown since most people are getting this information elsewhere and McCarthy wasn’t the one who committed research fraud). But no discussion of McCarthy is complete until someone has insulted her for having implants and the existence of those implants has been touted as a sign of her obvious stupidity and the stupidity of those who follow her.

McCarthy actually doesn’t cross me as that stupid; she crosses me as badly misinformed. And it’s not like there aren’t hordes of very smart people who have bought into the anti-vaccine nonsense even sans McCarthy. But putting that aside, I don’t know what McCarthy’s breasts have to do with anything. Do people honestly think it would make a difference if she were an A-cup?

To return to this study and the one I lambasted last year: what I see is not only bad science but a subtle attempt by science to reinforce the stereotype that large breasts and an attraction to them are animalistic, low-class and uneducated. Bullshit speculation claims that men’s attraction to breasts is some primitive instinct. And more bullshit research claims that wealthy educated men can resist this primitive instinct but poorer less-educated men wallow in their animalistic desires. And when these garbage studies come out, blogs are all too eager to hype them, saying, “See! We told you those guys who liked big boobs were ignorant brutes!”

I think this is just garbage. The most “enlightened” academic is just as likely to ogle a busty woman when she walks by. He might be better trained at not being a jerk about it because he walks in social circles where wolf-whistles and come-ons are unacceptable. And he lives in a society where, if a bunch of social scientists are leering over you, you pretend to like the less busty woman. But all men live secret erotic lives in their heads. It’s extremely difficult to tease that information out and certainly not possible with an experiment as crude and obvious as this.

Once again, we see the biggest failing in sex research: asking people what they want instead of getting some objective measure. There are better approaches, some of which I mentioned in my previous article. If I were to approach this topic, I would look at the Google search database used in A Billion Wicked Thoughts to see if areas of high education (e.g., college towns) were less likely to look at porn in general and porn involving busty women in particular. That might give you some useful information. But there’s a danger that it wouldn’t reinforce the bias we’ve built up against big breasts and the men who love them.

Mathematical Malpractice Watch: A Trilogy of Error

Wednesday, February 12th, 2014

Three rather ugly instances of mathematical malpractice have caught my attention in the last month. Let’s check them out.

The Death of Facebook or How to Have Fun With Out of Sample Data

Last month, Princeton researchers came out with the rather spectacular claim that the social network Facebook would be basically dead within a few years. The quick version is that they fit an epidemiological model to the rise and fall of MySpace. They then used that same model, varying the parameters, to fit Google trends on searches for Facebook. They concluded that Facebook would lose 80% of its customers by 2017.

This was obviously nonsense, as detailed here and here. It suffered from many flaws, notably assuming that the rise and fall of MySpace was necessarily a model for all social networks and the dubious method of using Google searches instead of publicly available traffic data as their metric.

But there was a deeper flaw. The authors fit a model of a sharp rise and fall. They then proclaim that this model works because Facebook’s Google data follows the first half of that trend and a little bit of the second. But while the decline in Facebook Google searches is consistent with their model, it is also consistent with hundreds of others. It would be perfectly consistent with a model that predicts a sharp rise and then a leveling off as the social network saturates. Their data are consistent with, but do not discriminate among, just about any model.

The critical part of the data — the predicted sharp fall in Facebook traffic — is out of sample (meaning it hasn’t happened yet). But based on a tiny sliver of data, they have drawn a gigantic conclusion. It’s Mark Twain and the length of the Mississippi River all over again.

We see this a lot in science, unfortunately. Global warming models often predict very sharp rises in temperature — out of sample. Models of the stock market predict crashes or runs — out of sample. Sports twerps put together models that predict Derek Jeter will get 4000 hits — out of sample.

Anyone who does data fitting for a living knows this danger. The other day, I fit a light curve to a variable star. Because of an odd intersection of Fourier parameters, the model predicted a huge rise in brightness in the middle of its decay phase because there were no data to constrain it there. So it fit a small uptick in the decay phase as though it were the small beginning of a massive re-brightening.

The more complicated the model, the more danger there is of drawing massive conclusions from tiny amounts of data or small trends. If the model is anything other than a straight line, be very, very wary of out-of-sample predictions, especially when they are predicting order-of-magnitude changes.
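
Here's a toy version of that failure mode, using a high-order polynomial rather than a Fourier series or an epidemiological model; the pathology is the same:

```python
import numpy as np

rng = np.random.default_rng(3)

# Observed data: a gentle rise that levels off, sampled only out to x = 5.
x_obs = np.linspace(0, 5, 25)
y_obs = np.tanh(x_obs) + rng.normal(0, 0.05, x_obs.size)

# Fit an overly flexible model (a degree-7 polynomial) to the observed range.
coeffs = np.polyfit(x_obs, y_obs, deg=7)

# In sample, the fit looks great ...
print("max in-sample error: %.2f" % np.max(np.abs(np.polyval(coeffs, x_obs) - y_obs)))

# ... out of sample, the model wanders far from anything the data support,
# "predicting" excursions that exist only because nothing constrains it there.
x_future = np.linspace(6, 8, 5)
print("extrapolated values at x = 6..8:", np.round(np.polyval(coeffs, x_future), 1))
```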

A Rape Epidemic or How to Reframe Data:

The CDC recently released a study that claimed that 1.3 million women were raped and 12.6 million more were subject to sexual violence in 2010. This is six or more times the FBI’s extremely rigorous NCVS estimate. Christina Hoff Sommers has a breakdown of why the number is so massive:

It found them by defining sexual violence in impossibly elastic ways and then letting the surveyors, rather than subjects, determine what counted as an assault. Consider: In a telephone survey with a 30 percent response rate, interviewers did not ask participants whether they had been raped. Instead of such straightforward questions, the CDC researchers described a series of sexual encounters and then they determined whether the responses indicated sexual violation. A sample of 9,086 women was asked, for example, “When you were drunk, high, drugged, or passed out and unable to consent, how many people ever had vaginal sex with you?” A majority of the 1.3 million women (61.5 percent) the CDC projected as rape victims in 2010 experienced this sort of “alcohol or drug facilitated penetration.”

What does that mean? If a woman was unconscious or severely incapacitated, everyone would call it rape. But what about sex while inebriated? Few people would say that intoxicated sex alone constitutes rape — indeed, a nontrivial percentage of all customary sexual intercourse, including marital intercourse, probably falls under that definition (and is therefore criminal according to the CDC).

Other survey questions were equally ambiguous. Participants were asked if they had ever had sex because someone pressured them by “telling you lies, making promises about the future they knew were untrue?” All affirmative answers were counted as “sexual violence.” Anyone who consented to sex because a suitor wore her or him down by “repeatedly asking” or “showing they were unhappy” was similarly classified as a victim of violence. The CDC effectively set a stage where each step of physical intimacy required a notarized testament of sober consent.

In short, they did what is called “reframing”. They took someone’s experiences, threw away that person’s definition of them and substituted their own definition.

This isn’t the first time this has happened with rape stats nor the first time Sommers had uncovered this sort of reframing. Here is an account of how researchers decided that women who didn’t think they had been raped were, in fact, raped, so they could claim a victimization rate of one in four.

Scientists have to classify things all the time based on a variety of criteria. The universe is a messy continuum; to understand it, we have to sort things into boxes. I classify stars for a living based on certain characteristics. The problem with doing that here is that women are not inanimate objects. Nor are they lab animals. They can have opinions of their own about what happened to them.

I understand that some victims may reframe their experiences to try to lessen the trauma of what happened to them. I understand that a woman can be raped but convince herself it was a misunderstanding or that it was somehow her fault. But to a priori reframe any woman’s experience is to treat them like lab rats, not human beings capable of making judgements of their own.

But it also illustrates a mathematical malpractice problem: changing definitions. This is how 10,000 underage prostitutes in the United States becomes 200,000 girls “at risk”. This is how small changes in drug use stats become an “epidemic”. If you dig deep into the studies, you will find the truth. But the banner headline — the one the media talk about — is hopelessly and deliberately muddled.

Sometimes you have to change definitions. The FBI changed their NCVS methodology a few years ago on rape statistics and saw a significant increase in their estimates. But it’s one thing to hone; it’s another to completely redefine.

(The CDC, as my friend Kevin Wilson pointed out, mostly does outstanding work. But they have a tendency to jump with both feet into moral panics. In this case, it’s the current debate about rape culture. Ten years ago, it was obesity. They put out a deeply flawed study that overestimated obesity deaths by a factor of 14. They quickly admitted their screwup but … guess which number has been quoted for the last decade on obesity policy?)

You might ask why I’m on about this. Surely any number of rapes is too many. The reason I wanted to talk about this, apart from my hatred of bogus studies, is that data influences policy. If you claim that 1.3 million women are being raped every year, that’s going to result in a set of policy decisions that are likely to be very damaging and do very little to address the real problem.

If you want a stat that means something, try this one: the incidence of sexual violence has fallen 85% over the last 30 years. That is from the FBI’s NCVS data so even if they are over- or under-estimating the amount of sexual violence, the differential is meaningful. That data tells you something useful: that whatever we are doing to fight rape culture, it is working. Greater awareness, pushing back against blaming the victim, changes to federal and state laws, changes to the emphasis of attorneys general’s offices and the rise of internet pornography have all been cited as contributors to this trend.

That’s why it’s important to push back against bogus stats on rape. Because they conceal the most important stat: the one that is the most useful guide for future policy and points the way toward ending rape culture.

The Pending Crash or How to Play with Scales:

Yesterday morning, I saw a chart claiming that the recent stock market trends are an eerie parallel of the run-up to the 1929 crash. I was immediately suspicious because, even if the data were accurate, we see this sort of crap all the time. There are a million people who have made a million bucks on Wall Street claiming to pattern match trends in the stock market. They make huge predictions, just like the Facebook study above. And those predictions are always wrong. Because, again, the out of sample data contains the real leverage.

This graph is even worse than that, though. As Quartz points out, the graph makers used two different y-axes. In one, the 1928-29 rise of the stock market was a near doubling. In the other, the 2013-14 rise was an increase of about 25%. When you scale them appropriately, the similarity vanishes. Or, alternatively, the pending “crash” would be just an erasure of that 25% gain.
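
To see how much work those two y-axes are doing, here's a minimal sketch with invented index values (purely illustrative, not the actual market data):

```python
import numpy as np
import matplotlib.pyplot as plt

months = np.arange(18)
run_1928 = 100 * (1 + 0.05 * months)    # roughly doubles over the window
run_2013 = 100 * (1 + 0.013 * months)   # rises about 25% over the same window

fig, (ax_dual, ax_common) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: each series gets its own y-axis, so the shapes can be made to overlap.
ax_dual.plot(months, run_1928)
ax_dual.twinx().plot(months, run_2013, color="C1")
ax_dual.set_title("Two y-axes: an 'eerie parallel'")

# Right panel: both series on one common scale; the "parallel" disappears.
ax_common.plot(months, run_1928, label="1928-29")
ax_common.plot(months, run_2013, label="2013-14")
ax_common.set_title("One common scale")
ax_common.legend()

plt.tight_layout()
plt.show()
```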

I’ve seen this quite a bit and it’s beginning to annoy me. Zoomed-in graphs of narrow ranges of the y-axis are used to draw dramatic conclusions about … whatever you want. This week, it’s the stock market. Next week, it’s global warming skeptics looking at little spikes on a 10-year temperature plot instead of big trends on a 150-year one. The week after, it will be inequality data. Here is one from Piketty and Saez, which tracks wealth gains for the rich against everyone else. Their conclusion might be accurate but the plot is useless because it is scaled to intervals of $5 million. So even if the bottom 90% were doing better, even if their income was doubling, it wouldn’t show up on the graph.

Halloween Linkorama

Sunday, November 3rd, 2013

Three stories today:

  • Bill James once said that, when politics is functioning well, elections should have razor thin margins. The reason is that the parties will align themselves to best exploit divisions in the electorate. If one party is only getting 40% of the vote, they will quickly re-align to get higher vote totals. The other party will respond and they will reach a natural equilibrium near 50%. I think that is the missing key to understanding why so many governments are divided. The Information Age has not only given political parties more information to align themselves with the electorate, it has made the electorate more responsive. The South was utterly loyal to the Democrats for 120 years. Nowadays, that kind of political loyalty is fading.
  • I love this piece about how an accepted piece of sociology turned out to be complete gobbledygook.
  • Speaking of gobbledygook, here is a review of the article about men ogling women. It sounds like the authors misquoted their own study.
Rush is Wrong on Religion

Friday, September 20th, 2013

I see that Rush Limbaugh has dived into the latest climate nontroversy. That makes this a good time to post this, which I wrote several months ago. Sorry to make this Global Warming Week. I hate that debate. But with the way the Daily Fail’s nonsense is propagating, I have no choice.

(more…)

Mathematical Malpractice Watch: Cherry-Picking

Sunday, September 15th, 2013

Probably one of the most frustrating mathematical practices is the tendency of politicos to cherry-pick data: taking only the data points that are favorable to their point of view and ignoring all the others. I’ve talked about this before but two stories circling the drain of the blogosphere illustrated this practice perfectly.

The first is on the subject of global warming. Global warming skeptics have recently been crowing about two pieces of data that supposedly contradict the theory of global warming: a slow-down in temperature rise over the last decade and a “60% recovery” in Arctic sea ice.

The Guardian, with two really nice animated gifs, shows clearly why these claims are lacking. Sea ice levels vary from year to year. The long-term trend, however, has been a dramatic fall, with current sea ice levels being a third of what they were a few decades ago (and that’s just area: in terms of volume it’s much worse, with sea ice levels being a fifth of what they were). The 60% uptick is mainly because ice levels were so absurdly low last year that the natural year-to-year variation is equal to almost half the total area of ice. In other words, the variation in yearly sea ice levels has not changed — the baseline has shrunk so dramatically that the variations look big in comparison. This could easily — and likely will — be matched by a 60% decline. Of course, that decline will be ignored by the very people hyping the “recovery”.
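
The arithmetic behind that kind of "recovery" claim is worth spelling out. Round, illustrative numbers, not the actual ice figures:

```python
# Illustrative numbers only (arbitrary units), showing how a shrunken baseline
# turns ordinary year-to-year variation into a headline-sized percentage.
baseline = 10.0     # ice cover a few decades ago
record_low = 3.3    # last year's extreme low (roughly a third of the baseline)
this_year = 5.3     # an ordinary bounce back

bounce = this_year - record_low
print("bounce as a fraction of last year's low:  %.0f%%" % (100 * bounce / record_low))
print("the same bounce against the old baseline: %.0f%%" % (100 * bounce / baseline))
print("this year relative to the old baseline:   %.0f%%" % (100 * this_year / baseline))
# A two-unit wiggle is a "61% recovery" against a collapsed baseline but only a 20%
# wiggle against the old one, and the ice is still barely half of what it was.
```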

    Temperature does the same thing. If you look at the second gif, you’ll see the steady rise in temperature over the last 40 years. But, like sea ice levels, planetary temperatures vary from year to year. The rise is not perfect. But each time it levels off or even falls a little, the skeptics ignore forty years’ worth of data.

    (That having been said, temperatures have risen much more slowly over the last decade than they did over the previous three. A number of climate scientists now think we have overestimated climate sensitivity.)

    But lest you think this sort of thing is only confined to the Right …

    Many people are tweeting and linking this article, which claims that Louis Gohmert spouted 12 lies about Obamacare in two minutes. Some of the things Gohmert said were not true. But others were, and still others cannot really be assessed at this stage. To take on the "lies" one by one:

    Was Obamacare passed against the will of the people?

    Nope. It was passed by a president who won the largest landslide in two decades and a Democratic House and Senate with huge majorities. It was passed with more support than the Bush tax cuts and Medicare Part D, both of which were entirely unfunded. And the law had a mostly favorable perception in 2010 before Republicans spent hundreds of millions of dollars spreading misinformation about it.

    The first bits of that are true but somewhat irrelevant: the Iraq War had massive support at first, but became very unpopular. The second claim is cherry-picked. Here is the Kaiser Foundation’s tracking poll on Obamacare (panel 6). Obamacare barely crested 50% support for a brief period, well within the noise. Since then, its unfavorables have outweighed its favorables. If anything, those unfavorables have actually fallen slightly, not risen in response to "Republican lies".

    Supporters of the law have devised a catch-22 on the PPACA: if support falls, it’s because of Republican money; if it rises, it’s because people are learning to love the law. But the idea that there could be genuine opposition to it? Perish the thought!

    Is Obamacare still against the will of American people?

    Actually, most Americans want it implemented. Only 6 percent said they wanted to defund or delay it in a recent poll.

    That is extremely deceptive. Here is the poll. Only 6% chose "delay or defund" because another 30% want the law repealed outright; together, that’s 36% who want it stopped in some form. Another 31% think it needs to be improved. Only 33% think the law should be allowed to take effect or be expanded.

    (That 6% should really jump out at you since it’s completely at variance with any political reality. The second I saw it, I knew it was garbage. Maybe they should have focus-group-tested it first to come up with some piece of bullshit that was at least believable.)
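
    Grouping the poll’s own numbers (as quoted above) makes the sleight of hand concrete. Whether the 31% who want the law "improved" count as supporters or opponents is a judgment call, so they get their own bucket here:

        # The four response categories quoted above, regrouped.  The 6% headline
        # works only by splitting the opposition into separate buckets.
        poll = {
            "defund or delay":       0.06,
            "repeal entirely":       0.30,
            "needs improvement":     0.31,
            "take effect or expand": 0.33,
        }

        stop_it   = poll["defund or delay"] + poll["repeal entirely"]
        let_it_go = poll["take effect or expand"]

        print(f"Want it stopped in some form: {stop_it:.0%}")    # 36%
        print(f"Want it to proceed or grow:   {let_it_go:.0%}")  # 33%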

    Of the remaining questions, many are judgement calls about things that have yet to happen. National Memo asserts that Obamacare does not take away your decisions about health care, does not put the government between you and your doctor and will not keep seniors from getting the services they need. There are numerous people — people who are not batshit crazy like Gohmert — who think that Obamacare, and especially the IPAB, will eventually create government interference in healthcare. Gohmert might be wrong about this. But to call a prediction about what will happen a "lie" is absurd. Let’s imagine this playing out in 2002:

    We rate Senator Liberal’s claim that we will be in Iraq for a decade and it will cost 5000 lives and $800 billion to be a lie. The Bush Administration has claimed that US troops will be on the ground for only a few years and expect less than a thousand casualties and about $2 billion per month. In fact, some experts predict it will pay for itself.

    See what I did there?

    Obamacare is a big law with a lot of moving parts. There are claims about how it is going to work but we won’t really know for a long time. Maybe the government won’t interfere with your health care. But that’s a big maybe to bet trillions of dollars on.

    The article correctly notes that the government will not have access to medical records. But it then asserts that any information collected will be safe. That point was overtaken by events this week when an Obamacare site leaked 2,400 Social Security numbers.

    See what I mean about “fact-checking” things that have yet to happen?

    Then there’s this:

    Under Obamacare, will young people be saddled with the cost of everybody else?

    No. Thanks to the coverage for students, tax credits, Medicaid expansion and the fact that most young people don’t earn that much, most young people won’t be paying anything or very much for health care. And nearly everyone in their twenties will see premiums far less than people in their 40s and 50s. If you’re young, out of school and earning more than 400 percent of the poverty level, you may be paying a bit more, but for better insurance.

    This is incorrect. Many young people are being coerced into buying insurance that they wouldn’t have bought before. As Avik Roy has pointed out, cheap high-deductible plans have been effectively outlawed. Many colleges and universities, including my own, are seeing astronomical rises in health insurance premiums. The explosion of invasive wellness programs, like UVA’s, has been explicitly tied to the PPACA. Gohmert is absolutely right on this one.

    The entire point of Obamacare was to get healthy people to buy insurance so that sick people could get more affordable insurance. That is how this whole thing works. It’s too late to back away from that reality now.

    Does Obamacare prevent the free exercise of your religious beliefs?

    No. But it does stop you from forcing your beliefs on others. Employers that provide insurance have to offer policies that provide birth control to women. Religious organizations have been exempted from paying for this coverage but no one will ever be required to take birth control if their religion restricts it — they just can’t keep people from having access to this crucial, cost-saving medication for free.

    This is a matter of philosophy. Many liberals think that if an employer will not provide birth control coverage to his employees, he is “forcing” his religious views upon them (these liberals being under the impression that free birth control pills are a right). I, like many libertarians and conservatives (and independents), see it differently: that forcing someone to pay for something with which they have a moral qualm is violating their religious freedom. The Courts have yet to decide on this.

    I am reluctant to call something a "lie" when it’s a difference of opinion. Our government has made numerous allowances for religious beliefs in the past, including exemptions from vaccinations, the draft, taxes and anti-discrimination laws. We are still having a debate over how this applies to healthcare. Sorry, National Memo, that debate isn’t over yet.

    So let’s review. Of Gohmert’s 12 “lies”, the breakdown is like so:

    Lies: 4
    Debatable or TBD: 5
    Correct: 3
    Redundant: 1

    (You’ll note that’s 13 “lies”; apparently National Memo can’t count).

    So only 4 out of 13 are lies. That’s batting .308. Hey, even Ty Cobb only hit .366.

    Mathematical Malpractice: Focus Tested Numbers

    Tuesday, September 3rd, 2013

    One of the things I keep encountering in news, culture and politics is numbers that appear to be pulled out of thin air. Concrete numbers, based on actual data, are dangerous enough in the wrong hands. But a scarcity of data doesn’t seem to deter advocates and some social scientists. They will simply commission a "study" that produces, in essence, any number they want.

    What is striking is that the numbers seem to be selected with the diligent care and skill that the methods lack.

    The first time I became aware of this was with Bill Clinton. According to his critics — and I can’t find a link on this so it’s possibly apocryphal — when Bill Clinton initiated competency tests for Arkansas teachers, a massive fraction failed. He knew the union would blow their stack if the true numbers were released so he had focus groups convened to figure out what percentage of failures was expected, then had the test curved so that the results met the expectation.

    As I said, I can’t find a reference for that. I seem to remember hearing it from Limbaugh, so it may be a garbled version (I can find lawsuits about racial discrimination in the testing, so it may be a mangled version of that). But the story stuck with me to the point where I remember it twenty years later. And the reason it stuck is that:

  • It sounds like the sort of thing politicians and political activists would do.
  • It would be amazingly easy to do.
  • Our media are so lazy that you could probably get away with it.
    Since then, I’ve seen other numbers which I call "focus tested numbers" even though they may not have been run by focus groups. But they strike me as numbers derived by someone coming up with the number first and then devising the methodology second. The first part is the critical one. Whatever the issue is, you have to come up with a number that is plausible and alarming without being ridiculous. Then you figure out the methods to get the number.

    Let’s just take an example. The first time I became aware of the work of Maggie McNeill was her thorough debunking of the claim that 200,000 underage girls are trafficked for sex in the United States. You should read that article, which comes to an estimate of about 15,000 total underage prostitutes (most of whom are 16 or 17) and only a few hundred to a few thousand who are trafficked in any meaningful sense of that word. That does not make the problem less important, but it does make it less panic-inducing.

    But the 200,000 number jumped out at me. Here’s my very first comment on Maggie’s blog and her response:

    Me: Does anyone know where the 100,000 estimate comes from? What research it’s based on?

    It’s so close to 1% [of total underage girls] that I suspect it may be as simple as that. We saw a similar thing in the 1980′s when Mitch Snyder claimed (and the media mindlessly repeated) that three million Americans were homeless (5-10 times the estimates from people who’d done their homework). It turned out the entire basis of that claim was that three million was 1% of the population.

    This is typical of the media. The most hysterical claim gets the most attention. If ten researchers estimates there are maybe 20,000 underage prostitutes and one big-mouth estimates there are 300,000, guess who gets a guest spot on CNN?

    —–

    Maggie: Honestly, I think 100,000 is just a good large number which sounds impressive and is too large for most people to really comprehend as a whole. The 300,000 figure appears to be a modification of a figure from a government report which claimed that something like 287,000 minors were “at risk” from “sexual exploitation” (though neither term was clearly defined and no study was produced to justify the wild-ass guess). It’s like that game “gossip” we played as children; 287,000 becomes 300,000, “at risk” becomes “currently involved” and “sexual exploitation” becomes “sex trafficking”. :-(

    The study claimed that 100,000 to 300,000 girls were "at risk" of exploitation but defined "at risk" so loosely that simply living near a border put someone at risk. With such methods, the authors could basically claim any number they wanted. After reading that analysis and picking my jaw up off the floor, I wondered why anyone would do it that way.

    And then it struck me: because the method wasn’t the point; the result was. Even the result wasn’t the point; the issue they wanted to advocate was. The care was not in the method: it was in the number. If they had said that there were a couple of thousand underage children in danger, people would have said, "Oh, OK. That sounds like something we can deal with using existing policies and smarter policing." Or even worse, they might have said, "Well, why don’t we legalize sex work for adults and concentrate on saving these children?" If they had claimed a million children were in danger, people would have laughed. But claim 100,000 to 300,000? That’s enough to alarm people into action without making them laugh. It’s in the sweet spot between the "Oh, is that all?" number of a couple thousand and the "Oh, that’s bullshit" number of a million.

    Another great example was the number SOPA supporters bruited about to support their vile legislation. Julian Sanchez details the mathematical malpractice here. At first, they claimed that $250 billion was lost to piracy every year. That number — based on complete garbage — was so ridiculous they had to revise it down to $58 billion. Again, notice how well-picked that number is. At $250 billion, people laughed. If they had gone with a more realistic estimate — a few billion, most likely — no one would have supported such draconian legislation. But $58 billion? That’s enough to alarm people, not enough to make them laugh and — most importantly — not enough to make the media do their damn job and check it out.

    I encountered it again today. The EU is proposing to put speed limiters on cars. Their claim is this will cut traffic deaths by a third. Now, we actually do have some data on this. When the national speed limit was introduced in America, traffic fatalities initially fell about 20%, but then slowly returned to normal. They began falling again, bumped up a bit when Congress loosened the law, then leveled out in the 90′s and early 00′s after Congress completely repealed the national speed limit. The fatality rate has plunged over the last few years and is currently 40% below the 1970′s peak — without a speed limit.

    That’s just raw numbers, of course. In real terms — per million vehicle miles driven — fatalities have plunged almost 75% over the last forty years, with no visible effect from the speed limit law. Of course, more cars contain single drivers than ever before. But even on a per capita basis, car fatalities are half of what they once were.
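
    A back-of-the-envelope version of that calculation, using rough figures (the death counts and mileage below are approximations from memory and should be checked against NHTSA/FHWA tables before quoting):

        # Approximate figures for illustration: raw deaths fall ~40%, but the
        # rate per mile driven falls much further because total driving roughly
        # doubled over the same period.
        deaths_peak_1970s = 54_000    # approx. annual road deaths at the early-70s peak
        deaths_recent     = 33_000    # approx. recent annual road deaths
        vmt_1970s         = 1.3e12    # approx. vehicle miles traveled per year, early 70s
        vmt_recent        = 3.0e12    # approx. vehicle miles traveled per year, recent

        rate_then = deaths_peak_1970s / (vmt_1970s / 1e8)   # deaths per 100M miles
        rate_now  = deaths_recent / (vmt_recent / 1e8)

        print(f"Raw decline:  {1 - deaths_recent / deaths_peak_1970s:.0%}")  # ~39%
        print(f"Rate then:    {rate_then:.1f} per 100M miles")               # ~4.2
        print(f"Rate now:     {rate_now:.1f} per 100M miles")                # ~1.1
        print(f"Rate decline: {1 - rate_now / rate_then:.0%}")               # ~74%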

    That’s real, measurable progress. Unfortunately for the speed limiters, it’s the result of improved technology and better enforcement of drunk driving laws.

    So the claim that deaths from road accidents will plunge by a third because of speed limiters is simply not supported by the data in the United States. They might plunge as technology, better roads and laws against drunk driving spread to Eastern Europe. And I’m sure one of the reasons they are pushing for speed limiters is that they can claim credit for that inevitable improvement. But a one-third decline is just not realistic.

    No, I suspect that this is a focus tested number. If they claimed fatalities would plunge by half, people would laugh. If they claimed 1-2%, no one would care. But one-third? That’s in the sweet spot.

    Bulbs

    Saturday, August 31st, 2013

    I have quite a few posts in the queue that will come out in the next few weeks but this has been my quietest month ever on the blog. One thing I did want to post on, however, came to a head tonight. While working in the basement, I knocked over a basket of bulbs and one shattered. Of course, it was a CFL with mercury in it so I had to follow the EPA’s elaborate instructions for cleaning up. Because it was the basement, I couldn’t take the most important step — airing out the room.

    Of course, the amount of mercury in CFL’s is very small — a couple of milligrams. I probably got ten times the exposure when I dropped and broke a mercury thermometer as a kid and then played with the mercury for a while. But still, these things were foisted on us and encouraged before anyone had really explained the potential danger (in parts of the world, they’re now mandatory). The EPA has done an analysis showing that, on balance, less mercury will be released into the environment because of the decreased amount of coal burnt to power the bulbs. However, I’m not sure this analysis holds up, since 1) history shows that greater energy efficiency mostly results in us using more powered devices, so energy use tends to rise or stay flat; and 2) coal is a slowly dying industry. On a grid powered by gas or nuclear, CFL’s will likely put more mercury into the environment than they save. The analysis also glosses over the fact that having mercury in the air from power plants is a little different from having it on the floor where your children play.
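
    The shape of the EPA-style tradeoff is easy to sketch. Every number below is an assumption for illustration (the bulb’s mercury content, its lifetime energy savings, coal’s share of generation and the mercury emitted per kWh of coal power all vary widely and should be checked against the actual EPA analysis):

        # All figures are illustrative assumptions, not EPA data.
        hg_in_bulb_mg      = 4.0     # assumed mercury sealed inside one CFL
        kwh_saved_lifetime = 375     # assumed lifetime savings vs. an incandescent
        coal_share         = 0.45    # assumed fraction of grid power from coal
        hg_per_coal_kwh_mg = 0.03    # assumed mercury emitted per kWh of coal power

        def hg_avoided(share):
            """Mercury kept out of the air by one bulb, for a given coal share."""
            return kwh_saved_lifetime * share * hg_per_coal_kwh_mg

        print(f"Avoided at the plant (45% coal): {hg_avoided(coal_share):.1f} mg")  # ~5.1
        print(f"Avoided at the plant (20% coal): {hg_avoided(0.20):.1f} mg")        # ~2.2
        print(f"Sealed in the bulb itself:       {hg_in_bulb_mg:.1f} mg")
        # As coal's share falls, the avoided mercury shrinks while the bulb's
        # content stays fixed -- and a broken bulb ends up on the basement floor,
        # not dispersed from a smokestack.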

    LED bulbs are better but … they have their own concerns, which no one talks about.

    Global warming is real — one of my queued posts is on that subject. But the environmental movement has become fixated on it almost to the exclusion of all else. There is no such thing as perfect technology. Wind and solar require dirty manufacturing techniques and extensive use of rare-earth elements (which have to be mined). Nuclear has its obvious dangers. Fracking is less carbon-intensive than coal, but it doesn’t come without its own set of risks.

    The problem is that we do not talk about these trade-offs. We don’t balance rare-earth mining versus radioactive waste versus carbon emissions. We simply get into tizzies about global warming or nuclear waste and stampede toward something that looks good. And that extends into the home. On balance, I might take an LED or CFL light because it saves money, saves energy and the toxin risk is low. But that choice should not be mandated. People should be free to make their own evaluations of the tradeoffs.

    Saturday Linkorama

    Sunday, June 23rd, 2013
  • This visualization of the Rite of Spring is seriously, seriously cool. Seeing the music laid out like that, you start hearing subtleties that elude you when you just listen. This is one of the reasons I like to see classical music in performance. There is so much more going on than the ear can take in.
  • This map of linguistic divides in the United States is something I could spend an entire post on. I match most of the pronunciations from Georgia except for "lawyer" and "pajamas".
  • This story, about charities that just exist to raise money, should be getting national attention. It’s a disgrace.
  • I’ve used some of these.
  • Roman concrete was apparently better than the shit we’re using.
  • I think this is more or less true: the financial industry has stopped being about enabling economic progress and has become more about itself. When engineers can make more money moving piles of money around than by inventing things, we’ve got a problem.
  • Teenage boys killed the sex scene.