Posts Tagged ‘Economics’

Mathematical Malpractice Watch: A Trilogy of Error

Wednesday, February 12th, 2014

Three rather ugly instances of mathematical malpractice have caught my attention in the last month. Let’s check them out.

The Death of Facebook or How to Have Fun With Out of Sample Data

Last month, Princeton researchers came out with the rather spectacular claim that the social network Facebook would be basically dead within a few years. The quick version is that they fit an epidemiological model to the rise and fall of MySpace. They then used that same model, varying the parameters, to fit Google trends on searches for Facebook. They concluded that Facebook would lose 80% of its customers by 2017.

This was obviously nonsense, as detailed here and here. It suffered from many flaws, notably the assumption that the rise and fall of MySpace was necessarily a model for all social networks and the dubious choice of Google searches, rather than publicly available traffic data, as the metric.

But there was a deeper flaw. The authors fit a model of a sharp rise and fall. They then proclaimed that this model works because Facebook’s Google data follows the first half of that trend and a little bit of the second. But while the decline in Google searches for Facebook is consistent with their model, it is also consistent with hundreds of others. It would be perfectly consistent with a model that predicts a sharp rise and then a leveling off as the social network saturates. Their data are consistent with just about any model and discriminate against almost none.

The critical part of the data — the predicted sharp fall in Facebook traffic — is out of sample (meaning it hasn’t happened yet). But based on a tiny sliver of data, they have drawn a gigantic conclusion. It’s Mark Twain and the length of the Mississippi River all over again.

We see this a lot in science, unfortunately. Global warming models often predict very sharp rises in temperature — out of sample. Models of the stock market predict crashes or runs — out of sample. Sports twerps put together models that predict Derek Jeter will get 4000 hits — out of sample.

Anyone who does data fitting for a living knows this danger. The other day, I fit a light curve to a variable star. Because of an odd intersection of Fourier parameters, the model predicted a huge rise in brightness in the middle of its decay phase because there were no data to constrain it there. So it fit a small uptick in the decay phase as though it were the small beginning of a massive re-brightening.

The more complicated the model, the more danger there is of drawing massive conclusions from tiny amounts of data or small trends. If the model is anything other than a straight line, be very very wary of out-of-sample predictions, especially when they are predicting order-of-magnitude changes.
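To make the danger concrete, here is a minimal Python sketch. The two curves are invented for illustration (they are not the Princeton model and use no real Facebook data): one rises and saturates, the other rises identically and then collapses around a hypothetical time `t_drop`. Over the "observed" window they differ by less than one percent of the peak; past it, they disagree almost completely.

```python
import math

def saturating(t):
    """Rise, then level off as the network saturates (illustrative only)."""
    return 1.0 - math.exp(-t)

def rise_and_fall(t, t_drop=10.0):
    """Same rise, multiplied by a logistic collapse centred on t_drop (hypothetical)."""
    return saturating(t) / (1.0 + math.exp(t - t_drop))

def max_gap(times):
    """Largest absolute disagreement between the two models over the given times."""
    return max(abs(saturating(t) - rise_and_fall(t)) for t in times)

# Pretend we have only observed t = 0 .. 5; everything later is out of sample.
in_sample = [0.1 * i for i in range(0, 51)]
out_of_sample = [0.1 * i for i in range(51, 151)]

in_gap = max_gap(in_sample)
out_gap = max_gap(out_of_sample)

print(f"largest in-sample disagreement:     {in_gap:.4f}")
print(f"largest out-of-sample disagreement: {out_gap:.4f}")
```

Any data noisy at the one-percent level could not tell these two models apart, yet one predicts a plateau and the other predicts near-total collapse.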

A Rape Epidemic or How to Reframe Data:

The CDC recently released a study that claimed that 1.3 million women were raped and 12.6 million more were subject to sexual violence in 2010. This is six or more times the FBI’s extremely rigorous NCVS estimate. Christina Hoff Sommers has a breakdown of why the number is so massive:

It found them by defining sexual violence in impossibly elastic ways and then letting the surveyors, rather than subjects, determine what counted as an assault. Consider: In a telephone survey with a 30 percent response rate, interviewers did not ask participants whether they had been raped. Instead of such straightforward questions, the CDC researchers described a series of sexual encounters and then they determined whether the responses indicated sexual violation. A sample of 9,086 women was asked, for example, “When you were drunk, high, drugged, or passed out and unable to consent, how many people ever had vaginal sex with you?” A majority of the 1.3 million women (61.5 percent) the CDC projected as rape victims in 2010 experienced this sort of “alcohol or drug facilitated penetration.”

What does that mean? If a woman was unconscious or severely incapacitated, everyone would call it rape. But what about sex while inebriated? Few people would say that intoxicated sex alone constitutes rape — indeed, a nontrivial percentage of all customary sexual intercourse, including marital intercourse, probably falls under that definition (and is therefore criminal according to the CDC).

Other survey questions were equally ambiguous. Participants were asked if they had ever had sex because someone pressured them by “telling you lies, making promises about the future they knew were untrue?” All affirmative answers were counted as “sexual violence.” Anyone who consented to sex because a suitor wore her or him down by “repeatedly asking” or “showing they were unhappy” was similarly classified as a victim of violence. The CDC effectively set a stage where each step of physical intimacy required a notarized testament of sober consent.

In short, they did what is called “reframing”. They took someone’s experiences, threw away that person’s definition of them and substituted their own definition.

This isn’t the first time this has happened with rape stats nor the first time Sommers has uncovered this sort of reframing. Here is an account of how researchers decided that women who didn’t think they had been raped were, in fact, raped, so they could claim a victimization rate of one in four.

Scientists have to classify things all the time based on a variety of criteria. The universe is a messy continuum; to understand it, we have to sort things into boxes. I classify stars for a living based on certain characteristics. The problem with doing that here is that women are not inanimate objects. Nor are they lab animals. They can have opinions of their own about what happened to them.

I understand that some victims may reframe their experiences to try to lessen the trauma of what happened to them. I understand that a woman can be raped but convince herself it was a misunderstanding or that it was somehow her fault. But to a priori reframe any woman’s experience is to treat them like lab rats, not human beings capable of making judgements of their own.

But it also illustrates a mathematical malpractice problem: changing definitions. This is how 10,000 underage prostitutes in the United States become 200,000 girls “at risk”. This is how small changes in drug use stats become an “epidemic”. If you dig deep into the studies, you will find the truth. But the banner headline, the one the media talk about, is hopelessly and deliberately muddled.

Sometimes you have to change definitions. The FBI changed their NCVS methodology a few years ago on rape statistics and saw a significant increase in their estimates. But it’s one thing to hone; it’s another to completely redefine.

(The CDC, as my friend Kevin Wilson pointed out, mostly does outstanding work. But they have a tendency to jump with both feet into moral panics. In this case, it’s the current debate about rape culture. Ten years ago, it was obesity. They put out a deeply flawed study that overestimated obesity deaths by a factor of 14. They quickly admitted their screwup but … guess which number has been quoted for the last decade on obesity policy?)

You might ask why I’m on about this. Surely any number of rapes is too many. The reason I wanted to talk about this, apart from my hatred of bogus studies, is that data influences policy. If you claim that 1.3 million women are being raped every year, that’s going to result in a set of policy decisions that are likely to be very damaging and do very little to address the real problem.

If you want a stat that means something, try this one: the incidence of sexual violence has fallen 85% over the last 30 years. That is from the FBI’s NCVS data so even if they are over- or under-estimating the amount of sexual violence, the differential is meaningful. That data tells you something useful: that whatever we are doing to fight rape culture, it is working. Greater awareness, pushing back against blaming the victim, changes to federal and state laws, changes to the emphasis of attorneys general’s offices and the rise of internet pornography have all been cited as contributors to this trend.

That’s why it’s important to push back against bogus stats on rape. Because they conceal the most important stat; the one that is the most useful guide for future policy and points the way toward ending rape culture.

The Pending Crash or How to Play with Scales:

Yesterday morning, I saw a chart claiming that the recent stock market trends are an eerie parallel of the run-up to the 1929 crash. I was immediately suspicious because, even if the data were accurate, we see this sort of crap all the time. There are a million people who have made a million bucks on Wall Street claiming to pattern match trends in the stock market. They make huge predictions, just like the Facebook study above. And those predictions are always wrong. Because, again, the out of sample data contains the real leverage.

This graph is even worse than that, though. As Quartz points out, the graph makers used two different y-axes. In one, the 1928-29 rise of the stock market was a near doubling. In the other, the 2013-14 rise was an increase of about 25%. When you scale them appropriately, the similarity vanishes. Or, alternatively, the pending “crash” would be just an erasure of that 25% gain.
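The two-axis trick is easy to reproduce. In this minimal Python sketch the two series are made-up numbers (one doubles, one rises 25%), not the actual 1929 or 2013-14 index values. Giving each series its own y-axis amounts to stretching each one to fill the plot, which is what `min_max` does; `pct_change_from_start` puts both on a common scale instead.

```python
def min_max(series):
    """Stretch a series to span 0..1, which is what a private y-axis does visually."""
    lo, hi = min(series), max(series)
    return [(x - lo) / (hi - lo) for x in series]

def pct_change_from_start(series):
    """Express every point as a percentage change from the first value."""
    return [100.0 * (x / series[0] - 1.0) for x in series]

# Hypothetical run-ups: the first doubles, the second gains 25%.
run_up_a = [100.0, 125.0, 150.0, 175.0, 200.0]
run_up_b = [100.0, 106.25, 112.5, 118.75, 125.0]

# On separate axes the shapes are indistinguishable...
print(min_max(run_up_a))
print(min_max(run_up_b))

# ...but on a shared percentage scale they are nothing alike.
print(pct_change_from_start(run_up_a))
print(pct_change_from_start(run_up_b))
```

Plotting percentage change (or using a log axis) for both series is the honest comparison; identical shapes on independent axes say nothing about magnitude.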

I’ve seen this quite a bit and it’s beginning to annoy me. Zoomed-in graphs of narrow ranges of the y-axis are used to draw dramatic conclusions about … whatever you want. This week, it’s the stock market. Next week, it’s global warming skeptics looking at little spikes on a 10-year temperature plot instead of big trends on a 150-year one. The week after, it will be inequality data. Here is one from Piketty and Saez, which tracks wealth gains for the rich against everyone else. Their conclusion might be accurate but the plot is useless because it is scaled to intervals of $5 million. So even if the bottom 90% were doing better, even if their income was doubling, it wouldn’t show up on the graph.

Mathematical Malpractice Watch: Et Tu, Reason?

Sunday, June 30th, 2013

Oh, no, not you, Best Magazine on the Planet:

The growth of federal regulations over the past six decades has cut U.S. economic growth by an average of 2 percentage points per year, according to a new study in the Journal of Economic Growth. As a result, the average American household receives about $277,000 less annually than it would have gotten in the absence of six decades of accumulated regulations—a median household income of $330,000 instead of the $53,000 we get now.

You know, I hate it when people play games with numbers and I won’t put up with it from my side. I agree with Reason’s general point that we are over-regulated and badly regulated and that it is hurting our economy. Even the most conservative estimates indicate that bad regulation is sucking hundreds of billions out of the economy — and that’s accounting for the positive effects of regulation.

But the claim that we would be four times richer if it weren’t for regulation is garbage. As Bailey notes in the article, the growth in the US economy over the last half century has been about 3.2 percent. Without regulation, according to this study, it would have been 5.2, which is far higher than the US has ever had over any extended period of time, even before the progressive era. And because that wild over-estimate is exponential, it results in an economy that would be four times what we have now; four times what any large country would have now. The hypothetical US would be as wealthy, relative to the real US, as the real US is to Serbia. Does anyone really think that without regulation we would be producing four times as many goods and services?

Even if we assume that we could produce an ideally regulated society, regulation is not the only limit on the economy. Other factors (birth rate, immigration, war, business cycles, education, technological progress, social unrest and the economic success of other countries) play a role. A perfectly regulated society would most likely move from a position where its growth was limited by regulation to a position where its growth was limited by other factors (assuming this is not already the case).
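The compounding is worth seeing in numbers. This is a rough Python sketch using only the rounded growth rates quoted above; the 50- and 60-year horizons are my assumptions, and it does not try to reproduce the study's own model or its exact multiple.

```python
def growth_multiple(rate, years):
    """How many times larger an economy is after compounding at `rate` for `years`."""
    return (1.0 + rate) ** years

def relative_size(actual_rate, counterfactual_rate, years):
    """Size of the counterfactual economy relative to the actual one."""
    return growth_multiple(counterfactual_rate, years) / growth_multiple(actual_rate, years)

ACTUAL = 0.032          # rounded rate quoted in the text
COUNTERFACTUAL = 0.052  # rounded rate attributed to the study

for years in (50, 60):  # assumed horizons: "half century" and "six decades"
    ratio = relative_size(ACTUAL, COUNTERFACTUAL, years)
    print(f"{years} years: counterfactual economy is {ratio:.2f}x the actual one")
```

A gap of two points a year sounds modest, but sustained over decades it compounds into an economy several times larger, which is why the sustained 5.2 percent figure deserves so much suspicion.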

The paper is very long and complicated so I can’t dissect where their economic model goes wrong. But I will point out that no country in history, including the United States, has ever had half a century of 5% economic growth. Even countries with far less regulation and far more economic freedom than we have do not show the kind of explosive growth they project. In the absence of any real-life example showing that regulatory restraint can produce this kind of growth, we can’t accept numbers that are so ridiculous.

Other studies, as Reason notes, estimate the impact of regulation as being something like 10-20% of our economy. That would require that regulation knock down our economic growth by 0.3% per year, which seems much more reasonable.
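As a sanity check on that figure, this Python sketch runs the compounding backwards: given a cumulative shortfall, what annual drag on growth would produce it? The 60-year horizon and the 3.2 percent baseline are assumptions taken from the rounded numbers in this post, and `annual_drag` is a name invented for the example.

```python
def annual_drag(shortfall, years, actual_rate):
    """Annual growth lost, such that the actual economy ends up `shortfall`
    (e.g. 0.10 for 10%) smaller than it would otherwise have been.

    Solves ((1 + g) / (1 + g + d)) ** years == 1 - shortfall for d.
    """
    return (1.0 + actual_rate) * ((1.0 - shortfall) ** (-1.0 / years) - 1.0)

YEARS = 60            # assumed horizon ("six decades")
ACTUAL_RATE = 0.032   # rounded growth rate quoted above

for shortfall in (0.10, 0.20):
    drag = annual_drag(shortfall, YEARS, ACTUAL_RATE)
    print(f"{shortfall:.0%} smaller economy -> {100 * drag:.2f} points of growth per year")
```

Under these assumptions, a 10-20% hit to the economy corresponds to a drag of a few tenths of a point per year, the same ballpark as the 0.3% figure and an order of magnitude below the study's 2 points.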

(H/T: Maggie McNeill, although she might not like where I went with this one.)

Saturday Linkorama

Sunday, June 23rd, 2013
  • This visualization of the Rite of Spring is seriously seriously cool. Seeing the music like that, you start hearing the subtleties that elude you when you just hear it. This is one of the reasons I like to see classical music in performance. There is so much more going on than the ear can take in.
  • This map of linguistic divides in the United States is something I could spend an entire post on. I match most of the pronunciations from Georgia except for “lawyer” and “pajamas”.
  • This story, about charities that just exist to raise money, should be getting national attention. It’s a disgrace.
  • I’ve used some of these.
  • Roman concrete was apparently better than the shit we’re using.
  • I think this is more or less true: the financial industry has stopped being about enabling economic progress and more about itself. When engineers can make more moving piles of money around than inventing things, we’ve got a problem.
  • Teenage boys killed the sex scene.

Baseball Player Salaries

    Monday, April 15th, 2013

    You know, I thought these articles had gone out of fashion:

    In 1972, the year I became aware of baseball, its highest-paid player, Hank Aaron, earned $200,000 per season—the equivalent of around $1 million today. Aaron’s salary was 18 times the median household income in the United States. This year’s highest-paid player, Alex Rodriguez, stands to earn $29 million, which is 580 times the median income. (In fairness, Verlander may be a more egregious example of inequality than Rodriguez, since he pitches in the nation’s poorest big city. In the first year of his new contract, Verlander will earn $20 million—around 800 times as much as Detroit’s median household income.)

    Over the past 40 years—the period of rising economic inequality that former Slate columnist Timothy Noah called “The Great Divergence”—Americans’ incomes have not grown at all, in real dollars. But baseball players’ incomes have increased twentyfold in real dollars: the average major-league salary in 2012 was $3,213,479. The income gap between ballplayers and their fans closely resembles the rising gap between CEOs and their employees, which grew during the same period from roughly 25-to-1 to 380-to-1.

    As baseball players accumulate plutocratic riches (Rodriguez will have earned a third of $1 billion by the time his contract expires), I find myself wondering why I’m supposed to cheer for a guy earning $27.5 million a year—he’s already a winner. When I was 11, I hero-worshipped the Tigers’ shortstop because I could imagine growing up to take his place. Obviously, that’s not going to happen now. Since my past two jobs disappeared in the Great Recession, I can’t watch a professional sporting event without thinking, Most of those guys are set for life, while I’ve been buying my own health insurance for 5 1/2 years. Paying to see a baseball game feels like paying to see a tax lawyer argue in federal court or a commodities trader work the floor of the Mercantile Exchange. They’re getting rich out there, but how am I profiting from the experience? I know we’re never going back to the days when Willie Mays lived in Harlem and sold cars in the offseason, but the market forces that have overvalued ballplayers’ skills while devaluing mine have made it impossible for me to just enjoy the damn game.

    McClelland even criticizes the Seitz decision, thinking players would be better off if they were bound for life to one team. Or, actually … I don’t think he cares about the players. What seems to be damaged here is a deranged sense of economic justice.

    I shouldn’t bother but … I’m in a fish-in-barrel kind of mood.

    First, let’s consider the point made by honest liberal Matt Yglesias: owners will price tickets, concessions and TV for as much as they can get. There is a myth the media like to promulgate (and MLB owners like to hear) that high player salaries drive high prices for games. This is baloney. The owners will charge whatever they can. When was the last time a team dumped payroll and then cut prices? I remember when Peter Angelos was on Baltimore radio flogging this myth. Someone called up and asked if he was going to cut prices now that the Orioles had dumped all their expensive players. He didn’t have an answer.

    All that free agency has done is give players a bigger piece of the pie, a pie that they actually baked, since no one ever paid a plugged nickel to see an owner (and it’s not like the owners are struggling). Frankly, I wish more businesses were following their example and bumping up salaries.

    A few more things to factor in: athletes are taxed at very high rates; they typically only play for a few years, if that; most of those that do reach the highest levels have pursued it with a single-minded devotion. They will have to live on those earnings for a long time. Frankly, if equity is what you’re worried about, I’d spend more time flogging the low salaries of minor league players compared to their MLB counterparts.

    The Slate readers are actually pretty savvy and make many of these points in the comments. However, you do get the occasional “why do we pay teachers and firemen so little and ball players so much!” This was always my favorite argument against high player salaries because it is so obviously absurd. At any given sporting event, an average of 30,000 people show up, buying tickets and concessions. They put in a significant amount of effort and money to watch someone like Justin Verlander pitch. How many teachers teach to 30,000 students at a time? If a teacher could teach that many 162 times a year, would she not be paid like Justin Verlander? The fact is that the skills needed to teach (patience, intelligence, hard work, empathy) are thankfully common. There are literally a few million people doing it. The skills needed to fight fires or fight wars (self-sacrifice, strength, courage) are also thankfully common. The skills needed to be a Cy Young winner, while having less value in an objective sense, are much more rare.

    Yes, it’s true that Justin Verlander can’t teach a class or fight a fire or do astrophysics for that matter. It’s also true that I can’t hit a curveball. So what?

    But doesn’t the huge amount of money spent on sports show that we have our priorities out of whack? Shouldn’t we spend more on education than we do on baseball? Well … we do. Major league baseball made $7.5 billion last year, or about $100 for each of the 75 million people who went to a game and considerably less for those who watched it on television. We spent approximately $800 billion on education, over $10,000 per child in public schools. The difference is the number of people into whose hands that money is concentrated: three million teachers against a thousand athletes. If our devotion to a cause is judged by how much we spend, how much we worry, how much we argue and how many people devote decades of their lives to it, education is far, far more valued in this country than all sports combined.

    So, no, I don’t think athletes are paid too much. I think they are paid what they are worth. The market has not “overvalued” ballplayers nor has it “undervalued” writers. There are maybe a few hundred people in the entire world who can play baseball at a professional level. But there are millions who could write poorly reasoned articles that drip with wealth envy.

    A final thought: my enthusiasm for sports bothered me a little bit when I was younger. Surely, I thought, I shouldn’t devote so much thought to such a trivial pursuit. Is not Shakespeare worth ten pennants? I departed from that thought when I realized that one can pursue all interests: Shakespeare, astrophysics, sports and, um, blogging. But it was actually Jonathan Swift who converted me, with his compelling argument that a truly enlightened race (the Houyhnhnms) would, once they had beaten down the necessities of nature, devote themselves to the pursuit of both mental and physical excellence. Whether it is writing, playing piano, measuring stars or hitting baseballs, the pursuit of a craft, the perfection of it, the pinnacle of possibility: that is what drives us as a race.

    When I watch a baseball game, I see Justin Verlander throw a ball 100 mph with the right spin to make it move just enough to be almost impossible to hit. I see Albert Pujols, in a split second, decide to swing and launch the bat into the precise position to hit the ball as hard as possible. I see Austin Jackson, at the crack of the bat, take off and pursue it into the gap at just the right angle that he can spear it with his outstretched arm. Every game, I see something that should be impossible but isn’t.

    Isn’t that worth $100 a head?

    Friday Linkorama

    Thursday, August 23rd, 2012

    Long-form

  • I encountered this problem with my own child. Some pediatricians are simply obsessed with child growth charts, even to the point of stupidity. We had one pediatrician, whom we quickly dumped, freak out because Abby was supposedly way too short for her age. It turned out they’d put her height in as centimeters instead of inches. It was simply bizarre watching this medical professional insist that our daughter, one of the tallest in her class, was dangerously short. We quickly switched to one who uses the charts for reference but is not defined by them.
  • The most telling part of this story, about Iran banning women from certain college majors, is the note that Iranian women were massively outperforming their male counterparts. Can’t have that, can we?! Looks like the Islamists are figuring out what the Communists did: when you educate a person, they are halfway to freedom.
  • I’m of two minds about peoples who have not contacted civilization. On the one hand, I don’t like forcing civilization on people. On the other, there seems a bit of condescension in the “don’t disturb their culture” mentality.
  • This article, in which Megan McArdle argues that we like to be conned, seems dead accurate to me. Gregg Easterbrook has made the same argument. Bubbles don’t happen because people are stupid. Bubbles happen because people are greedy. They know, deep down, it’s an illusion; but they keep hoping the roof won’t cave in on them.

Weekend Linkorama

    Sunday, August 5th, 2012

    I’m doing more long-form posting of links I care to comment on. But here’s a few I don’t have time for.

  • Man, do I love time lapse video.
  • I haven’t found a good handle on the contention that Mitt Romney’s CEO background is actually a minus. I really think the CEO thing is irrelevant. What concerns me more is his loading up his staff with former Bush people.
  • I’m a little dubious of the contention that trash correlates with economic health. The graph smacks to me of a manipulated stat (it measures the derivative, not the absolute). And our push on durability and recycling could confuse it. Really, it looks, to me, more like you have one big correlated dip in both stats that’s driving the supposed correlation. The collapse of 2008 was unique. I’m not sure it’s a trend.
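
A quick way to see how one shared dip can manufacture a correlation: in this Python sketch the numbers are invented (they are not the trash or economic series from the linked graph). Four points with no relationship at all, plus a single joint collapse, produce a correlation coefficient close to 1.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical "normal years": the two series are completely unrelated.
x_normal = [1.0, 2.0, 1.0, 2.0]
y_normal = [1.0, 1.0, 2.0, 2.0]

# Add one year in which both series crash together.
x_with_dip = x_normal + [-10.0]
y_with_dip = y_normal + [-10.0]

r_without = pearson(x_normal, y_normal)
r_with = pearson(x_with_dip, y_with_dip)

print(f"correlation without the dip: {r_without:.3f}")
print(f"correlation with the dip:    {r_with:.3f}")
```

Dropping the suspect point and recomputing is a cheap first test of whether a correlation reflects a trend or a single event.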

Sunday Linkorama

    Sunday, February 26th, 2012
  • Now this is cool. A plant is brought back to life after 30,000 years. I once wrote a very cliched short story about a human having the same thing happen; being woken up millennia after our extinction by intelligent insects.
  • Continuing in that vein: let’s go back 298 million years.
  • I knew that kids understood words at much younger ages than we thought. They’re sorta like cats: they just can’t be bothered to talk back until they need something.
  • Mathematical Malpractice Watch: the financial crisis. They have one outlier data point. And it seems much more likely that men move back in with their families because the economy is in the shitter, not the other way around.
  • A wonderful note about overcoming racism and Sidney Poitier.
  • An amazing story about a man surviving two months in the snow.
  • This graph-laden article is probably one of the more intelligent analyses I’ve read of the trends in marriage in our society. Long story short? People are still getting married; they’re just waiting longer. That’s not entirely a bad thing.

Wednesday Linkorama

    Thursday, June 2nd, 2011

    Thanks to Twitter siphoning off my political rants, you’re getting more … non-political links:

  • Cracked debunks the Twitter revolution. I’m forced to mostly agree. Social networking may have played a minor role in the upheavals in the Middle East, at best. But real activism involves risking your life, not turning your Facebook profile green.
  • I really really like this idea of the Billion Price Index as a complement to traditional inflation metrics.
  • Do you know … do either of you have any idea of how fucking glad I am I don’t have a big ass commute anymore? I can’t imagine how I did it for so long.
  • I really hope the anti-homework agenda catches on. What’s being done to kids these days is absurd busy work bullshit.
  • So do you think studies like this will, in any way, slow down those who want to ban fatty foods?

Political links:

  • Experts are once again stunned that poverty does not cause crime. They seem to be stunned by this quite a lot.
  • Want to stimulate the economy? Wonder how America can lead the world in innovation again? Repeal SOX.

Keynes vs. Hayek, Round 2

    Thursday, April 28th, 2011

    As McArdle said, Atlas Shrugged would have been a massive hit had it been half this good.

    If you missed part 1, try here.