Saturday, July 12, 2008


That's a mean median

I was struck by something in the lede of a recent NY Times article, headlined “Minimum Wage Increases Faster Than Median Wage”:

In the last few years, the minimum wage in New York State has increased almost 40 percent, while the average pay for hourly workers has risen much more slowly, not even keeping pace with inflation, according to a report released Thursday by the federal Department of Labor.

The median wage paid to the 4.1 million hourly workers in the state was $12.03 last year, meaning that more than two million New Yorkers earned less than that, the report from the Bureau of Labor Statistics showed. That was about equal to the median national hourly wage of $11.95 — about $25,000 a year for a 40-hour work week.

The headline and the rest of the article say “median”, but the lede says “average”.

Another word for “average” is “mean”, and the mean is not the same as the “median”. To get the average, you sum all the values and divide by the number of values. To get the median, you list all the values in ascending order, and take the middle one. One major difference is that a few very high (or very low) values will skew the average (mean), but will have little or no effect on the median. Reports on house values and salaries usually use the median for that reason.

To take an extreme example, suppose the values are these:

1, 1, 2, 2, 2, 2, 690

The median value is 2 (the fourth item, in the middle of the list of seven), but the average is 100 (the list sums to 700, which is then divided by 7). So if those represent salary levels (say, take-home pay in thousands of dollars a month), it’s clear that the median gives a better view of the real situation than the average does.
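
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The `statistics` module is just my choice of tool, and the variable names are made up for illustration; nothing here comes from the Times article or the Labor Department report.

```python
from statistics import mean, median

# Hypothetical take-home pay, in thousands of dollars a month,
# copied from the seven-item list above.
salaries = [1, 1, 2, 2, 2, 2, 690]

# Mean: sum all the values and divide by the number of values.
average_pay = mean(salaries)

# Median: sort the values and take the middle one.
median_pay = median(salaries)

print("average:", average_pay)
print("median:", median_pay)
```

The single 690 drags the average up to 100, while the median stays at 2, which is what six of the seven people actually see.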

Of course, it’s possible for the median to produce a strange result too. Another extreme example:

1, 1, 2, 3, 100, 800, 893, 900, 900

Here, the median value is 100 and the average is 400. In this case, both give a rather weird view of the data, but the average is probably more meaningful than the median, if either can be said to have much meaning at all. (And that’s why statistics also has measures like standard deviation and skewness.)
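
The same kind of check works for this list. Again, this is only a sketch in Python with illustrative names of my own; I've added the population standard deviation (`pstdev`) as one example of a measure of spread, not as the definitive one to use.

```python
from statistics import mean, median, pstdev

# The nine-item list from the second example above.
values = [1, 1, 2, 3, 100, 800, 893, 900, 900]

average_value = mean(values)    # sum of the values divided by their count
median_value = median(values)   # the fifth of the nine sorted values
spread = pstdev(values)         # population standard deviation

print("average:", average_value)
print("median:", median_value)
print("standard deviation:", round(spread, 1))
```

The standard deviation comes out larger than the average itself, which is a hint that neither single number summarizes these values well.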

But people often don’t understand the difference between the two terms, and incorrectly use them interchangeably. And for most people, it doesn’t matter: they get the idea, and don’t care about the statistical details.

The New York Times should get it right, though.

[I did point this out to the reporter, who told me that he knows the difference, and that the error in the lede was introduced by an editor, who changed it before it went up.]


Dale said...

This is always a good clarification to make -- depending on the data, sometimes average tells us something interesting, sometimes median tells us something interesting, and sometimes minimum and maximum tell us something. And so on.

There's a lot of game-playing with "averages" in economics reporting. Think of what happens to the "average" wealth at a dinner party when Bill Gates shows up!

J.D. Fisher said...

The median can be described (or referred to) as an average.

You're right, obviously, that "mean" is not the same as "median," but both can be described using the term "average."

Efrique said...

While a median can be called an average, if someone says average, they almost universally mean the arithmetic mean. If I say "what's the average?", that's what people will almost universally think I am asking for.

As an example, while one might expect "mean()" to give the mean in Excel, it is in fact given by "average()" (I keep typing mean(), which is more often what works in statistics packages).

On the other hand, "mean" can mean all manner of different things - not just the arithmetic mean - there are geometric means, harmonic means and various other forms of generalized means...