I was struck by something in the lede of a recent NY Times article, headlined “Minimum Wage Increases Faster Than Median Wage”:
In the last few years, the minimum wage in New York State has increased almost 40 percent, while the average pay for hourly workers has risen much more slowly, not even keeping pace with inflation, according to a report released Thursday by the federal Department of Labor.

The median wage paid to the 4.1 million hourly workers in the state was $12.03 last year, meaning that more than two million New Yorkers earned less than that, the report from the Bureau of Labor Statistics showed. That was about equal to the median national hourly wage of $11.95 — about $25,000 a year for a 40-hour work week.

The headline and the rest of the article say “median”, but the lede says “average”.
Another word for “average” is “mean”, and neither is the same as “median”. To get the average, you sum all the values and divide by the number of values. To get the median, you list all the values in ascending order and take the middle one. One major difference is that a few very high (or very low) values will skew the average (mean) but will not affect the median. Reports on house values and salaries usually use the median for that reason.
To take an extreme example, suppose the values are these:
1, 1, 2, 2, 2, 2, 690
The median value is 2 (the middle one in the sorted list of seven items), but the average is 100 (the list sums to 700; divide by 7). So if those represent salary levels (say, take-home pay in thousands of dollars a month), it’s clear that the median gives a better view of the real situation than the average does.
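For anyone who wants to check the arithmetic, here’s a minimal sketch using Python’s standard-library statistics module, applied to the list above:

```python
import statistics

# Take-home pay in thousands of dollars a month (the example above)
salaries = [1, 1, 2, 2, 2, 2, 690]

print(statistics.median(salaries))  # middle value of the sorted list: 2
print(statistics.mean(salaries))    # 700 divided by 7: 100
```

The one outlying value of 690 pulls the mean up to 100, while the median stays at 2, which is exactly the skew described above.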
Of course, it’s possible for the median to produce a strange result too. Another extreme example:
1, 1, 2, 3, 100, 800, 893, 900, 900
Here, the median value is 100 and the average is 400. In this case, both give a rather weird view of the data, but the average is probably more meaningful than the median, if either can be said to have much meaning at all. (And that’s why there are things in statistics like deviation and skew.)
But people often don’t understand the difference between the two terms, and incorrectly use them interchangeably. And for most people, it doesn’t matter: they get the idea, and don’t care about the statistical details.
The New York Times should get it right, though.
[I did point this out to the reporter, who told me that he knows the difference, and that the error in the lede was introduced by an editor, who changed it before it went up.]