Tuesday, January 30, 2007


Counting the homeless

New York City is in the process of taking a count of its homeless population. Of course, that's not as easy as taking a count of homeful people, where you just stop at each house. For the homeless (or maybe the feds would prefer, now, to call them "people with very low roof security") one has to go to the places where they hang out, where they sleep, where they find meals... and with the homeless one doesn't always know where those places are.

There's been criticism from various advocacy groups that the city's counts seriously understate the problem — that they're very inaccurate, and result in numbers that are way too low. The city, in its turn, says that these are just estimates and it doesn't really matter if they're too low. That's as may be, but when you think about it you realize that there's little value to a count whose accuracy is that uncertain.

I heard on the radio yesterday afternoon that in an attempt to make the count more accurate, the city will be placing “decoys” for the city's counters to count — people who will appear to be homeless, but who are not.


That was my first thought on hearing that. But then Mayor Bloomberg explained: they can use the count of the decoys to adjust the count of the homeless. If they know the percentage of decoys that were missed, they can scale the count of the true homeless accordingly, to compensate for mis-counting.
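The arithmetic behind that adjustment is simple enough to sketch. This is only my reading of the explanation, not the city's actual method: the function name and the numbers are made up, and it assumes the tally of real homeless already has the decoys removed and that decoys are missed at the same rate as everyone else.

```python
def scaled_count(counted_homeless: int, decoys_placed: int, decoys_counted: int) -> float:
    """Scale a raw count by the share of decoys the counters actually found.

    Assumptions (mine, not the city's): counted_homeless excludes decoys,
    and decoys are missed at the same rate as real homeless people.
    """
    if decoys_placed <= 0:
        raise ValueError("need at least one decoy placed")
    if not 0 < decoys_counted <= decoys_placed:
        raise ValueError("decoys_counted must be between 1 and decoys_placed")
    detection_rate = decoys_counted / decoys_placed
    return counted_homeless / detection_rate


# Hypothetical numbers: counters tally 3000 people and find 75 of 100 decoys,
# so a quarter of the decoys were missed and the count is scaled up to match.
print(scaled_count(3000, 100, 75))  # 4000.0
```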

An interesting idea (with an unfortunate name, likening the homeless to ducks being hunted, but never mind that). OK, let me think about it some more:

  1. In order for this to work at all, the decoys must be indistinguishable to the counters: the counters can't know which are the decoys and which are the real homeless. Otherwise, the counters could treat the decoys differently from everyone else, and the decoy results would skew the adjustment.
  2. It seems that the counters have to be pretty much invisible to the homeless. A good portion of the homeless population would otherwise hide, suspicious of or frightened by the counters. The counters have to be low-key, and can't go accost all the homeless people.
  3. Number 2 means that the decoys can't reliably know whether or not they've been counted, to report that fact later. Yet in order for this to do anything, the city needs to have an accurate count of how many decoys were counted and how many were missed. I don't see how they can do that with any accuracy.
  4. This mechanism, by its design, can fix accidental errors for situations where the counting parameters are known. “Oops, no one went down 53rd St,” can be corrected for. The critics are not concerned about these sorts of errors; they're worried that the counters simply don't check certain places, either because they aren't aware that the homeless congregate there or because they're concerned for their own safety. In those cases there won't be decoys there either, and this mechanism will have no way to compensate for those situations.
  5. Expanding on number 4, this mechanism's accuracy is fundamentally related to the extent to which the answer is already known. The proportion of decoys in a given area must approximate the proportion of real homeless in that area in order for the scaling to do any good. If you're trying to count Hasidic Jews and you send out a load of “decoys” wearing dark suits and hats, you'd better send lots of them to Bensonhurst and only a few to Greenwich Village. If you do it the other way around, your decoys will way overstate the error in the Village and understate it in Bensonhurst. It's like that: if you don't already know what areas the homeless tend to be in, you don't know where to send the decoys.
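Point 5 is easier to see with numbers. Here's a rough sketch with two invented areas, assuming a single citywide miss rate is computed from all the decoys pooled together; the area names, populations, and detection rates are all hypothetical.

```python
def pooled_estimate(areas):
    """Estimate the total using one pooled decoy detection rate.

    Each area is a dict with hypothetical keys: true_homeless,
    detection_rate (share of people there who get counted), and
    decoys_placed. Expected values are used, so no randomness is involved.
    """
    counted_homeless = sum(a["true_homeless"] * a["detection_rate"] for a in areas)
    decoys_placed = sum(a["decoys_placed"] for a in areas)
    decoys_counted = sum(a["decoys_placed"] * a["detection_rate"] for a in areas)
    pooled_rate = decoys_counted / decoys_placed
    return counted_homeless / pooled_rate


# Area A is hard to count and holds most of the homeless; area B is easy.
# The true total is 1000 in both scenarios.
matched = [
    {"true_homeless": 900, "detection_rate": 0.5, "decoys_placed": 90},
    {"true_homeless": 100, "detection_rate": 0.9, "decoys_placed": 10},
]
mismatched = [
    {"true_homeless": 900, "detection_rate": 0.5, "decoys_placed": 10},
    {"true_homeless": 100, "detection_rate": 0.9, "decoys_placed": 90},
]

print(round(pooled_estimate(matched)))     # about 1000: decoys mirror the real distribution
print(round(pooled_estimate(mismatched)))  # about 628: most decoys sat where counting is easy
```

With the decoys spread like the real population, the scaling recovers the true total; with them concentrated in the easy area, the adjusted number is still far too low.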

So here's my thought after hearing the explanation and doing a little analysis of it:


1 comment:

johnny phenothiazine said...

I don't understand how the decoys are supposed to know how to act.

When you're counting people who live in houses, it's easy to start with a very accurate count of the houses themselves. Then, by extrapolating from the fraction of people who were at home to answer the door, and the further fraction of them who filled out the census-takers' forms, you can get a fairly good estimate of the entire population of people who live in houses.

But when you try to count the homeless you're not going to start off with a solid datum like the true count of houses. You'll have to do your polling in some irregular public places, like parks or street corners. So imagine you're doing your polling in parks. To scale your polling up to cover the entire homeless population, you'd need to know two factors. First, how many hours of the day does an average homeless person spend in a park? Second, what percentage of the homeless population, which includes a lot of people with outstanding arrest warrants as well as many people who are generally anti-social if not outright paranoid, will deliberately avoid any official-looking person carrying a clipboard and asking all kinds of prying questions?

If you knew those two factors then sending out so-called "decoys" to hang out in parks would be a valid way of double-checking the census-takers' extrapolated estimates. But who can supply statisticians with even an order-of-magnitude estimate of those two scaling factors, in order to train the "decoys" to behave as though they were real homeless people? The only solid result I can see is to determine the lower limit case where the hang-out time is 24 hours a day and the reluctance factor is zero.
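As a rough sketch of that scaling, assuming the two factors simply multiply (an assumption, and the function name and figures are invented for illustration):

```python
def park_estimate(counted_in_parks: float, hours_in_park: float, avoid_fraction: float) -> float:
    """Scale a park count up to a population estimate.

    Assumes (hypothetically) that the chance of counting someone is the
    share of the day they spend in a park times the chance they don't
    avoid the census-taker.
    """
    if not 0 < hours_in_park <= 24:
        raise ValueError("hours_in_park must be in (0, 24]")
    if not 0 <= avoid_fraction < 1:
        raise ValueError("avoid_fraction must be in [0, 1)")
    chance_counted = (hours_in_park / 24) * (1 - avoid_fraction)
    return counted_in_parks / chance_counted


# Lower limit: everyone is in the park all day and nobody hides,
# so the estimate is just the raw count.
print(park_estimate(200, 24, 0.0))  # 200.0

# Made-up factors: 6 hours a day in the park, half the people avoid clipboards.
print(park_estimate(200, 6, 0.5))   # 1600.0
```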