Wednesday, February 16, 2011


Watson’s second day

Commenting on yesterday’s entry, The Ridger notes this:

I find looking at the second-choice answers quite fascinating. "Porcupine" for what stiffens a hedgehog’s bristles, for instance. There is no way that would be a human’s second choice (after keratin). Watson is clearly getting to the answers by a different route than we do.

That’s one way to look at it, and clearly it’s true that Watson goes about determining answers very differently from the way humans do — Watson can’t reason, and it’s all about very sophisticated statistical associations.

Consider that both humans (in addition to this one, at home) got the Final Jeopardy question with no problem, in seconds... but Watson had no idea (and, unfortunately, we didn’t get to see the top-three analysis that we saw in the first two rounds). My guess is that the question (the answer) was worded in a manner that made it very difficult for the computer to pick out the important bits. It also didn’t understand the category, choosing Toronto in the category U.S. Cities, which I find odd (that doesn’t seem a hard category for Watson to suss).

But another way to look at it is that a human wouldn’t have any second choice for some of these questions, but Watson always does (as well as a third), by definition (well, or by programming). In the case of the hedgehog question that The Ridger mentions, keratin had 99% confidence, porcupine had 36%, and fur had 8%. To call fur a real third choice is kind of silly, as it was so distant that it only showed up because something had to be third.

But even the second choice was well below the buzz-in threshold. That it was as high as it was, at 36% confidence, does, indeed, show Watson’s different thought process — there’s a high correlation between hedgehog and porcupine, along with the other words in the clue. Nevertheless, Watson’s analysis correctly pushed that well down in the answer bin as it pulled out the correct answer at nearly 100% confidence.

In fact, I think most adult humans do run the word porcupine through their heads in the process of solving this one. It’s just that they rule it out so quickly that it doesn’t even register as a possibility. That sort of reasoning is beyond what Watson can do. In that sense it’s behaving like a child, who might just leave porcupine as a candidate answer, lacking the knowledge and experience to toss it.

No one will be mistaking a computer for a human any time soon, though Watson probably is the closest we’ve come to something that could pass the Turing test. However good it can do at Jeopardy! — and from the perspective of points, it’s doing fabulously (and note how skilled it was at pulling all three Daily Doubles) — it would quickly fall on its avatar-face if we actually tried to converse with it.

No comments: