Friday, July 24, 2009

Classifying Exciting and Boring

My summer research job is all about classifying things. We got this big pile of events, and I want to classify them as Exciting and Boring. So what I do is I go through the events, one by one, inspect them, and then either toss them into the Exciting pile or the Boring pile. Okay, so it's not really that simple. I fiddle around a lot with Receiver Operating Characteristic Curves and Operating Points, and False Alarm Rates and Efficiency, and a whole bunch of other Concepts Which I Capitalize Ironically. (The CWICI will only get worse as I go on.)

Which is all to say, my thinking as of late has been colored by the science of classification. The other day, I classified some pens into the Out Of Ink or Just Needs To Be Shaken categories. Then I classified the different vegetables in my vegetable soup as Too Much or Not Enough. I took a list of books I wanted and classified them as In The Bookstore or Needs To Be Ordered Online.

Yeah, so I'm not actually doing all this ('twas a joke), but you get my point. Or rather, you get that I have a point.

One thing you realize in the science of classification is that you will always get some of them wrong. Some of the events I classified as Exciting will turn out to be false alarms. Some of the events I classified as Boring will turn out to be Exciting. And where I draw the line is sort of arbitrary. I can always choose to be pickier, so that there will be fewer false alarms, but I'll miss more of the Exciting events. Or I can be less picky, accepting more false alarms, but being more sure that I'll catch all the Exciting events.
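For the curious, here is a toy sketch of that pickiness trade-off in Python. Everything in it is made up for illustration: the scores, the true labels, the function name `rates`, and the choice to define the False Alarm Rate as the fraction of truly Boring events that land in the Exciting pile (a real analysis may well define it differently, for example per unit time). Efficiency here means the fraction of truly Exciting events that get caught.

```python
def rates(scores, is_exciting, threshold):
    """Return (efficiency, false_alarm_rate) when every event scoring
    at or above `threshold` is tossed into the Exciting pile.

    Assumes at least one truly Exciting and one truly Boring event."""
    flagged = [s >= threshold for s in scores]
    n_exciting = sum(is_exciting)
    n_boring = len(is_exciting) - n_exciting
    caught = sum(f and e for f, e in zip(flagged, is_exciting))
    false_alarms = sum(f and not e for f, e in zip(flagged, is_exciting))
    return caught / n_exciting, false_alarms / n_boring


# Made-up toy data: a score per event, plus the truth we normally don't know.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
is_exciting = [False, False, False, True, False,
               False, True, False, True, True]

# A higher threshold means a pickier classifier.
for threshold in (0.25, 0.55, 0.85):
    efficiency, false_alarm_rate = rates(scores, is_exciting, threshold)
    print(f"threshold={threshold:.2f}  "
          f"efficiency={efficiency:.2f}  "
          f"false alarm rate={false_alarm_rate:.2f}")
```

With these particular numbers, raising the threshold drives the false alarm rate down to zero, but the efficiency drops along with it. Each threshold is one Operating Point; sweep over all of them and you have traced out a Receiver Operating Characteristic Curve.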

The tricky thing is, we don't really know how many events are actually Exciting. That's what we're trying to figure out! Maybe none of them are Exciting, and all I'm looking at are false alarms. How would I know? What's the False Alarm Rate when nothing is truly Exciting?

In my research, we have a very complex and precise way of estimating the False Alarm Rate.* But sometimes, we are not so lucky to have such a method available to us. So we make do with slightly less precise arguments.

*It probably involves time travel and robots.

For instance! UFO sightings: how many of them are Exciting Aliens, and how many are Overexcited Earthlings? UFOlogists assert that the False Alarm Rate is low, and that at least some of the sightings are Exciting Aliens. Skeptics assert that the sightings are well within the False Alarm Rate. How do we know which it is? Well, astronomers tend to inspect the sky much more often and more carefully than lay people. So astronomers would tend to have a smaller False Alarm Rate, while at the same time catching many more of the Exciting Aliens if they indeed exist.

There are far, far fewer UFO sightings among astronomers than among lay people. This indicates that UFO sightings are well within the False Alarm Rate, and there may be no Exciting Aliens to speak of. But hey, all these Overexcited Earthlings are pretty cool and interesting in their own way.
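The shape of this argument can be sketched with some back-of-the-envelope Python. Every number below is invented purely for illustration, as are the function and parameter names; the only assumptions taken from the argument itself are that astronomers watch the sky for more hours, catch more of whatever is really there, and raise fewer false alarms per hour.

```python
def expected_sightings(hours_watching, efficiency,
                       false_alarms_per_hour, real_ufos_per_hour):
    """Expected sightings for one observer: real UFOs actually caught
    plus false alarms. All inputs are hypothetical."""
    real = hours_watching * real_ufos_per_hour * efficiency
    false = hours_watching * false_alarms_per_hour
    return real + false


# Invented observer profiles (hours of sky watched, efficiency, false alarms/hour).
lay_person = dict(hours_watching=10, efficiency=0.2,
                  false_alarms_per_hour=0.01)
astronomer = dict(hours_watching=1000, efficiency=0.9,
                  false_alarms_per_hour=0.0001)

# Two hypothetical worlds: one with no Exciting Aliens, one with a few.
for label, real_rate in (("no aliens", 0.0), ("some aliens", 0.001)):
    lay = expected_sightings(real_ufos_per_hour=real_rate, **lay_person)
    astro = expected_sightings(real_ufos_per_hour=real_rate, **astronomer)
    print(f"{label}: lay person {lay:.3f}, astronomer {astro:.3f}")
```

In the made-up world with aliens, the astronomer's expected sightings shoot well past the lay person's, because all those careful hours of watching actually pay off. Observing far fewer sightings among astronomers is what you would expect only if false alarms dominate.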

Of course, I stole this argument from Phil Plait, the Bad Astronomer. And then I made it a whole lot more arcane and technical. Fun times.