Archive for August, 2013

Big Data, Predictive Policing and the Tyranny/Anarchy Trade-Off

Saturday, August 17th, 2013

Bloomberg had an interesting article today about a talk on the implications and privacy trade-offs of predictive policing and profiling. Jim Adler’s “felon classifier” is also described in his blog.

Basically, he built a classifier predicting from some innocuous (but possibly correlated variables) the likelihood of somebody having a felony offense. The classifier isn’t meant to be used in practice (from eye-balling the Precision/Recall curve in the talk slides, I estimate an AUC of about 0.6-ish; not too great), but it was built to start a discussion. It turns out that courts have upheld the use of profiling in some cases as “reasonable suspicion,” a legal standard for the police to stop somebody and investigate. This could lead to “predictive policing” being taken even further in the future. Due to the model outputting a score Jim also discusses the trade-off of where the prediction of such a model may be actionable – he calls it the┬áTyranny/Anarchy Trade-Off (a catchy name ­čÖé

Having done statistical work in criminal justice before, I think predictive analysis can be helpful in many areas of policing and criminal justice in general (e.g., parole supervision). On the other hand, I find profiling and supporting a “reasonable suspicion” from statistical models unconvincing. I think the courts will have to figure out a minimum reliability standard for such predictors, and hopefully they’ll set the threshold far higher than what the ‘felony classifier’ is producing. There’s just too many ways using a statistical model for “reasonable suspicion” to go wrong. Even if variables of protected classes (gender, ethnicity, etc.) are not used directly, there may be correlated variables (hair-color, income, geographic area) as discussed in the talk Jim gave. Even more problematic in my mind would be variables that do not or hardly ever change, as they would lead to the same people being hassled over and over again. Also the training data from which these models are built is biased since everybody in it by definition has been arrested before. It’s beyond me how one can correct for this sample bias in a reliable way. Frankly, I don’t think policing by profiling (statistical or otherwise) can be done well, and hopefully courts will recognize that eventually.