Recently I had a fun discussion with Bill over lunch about intellectual property and how it might apply to statistical modeling work. Given that more and more companies make a living from predictions produced by models they have built (churn prediction, credit scores and other risk models), we were wondering whether there are any means of protecting those models as intellectual property. For example, the ZETA model for predicting corporate bankruptcies is a closely guarded secret; only the variables it uses have been published (Altman E. I. (2000); Predicting financial distress for companies: revisiting the Z-Score and ZETA models). Obviously this model is useful for lending and can make serious money for the user. Making decisions guided by a formula is becoming more popular, so this might be something over which legal battles will be fought in the future.

Copyrighted works and patents often count towards what a company would be worth should somebody acquire it. This means there would be motivation for start-up companies to protect their models. A mathematical formula (e.g. a regression equation) cannot be patented, and copyright probably won’t apply either; even if copyright did apply, it’s trivial to build a formula that does essentially the same thing (e.g. multiply all the weights in the formula, and any cutoff applied to the score, by 10). This leaves only trade secret protection, which means there is no recourse once the cat is out of the bag. Often it’s also the data-collection method that is kept secret: a company called Epagogix developed a method to predict the success of movies from a script by scoring it against a set of scales that they keep secret.
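To make the point about rescaling weights concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the weights, the cutoff, the applicant data and the function names `score`, `decide` and `ranking` do not come from any real model. It assumes a plain linear scoring rule with a cutoff, and shows that multiplying every weight and the cutoff by 10 gives a formula that looks different on paper but makes exactly the same decisions and ranks the cases in the same order.

```python
# Hypothetical linear scoring model. The weights, cutoff and applicant
# data are invented purely for illustration.

def score(weights, features):
    """Weighted sum of the features (a plain linear scoring rule)."""
    return sum(w * x for w, x in zip(weights, features))

def decide(weights, cutoff, features):
    """Approve when the score reaches the cutoff."""
    return score(weights, features) >= cutoff

original_weights = [2, -3, 1]
original_cutoff = 4

# The "new" model: every weight and the cutoff multiplied by 10.
scaled_weights = [10 * w for w in original_weights]
scaled_cutoff = 10 * original_cutoff

applicants = [
    [5, 1, 2],
    [1, 2, 0],
    [3, 0, 1],
    [2, 1, 3],
]

original_decisions = [decide(original_weights, original_cutoff, a) for a in applicants]
scaled_decisions = [decide(scaled_weights, scaled_cutoff, a) for a in applicants]

def ranking(weights):
    """Applicant indices ordered from highest to lowest score."""
    return sorted(range(len(applicants)),
                  key=lambda i: score(weights, applicants[i]),
                  reverse=True)

# Both comparisons hold: the rescaled formula is a different expression
# of the same scoring rule.
print(original_decisions == scaled_decisions)
print(ranking(original_weights) == ranking(scaled_weights))
```

Any positive scaling factor would do, which is why protecting one particular written form of the formula would offer little practical protection.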

Currently, I don’t see any legal protection for this other than trade secrets. And given that there are infinitely many ways to express the same scoring rule, it would be fairly hard for lawyers and politicians to formulate sensible rules for protecting this kind of intellectual property.