One conclusion from various papers is that logistic regression generally delivers well-calibrated probabilities. So it was interesting to see that the opposite can happen when the base rate is small.
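
For anyone who wants to check calibration on their own scores, here is a minimal sketch of a binned comparison between predicted probability and observed positive rate. The function name `calibration_table`, the bin count, and the synthetic 2% base-rate data are my own assumptions for illustration, not anything taken from the post.

```python
import random

def calibration_table(probs, labels, n_bins=10):
    """Group predictions into equal-width bins and compare the mean
    predicted probability with the observed positive rate in each bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        table.append((mean_pred, observed, len(members)))
    return table

# Synthetic data (an assumption, not from the post): a rare event with
# roughly a 2% base rate, scored by a perfectly calibrated "model".
random.seed(0)
true_p = [random.betavariate(1, 49) for _ in range(20000)]
labels = [1 if random.random() < p else 0 for p in true_p]

for mean_pred, observed, n in calibration_table(true_p, labels):
    print(f"predicted={mean_pred:.3f} observed={observed:.3f} n={n}")
```

With a small base rate almost all cases land in the lowest bin, so equal-width bins say little about the upper range; quantile bins might be a better choice there.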

Side Note A: I did not understand the risk-score density plot. What is on the x-axis?

Side Note B: http://home.comcast.net/~tom.fawcett/public_html/papers/ROC101.pdf is an excellent study of ROC and AUC in general.

Thanks again for this post. Blogs with detailed, technical data mining content are hard to find.

Steffen

]]>