Comments for Markus Breitenbach

Comment on The GPL and Machine Learning Software – Should the GPL cover training data? by Fred Mailhot

Fred Mailhot — Tue, 31 Jul 2012 19:02:15 +0000

This is an interesting commentary. One thing that’s not clear to me is how licensing affects the subsequent use of data. For example, if I were to use a GPL’ed dataset to train a classifier, is that classifier then considered a derivative work, and would I have to open source it if I were to subsequently distribute it?

Comment on Deploying SAS code in production by Mark

Mark — Thu, 17 Feb 2011 07:05:13 +0000

I am trying to find out whether you have done any work on support vector machines in SAS. I am doing research on classification using the SAS eguide package. Do you have any example code for support vector machines in SAS?

Comment on Energy efficient data mining algorithms by Rosemary Hornbrook

Rosemary Hornbrook — Thu, 21 Oct 2010 06:26:21 +0000

Very interesting blog – rare events modeling, data mining, reinforcement learning – all applicable to complex resource management. Can I keep looking?

Comment on GraphLab & Parallel Machine Learning by datakid1

datakid1 — Fri, 30 Jul 2010 16:50:12 +0000

Good one. You might be interested in this collection of tutorials and videos on data mining.
Tutorials: http://www.dataminingtools.net/browsetutorials.php
Videos: http://www.dataminingtools.net/browse.php

Comment on Validating patterns found by Data Mining techniques by Dan

Dan — Wed, 03 Mar 2010 17:24:23 +0000

http://arstechnica.com/science/news/2010/03/were-so-good-at-medical-studies-that-most-of-them-are-wrong.ars

Comment on Alternative measures to the AUC for rare-event prognostic models by Markus

Markus — Fri, 26 Feb 2010 17:40:09 +0000

The x-axis is just the score value. The plot is supposed to show that the two classes overlap and cannot be perfectly separated by the classifier.
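
To make the overlap point concrete, here is a minimal sketch (not the code behind the post's plot). It assumes two hypothetical Gaussian score distributions, with the rare positive class centered one standard deviation above the negatives, and computes the AUC as the fraction of positive/negative pairs that are ranked correctly. The function name `auc` and all numbers are made up for illustration.

```python
import random

def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.
    Ties count as half a win. This is the rank-based definition of the AUC."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed, illustrative score distributions: many negatives, few positives,
# with the positives shifted up by one standard deviation.
neg = [rng.gauss(0.0, 1.0) for _ in range(2000)]
pos = [rng.gauss(1.0, 1.0) for _ in range(40)]

overlap_auc = auc(pos, neg)                    # overlapping classes: below 1.0
separated_auc = auc([2.0, 3.0], [0.0, 1.0])    # no overlap at all: exactly 1.0

print(f"AUC with overlapping scores: {overlap_auc:.3f}")
print(f"AUC with separated scores:   {separated_auc:.3f}")
```

Whenever the two score densities overlap, some negatives outrank some positives, so the AUC stays strictly below 1 no matter where the threshold is placed.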

Comment on Alternative measures to the AUC for rare-event prognostic models by steffen

steffen — Fri, 26 Feb 2010 10:47:44 +0000

Thanks for sharing this case study (including your thoughts). I have also experimented with AUC and the calibration-refinement / sharpness measures.

A conclusion from various papers was that logistic regression in general delivers well-calibrated probabilities. Hence it was interesting to see that the opposite can happen when the base rate is small.

Side Note A: I did not understand the risk-score density plot. What is the x-axis?

Side Note B: http://home.comcast.net/~tom.fawcett/public_html/papers/ROC101.pdf is an excellent study of ROC and AUC in general.

Thanks again for this post. Blogs with technical and detailed data mining content are hard to find.

Steffen
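
One simple way to check the calibration question raised in this comment is to bin the predicted probabilities and compare each bin's mean prediction with the observed event rate. The sketch below is only an illustration of that idea: `calibration_table` is a hypothetical helper, and the toy probabilities and labels are made up, not taken from the case study.

```python
def calibration_table(probs, labels, n_bins=5):
    """Group predictions into equal-width probability bins and return, per
    non-empty bin, (count, mean predicted probability, observed event rate)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue  # skip bins with no predictions
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((len(members), mean_pred, observed))
    return rows

# Made-up toy data: the model predicts a low risk almost everywhere.
probs = [0.05, 0.05, 0.05, 0.05, 0.30, 0.30]
labels = [0, 0, 0, 1, 0, 0]

table = calibration_table(probs, labels, n_bins=5)
for count, mean_pred, observed in table:
    print(f"n={count}  mean predicted={mean_pred:.2f}  observed rate={observed:.2f}")
```

A well-calibrated model has the two columns close to each other in every bin. With a small base rate most predictions land in the lowest bins, so those bins are the ones worth inspecting.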

Comment on Strong profiling is not mathematically optimal for discovering rare malfeasors (on rare event detection) by Dan

Dan — Fri, 22 Jan 2010 21:48:21 +0000

The same paper is being discussed by the BBC: http://news.bbc.co.uk/2/hi/uk_news/magazine/8452260.stm

Comment on Automation of Science by Dave

Dave — Tue, 19 Jan 2010 21:38:43 +0000

Older, but still pretty interesting: “A new theorem in particle physics enabled by machine discovery”.

http://dx.doi.org/10.1016/0004-3702(95)00128-X

The note reports a novel finding in particle physics that was enabled by a machine discovery program.

Comment on Torpedo-Reviews in Machine Learning Conferences by Markus

Markus — Mon, 21 Sep 2009 04:09:18 +0000

This is hilarious:
http://th.informatik.uni-mannheim.de/People/Lucks/reject.pdf
Rejection letters for some of the most influential papers written in Computer Science.