You are currently browsing the Markus Breitenbach weblog archives for the day July 12, 2008 4:41 pm.
The cloud obscuring the scientific method
July 12, 2008 4:41 pm by Markus.
“All models are wrong, and increasingly you can succeed without them” - Peter Norvig, as quoted in Wired, updating George Box’s “All models are wrong, but some are useful”
“Sometimes…” - Me
In a Wired article about the petabyte age of data processing, the author claimed that, given the enormous amounts of data and the patterns found by data mining, we are less and less dependent on scientific theory. This has been strongly disputed (see Why the cloud cannot obscure the Scientific Method), because the author ignores the fact that not all of the patterns found are exploitable: finding a group of genes that interact is a first step, but it won’t cure cancer. However, in machine translation or online ad placement one can succeed with little to no domain knowledge, at least once somebody has come up with the right features to use (see Choosing the right features for Data Mining).
What would be interesting to develop, however, is a “meta-learning” algorithm that can abstract from simpler models and learn, e.g., a differential equation. For example, let’s take data from several hundred physics experiments on heat distribution, conducted on different surfaces and so on. We can probably learn a regression model for any one experiment that predicts how the heat will distribute given the parameters of that experiment (material, surface, etc.). The meta-learning algorithm would then look at these models and somehow come up with the heat equation. That would be something…
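To make the idea a bit more concrete, here is a toy sketch in Python. It is not the meta-learning algorithm wished for above, just an illustration under heavy assumptions. The "experiments" are simulated rather than measured, the material names and diffusivities are made up, and the candidate terms (u, u_x, u_xx) are picked by hand. Each experiment gets its own linear regression of the time derivative on those terms, and a crude "meta" step then checks which terms matter in every experiment.

```python
import math

# Hypothetical materials with made-up diffusivities (not real measurements).
MATERIALS = {"material_A": 0.5, "material_B": 1.0, "material_C": 2.0}

def simulate(alpha, nx=41, nt=200, dt=1e-4):
    """Stand-in for one experiment: explicit finite differences on a 1D rod."""
    dx = 1.0 / (nx - 1)
    u = [math.sin(math.pi * i * dx) + 0.5 * math.sin(3 * math.pi * i * dx)
         for i in range(nx)]
    u[0] = u[-1] = 0.0  # both ends of the rod are held at temperature zero
    frames = [u]
    for _ in range(nt):
        prev = frames[-1]
        nxt = prev[:]
        for i in range(1, nx - 1):
            nxt[i] = prev[i] + alpha * dt * (prev[i - 1] - 2 * prev[i] + prev[i + 1]) / dx**2
        frames.append(nxt)
    return frames, dx, dt

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_experiment(frames, dx, dt):
    """Per-experiment regression: u_t ~ a*u + b*u_x + c*u_xx. Returns [a, b, c]."""
    rows, targets = [], []
    for n in range(len(frames) - 1):
        cur, nxt = frames[n], frames[n + 1]
        for i in range(1, len(cur) - 1):
            u_x = (cur[i + 1] - cur[i - 1]) / (2 * dx)
            u_xx = (cur[i - 1] - 2 * cur[i] + cur[i + 1]) / dx**2
            rows.append((cur[i], u_x, u_xx))
            targets.append((nxt[i] - cur[i]) / dt)  # estimated time derivative
    # Normal equations (X^T X) w = X^T y for the three candidate terms.
    A = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
    b = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(3)]
    return solve(A, b)

def shared_structure(fits, tol=1e-6):
    """Crude 'meta' step: keep the terms that are non-negligible in any fit."""
    names = ["u", "u_x", "u_xx"]
    return [names[j] for j in range(3) if any(abs(w[j]) > tol for w in fits.values())]

if __name__ == "__main__":
    fits = {name: fit_experiment(*simulate(alpha)) for name, alpha in MATERIALS.items()}
    for name, w in fits.items():
        print(name, [round(c, 4) for c in w])
    print("terms that survive across experiments:", shared_structure(fits))
```

Because the simulated data follow the heat equation exactly, only the u_xx term should survive, with a coefficient close to each material's diffusivity. The interesting part of the wish above is precisely what this sketch leaves out: real, noisy measurements and a learner that proposes the candidate terms on its own.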
Posted in Machine Learning, Artificial Intelligence (AI)