UCSC is holding a Starcraft AI competition. I wish I had the time to participate… Starcraft is one of my all-time favorite games, and writing a better AI for a real-time strategy game is certainly interesting and challenging.
Archive for the ‘Artificial Intelligence (AI)’ Category
Found these on aigamedev, and some of them are really hilarious: AI game bugs caught on tape
Two interesting articles in Science: The Automation of Science is about a robotic system that autonomously generated functional genomic hypotheses about a yeast. The second article, Distilling Free-Form Natural Laws from Experimental Data, is about a system learning from physics experiments and deriving a hypothesis from the data (this is along the lines of the general idea I’ve written about in the past). Cool stuff.
However, a few possibilities suggested in the article might be worth exploring. For example, they suggest Bayesian heuristics to identify threats, and machine learning techniques more generally look like a promising direction. IBM AntiVirus (maybe not the current version anymore) has been using neural networks on 4-byte sequences (n-grams) for boot-sector virus detection.
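To make the n-gram idea concrete, here is a minimal sketch of turning a binary into counts of overlapping 4-byte sequences, which could then be fed to a classifier. This is only an illustration of the feature type, not IBM's actual pipeline; the function name `byte_ngrams` and the sample bytes are made up.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every overlapping n-byte sequence in `data`.

    Hypothetical helper: inputs shorter than n simply yield no features.
    """
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


# Made-up bytes for illustration only, not a real boot sector.
sample = bytes.fromhex("eb3c904d53444f53eb3c90")

features = byte_ngrams(sample, n=4)
for gram, count in features.most_common(3):
    print(gram.hex(), count)
```

In practice the vocabulary of n-grams explodes quickly, so some feature selection would be needed before training anything on these counts.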
A couple of things to keep in mind, though:
- Quality of the classifier (detection rate) should be measured with area under the ROC curve (AUC), not error rate as most people tend to do in spam-filter comparisons. The base rate of the “non-virus” class is very high: I have over 10,000 executables/libraries on my Windows machine, all (most?) of them non-malicious, so a classifier that never flags anything would still show a tiny error rate.
- The tricky part of a learning-based approach is the feature extraction. While sequences of bytes or strings extracted from a binary might be a good start, advanced features like call graphs or imported API calls should be used as well. This is pretty tricky and time-consuming, especially when it has to be done for different types of executables (Windows scripts, x86 EXE files, .NET files etc.). De-obfuscation techniques, just like in the signature-based scanners, will probably be necessary before the features can be extracted.
- Behavior blocking and sandboxes are probably easier, a better short-term fix, and more pro-active. This has been my experience with email-based attacks as well, back in the Mydoom days when a special MIME type auto-executed an attachment in Outlook. Interestingly, there are only two programs out there that sanitize emails (check MIME types, headers, rename executable attachments etc.) at the gateway level, which is a much better pro-active approach than simply detecting known threats. The first is MIMEDefang, a sendmail plugin. The other is impsec, based on procmail. CU Boulder was using impsec to help keep students’ machines clean (there were scalability issues with the procmail solution, though).
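The first point in the list above, AUC versus error rate under a skewed base rate, can be illustrated with a small sketch. The data is entirely hypothetical (990 clean files, 10 malicious), and `auc` and `error_rate` are illustrative helpers rather than any library's API. AUC is computed here through its rank interpretation: the probability that a randomly chosen malicious file gets a higher score than a randomly chosen clean one, with ties counting half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def error_rate(labels, scores, threshold=0.5):
    """Fraction of files misclassified when flagging scores >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p != y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical test set: 990 clean files (label 0), 10 malicious (label 1).
labels = [0] * 990 + [1] * 10

# Classifier A never flags anything.
always_clean = [0.0] * 1000

# Classifier B catches every malicious file but raises 20 false alarms.
detector = [0.95] * 20 + [0.1] * 970 + [0.9] * 10

for name, scores in [("always clean", always_clean), ("detector", detector)]:
    print(name, error_rate(labels, scores), round(auc(labels, scores), 3))
```

The do-nothing classifier wins on error rate (1% versus 2%) while its AUC sits at chance level, which is why error rate alone is misleading here.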
“All models are wrong, and increasingly you can succeed without them” — George Box
“Sometimes…” — Me
In a Wired article about the petabyte age of data processing, the author claimed that, given the enormous amounts of data and the patterns found by data mining, we are less and less dependent on scientific theory. This has been strongly disputed (see Why the cloud cannot obscure the Scientific Method), as the author simply ignores the fact that the patterns that are found are not necessarily exploitable: finding a group of genes that interact is a first step, but won’t cure cancer. However, in machine translation or placing advertising online one can succeed with little to no domain knowledge. That is, once somebody comes up with the right features to use (see Choosing the right features for Data Mining).
What would be interesting to develop, however, is a “meta-learning” algorithm that can abstract from simpler models and learn e.g. a differential equation. For example, let’s take data from several hundred physics experiments about heat distribution conducted on different surfaces etc. We can probably learn a regression model for one particular experiment which could predict how the heat will distribute given the parameters of the experiment (material, surface etc.). The meta-learning algorithm would then look at these models and somehow come up with the heat equation. That would be something…
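As a very rough sketch of the easy half of this idea: if we already suspect the one-dimensional form u_t = alpha * u_xx, then a plain least-squares fit of the time derivative against the second spatial derivative recovers alpha from each experiment. Everything here is an assumption for illustration: the "measurements" are sampled from a known exact solution, the names `simulate` and `fit_diffusivity` are made up, and the genuinely hard part (discovering the form of the equation rather than one coefficient) is not attempted.

```python
import math


def simulate(alpha, k, nx=60, nt=40, dx=0.05, dt=0.001):
    """Stand-in for measurements: samples of the exact solution
    u(x, t) = exp(-alpha * k^2 * t) * sin(k * x) on a grid u[time][space]."""
    return [[math.exp(-alpha * k * k * j * dt) * math.sin(k * i * dx)
             for i in range(nx)]
            for j in range(nt)]


def fit_diffusivity(u, dx=0.05, dt=0.001):
    """Least-squares slope of u_t against u_xx (no intercept),
    with both derivatives estimated by central differences."""
    num = den = 0.0
    for j in range(1, len(u) - 1):
        for i in range(1, len(u[j]) - 1):
            u_t = (u[j + 1][i] - u[j - 1][i]) / (2 * dt)
            u_xx = (u[j][i + 1] - 2 * u[j][i] + u[j][i - 1]) / dx ** 2
            num += u_t * u_xx
            den += u_xx ** 2
    return num / den


# Hypothetical experiments: (true diffusivity of the material, spatial frequency).
experiments = [(0.7, 1.0), (0.7, 2.0), (1.3, 3.0)]
for alpha, k in experiments:
    estimate = fit_diffusivity(simulate(alpha, k))
    print(f"true alpha={alpha}, k={k}: fitted alpha={estimate:.3f}")
```

The same linear relation holds across all experiments, with only the coefficient changing per material; noticing that regularity across fitted models is the kind of abstraction the meta-learner would have to make on its own.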
I found the following article interesting: http://www.overcomingbias.com/2007/11/artificial-addi.html
Laura just pointed me to this system, best described as:
I have a routine problem that sometimes paper titles are not enough to tell me what papers to read in recent conferences, and I often do not have time to read abstracts fully. This collection of scripts is designed to help alleviate the problem. Essentially, what it will do is compare what papers you like to cite with what new papers are citing. High overlap means the paper is probably relevant to you. Sure there are counter-examples, but overall I have found it useful (eg., it has suggested papers to me that are interesting that I would otherwise have missed). Of course, you should also read through titles since that is a somewhat orthogonal source of information.
I have the same problem. And wow… I will have a lot to read this weekend.
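The core idea is simple enough to sketch in a few lines. This is not the actual scripts' scoring (I have not checked how they weight things), just a guess at the basic overlap measure: the fraction of a new paper's references that I also cite. The citation keys and paper titles are invented.

```python
def citation_overlap(my_citations, paper_citations):
    """Fraction of a paper's references that also appear in my own citations."""
    refs = set(paper_citations)
    if not refs:
        return 0.0
    return len(set(my_citations) & refs) / len(refs)


def rank_papers(my_citations, new_papers):
    """Return (score, title) pairs, highest overlap first."""
    scored = [(citation_overlap(my_citations, refs), title)
              for title, refs in new_papers.items()]
    return sorted(scored, reverse=True)


# Invented citation keys standing in for a real bibliography.
mine = {"vapnik95", "scholkopf02", "zhu03", "platt99"}
new_papers = {
    "Paper A": ["zhu03", "scholkopf02", "smith07"],
    "Paper B": ["jones05", "lee06"],
    "Paper C": ["vapnik95", "doe01", "roe02", "poe03"],
}

for score, title in rank_papers(mine, new_papers):
    print(f"{title}: {score:.2f}")
```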
Captchas are those little word puzzles in images that websites use to keep spammers and bots out. They are everywhere, and even the New York Times had an article about Captchas recently. It turns out it’s a nice exercise in applying some machine learning to break these things (with lots of image manipulation to clean up the images). Since spam-bots are becoming smarter, people are switching to new kinds of Captchas. My favorites (using images) so far are Kittenauth and a 3D-rendered word-captcha.
Just got back home from AISTATS (Artificial Intelligence and Statistics). The conference was really interesting (more so than NIPS) and it’s unfortunate that it is only held every two years. Some of the invited talks were way over my head, but I learned a lot from other people’s work and got new ideas …
Some of the coolest papers were (an incomplete list, in no particular order; I need to organize my notes, but there were way more papers of interest to me than at NIPS):
- Nonlinear Dimensionality Reduction as Information Retrieval
  - Jarkko Venna and Samuel Kaski
- Fast Low-Rank Semidefinite Programming for Embedding and Clustering
  - Brian Kulis, Arun Surendran, and John C. Platt
- Local and global sparse Gaussian process approximations
  - Edward Snelson and Zoubin Ghahramani
- A fast algorithm for learning large scale preference relations
  - Vikas Raykar, Ramani Duraiswami, and Balaji Krishnapuram
- Deep Belief Networks
  - Ruslan Salakhutdinov and Geoff Hinton
- Large-Margin Classification in Banach Spaces
  - Ricky Der and Daniel Lee
One thing I couldn’t help but notice was how much research is now focusing on semidefinite programs (SDPs), either for dimensionality reduction or other purposes. Yet, there are not many efficient ways to solve SDPs. One paper presented a method based on quasi-Newton gradient descent, but it’s probably not good enough yet for large problems.
Another interesting paper was about unsupervised deep belief nets that learn the structure of the data, which results in an interesting performance boost. The authors train a deep belief net (unsupervised) on the data and then train classifiers on the output; although all the results were compared only to linear techniques, they showed some impressive results. This reminded me of a similar idea I had a while ago that I never got to work: I tried to use label propagation methods to approximate a kernel matrix usable for SVMs and the like. It never worked, because my algorithm caused the SVMs to always overfit (despite being unsupervised; it took me a while to realize that doing something unsupervised is no guarantee that you won’t overfit your data). I’ll investigate some day what made all the difference in this case…
Another interesting bit was that approximating the matrix inverse by low-rank approximations leads to a significant loss of accuracy in the error bars of Gaussian processes. This should be interesting for further research into speedups for these and other algorithms that require a matrix inversion (e.g. semi-supervised label propagation algorithms).
I just read an article on the Wired blog titled “AI Cited for Unlicensed Practice of Law” about a court upholding its ruling that the owner of an expert system had, through the system he developed, given unlicensed legal advice. An expert system is a clear-cut case, since it always does exactly what it was told (minus errors in the rules): it just follows the given rules and draws logical conclusions. This becomes more interesting in cases in which the machine learns or otherwise modifies its behavior over time. For example, let’s say I put AI software online that interacts with people and learns over time. Should I be held responsible if the program does something bad? What if I was not the person that taught it that particular behavior? This will probably be a topic that the courts will have to figure out in the future. For one, people should not be able to hide behind actions their computer has taken. But what if it is reasonably beyond the capability of the individual to foresee what the AI will do?
This will probably end up being the next big challenge for courts just like the internet has been. It is interesting how the internet has created legal problems just with people being able to communicate more easily with each other: think trademark issues, advertising restrictions for tobacco or copyright violations (fair use differs from country to country; what is legal in one might be illegal in another) …
Update: And it just started. Check out this article: Colorado Woman Sues To Hold Web Crawlers To Contracts