Archive for the Artificial Intelligence (AI) Category
Artificial Addition (Overcoming BIAS)
November 23, 2007 2:44 pm by Markus.
I found the following article interesting: http://www.overcomingbias.com/2007/11/artificial-addi.html
Posted in Artificial Intelligence (AI)
What Machine Learning Papers to read …
July 13, 2007 1:08 pm by Markus.
Laura just pointed me to this system, best described as:
I have a routine problem that sometimes paper titles are not enough to tell me what papers to read in recent conferences, and I often do not have time to read abstracts fully. This collection of scripts is designed to help alleviate the problem. Essentially, what it will do is compare what papers you like to cite with what new papers are citing. High overlap means the paper is probably relevant to you. Sure there are counter-examples, but overall I have found it useful (eg., it has suggested papers to me that are interesting that I would otherwise have missed). Of course, you should also read through titles since that is a somewhat orthogonal source of information.
http://www.cs.utah.edu/~hal/WhatToSee/
I have the same problem. And wow… I will have a lot to read this weekend.
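The overlap idea from the description can be sketched in a few lines of Python. This is only a minimal illustration and not the actual scripts behind the link: the function names, the toy citation keys, and the choice of score (the fraction of a new paper's references that I also cite) are all my own assumptions.

```python
def overlap_score(my_cited, paper_refs):
    """Fraction of a paper's references that also appear in my own citations.

    Assumption: this particular score is chosen for illustration only; the
    linked scripts may weight or normalize the overlap differently.
    """
    my_cited = set(my_cited)
    paper_refs = set(paper_refs)
    if not paper_refs:
        return 0.0
    return len(my_cited & paper_refs) / len(paper_refs)


def rank_papers(my_cited, new_papers):
    """Sort new papers by decreasing overlap, breaking ties by title."""
    scored = [(overlap_score(my_cited, refs), title)
              for title, refs in new_papers.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    # Hypothetical citation keys, for illustration only.
    mine = {"a", "b", "c"}
    candidates = {"P1": ["a", "b", "x", "y"], "P2": ["x", "y"], "P3": ["a", "b", "c"]}
    for score, title in rank_papers(mine, candidates):
        print(f"{title}: {score:.2f}")
```

Papers at the top of the ranking cite much of what I cite, which is the "high overlap means probably relevant" heuristic from the quote.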
Posted in Statistics, Machine Learning, Artificial Intelligence (AI)
Interesting Experimental Captchas
June 11, 2007 3:11 pm by Markus.
Captchas are these little word-puzzles in images that web-sites use to keep spammers and bots out. They are everywhere and even the New York Times had an article about Captchas recently. It turns out it’s a nice exercise in applying some machine learning to break these things (with lots of image manipulation to clean up the images). Since spam-bots are becoming smarter, people are switching to new kinds of Captchas. My favorites (using images) so far are Kittenauth and a 3D-rendered word-captcha.
Posted in spam, Machine Learning, Artificial Intelligence (AI), Security
Back from AISTATS 2007
March 27, 2007 12:38 am by Markus.
Just got back home from AISTATS (Artificial Intelligence and Statistics). The conference was really interesting (more so than NIPS) and it’s unfortunate that it is only every two years. Some of the invited talks were way over my head, but I learned a lot from other people’s work and got new ideas …
Some of the coolest papers were (incomplete list, in no particular order; I still need to organize my notes, but there were way more papers of interest to me than at NIPS):
- Nonlinear Dimensionality Reduction as Information Retrieval
- Jarkko Venna and Samuel Kaski
- Fast Low-Rank Semidefinite Programming for Embedding and Clustering
- Brian Kulis, Arun Surendran, and John C. Platt
- Local and global sparse Gaussian process approximations
- Edward Snelson, Zoubin Ghahramani
- A fast algorithm for learning large scale preference relations
- Vikas Raykar, Ramani Duraiswami, and Balaji Krishnapuram
- Deep Belief Networks
- Ruslan Salakhutdinov and Geoff Hinton
- Large-Margin Classification in Banach Spaces
- Ricky Der and Daniel Lee
One thing I couldn’t help but notice was how much research is now focusing on semidefinite programs (SDPs), either for dimensionality reduction or for other purposes. Yet there are not many efficient ways to solve SDPs. One paper presented a method based on quasi-Newton gradient descent, but it’s probably not good enough yet for large problems.
Another interesting paper was about unsupervised deep belief nets that learn a structure of the data, which results in an interesting performance boost. The authors train a deep belief net (unsupervised) on the data and then train classifiers on its output; although the results were only compared against linear techniques, they were impressive. This reminded me of a similar idea I had a while ago that I never got to work: I tried to use label propagation methods to approximate a kernel matrix usable for SVMs and the like. It never worked, because my algorithm caused the SVMs to always overfit (despite being unsupervised - it took me a while to realize that doing something unsupervised is no guarantee that you won’t overfit your data). I’ll investigate some day what made all the difference in this case…
Another interesting bit was that approximating the matrix inverse by low-rank approximations leads to a significant loss of accuracy in the error bars of Gaussian processes. This should be interesting for further research on speedups for these and other algorithms that require a matrix inversion (e.g. semi-supervised label propagation algorithms).
Posted in Statistics, Machine Learning, Artificial Intelligence (AI)
Artificial Intelligence Cited for Unlicensed Practice of Law
March 8, 2007 5:52 pm by Markus.
I just read an article in the Wired blog titled “AI Cited for Unlicensed Practice of Law”, citing a ruling from a court upholding its decision that the owner, through the expert system he developed, had given unlicensed legal advice. While an expert system is a clear-cut case (the system always does exactly what it was told [minus errors in the rules]; it just follows given rules and draws logical conclusions), this becomes more interesting in cases in which the machine learns or otherwise modifies its behavior over time. For example, let’s say I put AI software online that interacts with people and learns over time. Should I be held responsible if the program does something bad? What if I was not the person who taught it that particular behavior? This will probably be a topic that the courts will have to figure out in the future. For one, people should not be able to hide behind actions their computer has taken. But what if it is reasonably beyond the capability of the individual to foresee what the AI will do?
This will probably end up being the next big challenge for courts just like the internet has been. It is interesting how the internet has created legal problems just with people being able to communicate more easily with each other: think trademark issues, advertising restrictions for tobacco or copyright violations (fair use differs from country to country; what is legal in one might be illegal in another) …
Update: And it just started. Check out this article: Colorado Woman Sues To Hold Web Crawlers To Contracts
Posted in Society, Artificial Intelligence (AI), Ramblings
EM-Clustering when double isn’t enough…
February 13, 2007 6:18 pm by Markus.
One of the more interesting developments in clustering (in my opinion) is the clustering of data that lies on a unit hypersphere. It sounds like some rare special case at first, but it appears quite frequently in real-life applications such as spectral embeddings in spectral clustering, some subdomains of bioinformatics data, or text data in TF-IDF representation. Such data can be analyzed as unit vectors on a d-dimensional hypersphere; equivalently, it is directional in nature. Spectral clustering techniques generate embeddings in the normalization step that constitute an example of directional data and can result in different shapes on a hypersphere.
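To make "directional" concrete, here is a minimal Python sketch (the helper names and toy vectors are my own, purely illustrative): once vectors are scaled to unit length only their direction is left, and the dot product of two such vectors is their cosine similarity.

```python
import math


def to_unit_vector(v):
    """Scale v to Euclidean length 1, i.e. project it onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("the zero vector has no direction")
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Dot product of the two normalized vectors."""
    u_hat, v_hat = to_unit_vector(u), to_unit_vector(v)
    return sum(a * b for a, b in zip(u_hat, v_hat))


if __name__ == "__main__":
    # Toy term-weight vectors: same direction, different lengths.
    doc_a = [2.0, 0.0, 1.0]
    doc_b = [4.0, 0.0, 2.0]
    print(to_unit_vector(doc_a))
    print(cosine_similarity(doc_a, doc_b))
```

Two documents with the same term proportions but different lengths end up at the same point on the sphere, which is why length-normalized text data is a natural fit for directional distributions.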
The first published paper that suggested a good clustering algorithm presented an Expectation Maximization (EM) algorithm for the von Mises-Fisher distribution (Banerjee et al., JMLR 6, 2005). Avleen and I started to work on an extension of this that utilizes the Watson distribution, a distribution for directional data that has more modeling capability than the simple von Mises-Fisher distribution. We just published our results for the Watson EM clustering algorithm in the AISTATS 2007 conference, to be held in March (Matlab code will be available soon).
One problem with both algorithms is that they require a high-precision number representation in order to work well for high-dimensional problems such as bioinformatics data and text. Most prior work with directional data was limited to maybe 3-dimensional cases, and most Kummer-function approximations (another problem we had to address) work well only for the lower-dimensional cases. In our AISTATS paper we only presented results for lower-dimensional embeddings, as we had some problems getting it to work for higher-dimensional data (also, the root solver that was involved is just incapable of handling larger problems). We have been working on a speedup with some success, but I have to say that it was mostly the numerical problems that gave us a hard time.
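To illustrate the kind of numerical trouble meant here, the sketch below evaluates the logarithm of the Kummer function M(a, b, x) by summing its power series entirely in the log domain. This is not the approximation from our paper, just a plain series under the assumption 0 < a <= b and x >= 0; the function name, the cutoff of 50 and the term limit are made up for the example.

```python
import math


def log_kummer(a, b, x, log_cutoff=50.0, max_terms=1_000_000):
    """log M(a, b, x) for 0 < a <= b and x >= 0, summed in the log domain.

    Series: M(a, b, x) = sum_n (a)_n / (b)_n * x**n / n!, all terms positive.
    Only the logarithms of the terms are stored, so nothing overflows even
    when M itself is far outside the range of a double.
    """
    if not (0 < a <= b) or x < 0:
        raise ValueError("this sketch assumes 0 < a <= b and x >= 0")
    if x == 0:
        return 0.0
    log_x = math.log(x)
    log_term = 0.0          # log of the n = 0 term, which is 1
    log_terms = [log_term]
    max_log = log_term
    for n in range(1, max_terms):
        # term_n = term_{n-1} * (a + n - 1) / (b + n - 1) * x / n
        log_term += math.log(a + n - 1) - math.log(b + n - 1) + log_x - math.log(n)
        log_terms.append(log_term)
        max_log = max(max_log, log_term)
        # For n > x (and a <= b) the terms only shrink, so stop once they
        # are negligible next to the largest term seen so far.
        if n > x and log_term < max_log - log_cutoff:
            break
    # log-sum-exp: factor out the largest term before exponentiating
    return max_log + math.log(math.fsum(math.exp(t - max_log) for t in log_terms))


if __name__ == "__main__":
    # Sanity check: M(a, a, x) = exp(x), so the log should be about 1000.
    print(log_kummer(0.5, 0.5, 1000.0))
    # Hypothetical high-dimensional case with a = 1/2 and b = d/2, d = 100.
    print(log_kummer(0.5, 50.0, 1000.0))
```

The first call is a case where the value itself, e^1000, cannot be stored in a double at all, while its logarithm is perfectly harmless. Working with logarithms fixes the overflow, but not the loss of precision that comes from subtracting large, nearly equal quantities later in the algorithm.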
More and more machine learning techniques require a more careful consideration of numerical problems (Support Vector Machines, my manifold clustering algorithm etc.) and I run into numerical problems every other day. While trying to improve our Watson-EM algorithm I found out that continued fractions have many desirable properties, such as the unique, finite representation of rational numbers. Rational numbers can be represented exactly, with no numerical error. In the EM algorithm we use them to approximate the Kummer function. Maybe a more exact number representation for fractions can be made out of this?
So I started looking and found in a tech report that the number representation of continued fractions can be nicely implemented in Prolog. It also explains how to add up numbers in a continued fraction representation, and so on, with arbitrary precision.
I haven’t found any papers yet that suggest a suitable hardware implementation of Continued Fractions to replace the IEEE floating point numbers we use nowadays, but it can probably be done.
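As a small illustration of the "finite and exact" property for rational numbers, here is a Python sketch (not the Prolog implementation from the tech report; the function names are my own) that expands a fraction into its continued fraction terms and rebuilds it without any rounding:

```python
from fractions import Fraction


def to_continued_fraction(q):
    """Return the terms [a0, a1, ...] with q = a0 + 1/(a1 + 1/(a2 + ...)).

    For a rational q the loop always terminates, and the last term is at
    least 2 unless q is an integer, which is the usual unique form.
    """
    q = Fraction(q)
    terms = []
    while True:
        whole = q.numerator // q.denominator  # floor of q
        terms.append(whole)
        q -= whole
        if q == 0:
            return terms
        q = 1 / q


def from_continued_fraction(terms):
    """Rebuild the exact rational number from its continued fraction terms."""
    value = Fraction(terms[-1])
    for term in reversed(terms[:-1]):
        value = term + 1 / value
    return value


if __name__ == "__main__":
    q = Fraction(415, 93)
    terms = to_continued_fraction(q)
    print(terms)                           # [4, 2, 6, 7]
    print(from_continued_fraction(terms))  # 415/93
```

The expansion is just the Euclidean algorithm in disguise, which is why it is finite for every rational number. Arithmetic directly on the term lists (adding two continued fractions, for example) is the harder part, and it is not attempted here.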
Posted in Math, Machine Learning, Artificial Intelligence (AI)
Artificial Intelligence and Sports
February 8, 2007 3:59 pm by Markus.
A couple of days ago Indianapolis won the Super Bowl - just as predicted by an Electronic Arts simulation. The simulation software had been fed the latest data about all the players involved, and then the game AI fought it out. In the past some of the simulated outcomes were not that close to the final scores, but they still did a fairly decent job in 2005 and 2007.
There is more and more statistical decision making in basketball as well, the most famous example being the Miami-Orlando series in the 1997 playoffs.
Interesting…
Posted in Statistics, Data Mining, Machine Learning, Artificial Intelligence (AI)
Data mining used to find new materials
August 27, 2006 6:56 pm by Markus.
I just read a Eureka Alert (see also ZDNet’s blog) mentioning that a couple of researchers at MIT found new, potentially useful crystal structures with AI and data mining techniques. You can find the abstract of their paper here. I’ve seen randomness and genetic algorithms around a lot lately (such as the self-reconfigurable modular robot) and a robot that can do bioinformatics experiments (DNA sequencing) all by itself (link?). I think that this is a very useful application of AI. However, it is only an application of existing scientific knowledge. It’s fast testing based on the current physical models and insights. It automates science to an extent, but does not come up with new insights. It’s more data without more people to add an interpretation. For example, it took a few years before somebody found an application for Teflon.
I haven’t seen this around (will search again), but what would be really interesting is an algorithm that can form a new hypothesis (e.g. a differential equation) based on outcomes from physics experiments: an algorithm that explains the data and forms a theory. It’s probably harder to build than regression algorithms…
Posted in Data Mining, Machine Learning, Artificial Intelligence (AI), Ramblings