Artificial Intelligence and Sports

February 8th, 2007

A couple of days ago Indianapolis won the Super Bowl – just as predicted by an Electronic Arts simulation. The simulation software had been fed the latest data about all the players involved, and then the game AI was left to fight it out. In the past some of the simulated outcomes were not that close to the final scores, but the simulations still did a fairly decent job in 2005 and 2007.

There is more and more statistical decision making in basketball as well, the most famous example being the Miami-Orlando series in the 1997 playoffs.

Interesting…

Computer Security and Psychology

February 2nd, 2007

Bruce Schneier gave a talk on how human psychology affects computer security. Very true, as security software is often too cumbersome to use. Email encryption is still not commonplace, while SSL as a form of end-to-end encryption is. It’s easy to use and people have been trained to look for that little golden padlock in the corner before entering their credit card number. Yet I feel that there are a couple of things that could be done to encourage people to pay more attention when it comes to computer security. In my opinion this isn’t happening because:

  1. Most people are good and assume that other people are good too. They hold the door open for the guy that left his badge in the car, they click on the “cool link”, they open email that looks like it might be from someone important.
  2. Most people see security problems as something that happens to someone else. Most breaches are never publicized, and some publicized breaches are so huge (millions of credit card numbers copied – yet nothing happens to them or anybody they know) that they reinforce the belief that problems are unlikely. We feel safe in a crowd.
  3. Most people believe they know what they are doing. Some other people are pretty learning-resistant when it comes to computers. I’ve heard some stories from companies in which the IT staff is supposed to do user training in addition to the external training the people received in the beginning (try to get accounting to explain to you over and over again how to file reimbursement claims). Maybe we really need a computer driver’s test, but then again drunk driving can kill people while drunk computing cannot.
  4. People get bored. Cry wolf too often, ask a person to be careful too many times in the face of a relatively low-probability event, and they become trained to click “Yes, I’m sure.” (This will be interesting with Windows Vista.) We are constantly bombarded with awareness programs, which makes IT-security awareness compete with many other awareness programs.
  5. There is no incentive. Most people (employees) don’t face consequences when their PC is infected or the company database gets stolen. People have the neighbor’s kid come over to remove all the spyware from the machine and so on. Avoidable security problems like spyware turn into a “car maintenance problem”.

I think on the incentive side there is a lot that can be done. In industry a lot of experience has been gained with safety incentive programs to reduce accidents. I found a study cited on a website which states that reinforcing safe acts “removes the unwanted side effects with discipline and the use of penalties; it increases the employees’ job satisfaction; it enhances the relationship between the supervisor and employees” (McAfee and Winn 1989). Properly designed incentives have the approval of the people to whom they are addressed, and are often preferred to other forms of safety motivation such as laws and policing. Probably some incentives could be created to educate users and teach them safer computer practices. For example, to make people think more carefully about following links in email (phishing!) one could send fake phishing emails; a user who clicks on a link lands on a page informing him that this could have been a trap and that he should always enter the URL directly into the browser address bar. It’s possible to track who clicked and who didn’t with specially crafted URLs in the emails. Similar things could be done with harmless executable attachments. I think this is a direction that should be pursued.
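One way the “specially crafted URLs” could look is sketched below. This is only an illustration under my own assumptions: the landing-page address, the secret key and the parameter names (c, u, s) are all made up, and in practice one would probably use an opaque ID rather than the email address. Each link carries the recipient and a short keyed signature, so the landing page can tell who clicked and can ignore links somebody edited by hand.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values for illustration only.
SECRET_KEY = b"replace-with-a-random-secret"
BASE_URL = "https://training.example.com/landing"


def _signature(user_id: str, campaign: str) -> str:
    """Keyed hash over campaign and user, shortened to keep the URL compact."""
    message = f"{campaign}:{user_id}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:16]


def make_tracking_url(user_id: str, campaign: str) -> str:
    """Build the per-recipient link that goes into the fake phishing email."""
    query = urlencode({"c": campaign, "u": user_id, "s": _signature(user_id, campaign)})
    return f"{BASE_URL}?{query}"


def identify_click(url: str):
    """Return (user_id, campaign) for a genuine link, or None if it was altered."""
    params = parse_qs(urlparse(url).query)
    try:
        campaign, user_id, sig = params["c"][0], params["u"][0], params["s"][0]
    except KeyError:
        return None
    expected = _signature(user_id, campaign)
    # Constant-time comparison of the two signatures.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    return user_id, campaign


if __name__ == "__main__":
    link = make_tracking_url("alice@example.com", "2007-02")
    print(link)
    print(identify_click(link))
```

The landing page would call something like identify_click on the requested URL, log the result, and then show the “this could have been a trap” message.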

 

Fixing a broken Linksys WRT54G

January 29th, 2007

A friend of mine recently fried one of his Linksys routers, a WRT54G (hardware version 2.0), while trying to upgrade the firmware. The box is old, no more warranty and all that. Since I played a bit with eWRT Linux on the Linksys a while ago, he thought I might have use for a broken router (maybe as a paperweight). Turns out the power light was blinking forever, but the router’s firmware didn’t come up. I recalled having seen some documents on the web on fixing a broken Linksys WRT54G firmware (search for “unbrick wrt54g”; that took me a while to find). Here’s what worked for me using Linux. First, download the matching firmware for your router from the Linksys website. Then I pressed the reset button, plugged in the power (holding the reset button down) and kept holding the reset button down for about 5-6 seconds, and then started the process below (i.e. I typed all of it beforehand and just hit enter for the put command). The router will be on IP 192.168.1.1 and will accept firmware updates via TFTP. That seems to be the case even without boot_wait being set to on.

ifconfig eth0 down
ifconfig eth0 up 192.168.1.100
# clear your local firewall rules if you have to!

tftp 192.168.1.1
tftp> mode binary
tftp> rexmt 1
tftp> trace
tftp> timeout 300
tftp> put code.bin

You might have to try several times to get the timing right. You can also check with tcpdump whether you get ARP replies or ping replies back from the router.
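For the curious, here is roughly what the tftp session above does on the wire, as a small Python sketch. It follows the basic TFTP protocol (write request, 512-byte data blocks, one ACK per block) and keeps resending each packet once per second, which is what rexmt 1 and timeout 300 ask the command-line client to do. The function names are my own, I have not tried this against a bricked router, and the regular tftp client remains the safer choice.

```python
import socket
import struct

OP_WRQ, OP_DATA, OP_ACK = 2, 3, 4   # TFTP opcodes
BLOCK_SIZE = 512                    # payload bytes per DATA packet


def build_wrq(filename: str, mode: str = "octet") -> bytes:
    """Write request: opcode, filename, NUL, transfer mode, NUL."""
    return (struct.pack("!H", OP_WRQ) + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def build_data(block_number: int, payload: bytes) -> bytes:
    """DATA packet: opcode, block number, up to 512 bytes of payload."""
    return struct.pack("!HH", OP_DATA, block_number) + payload


def parse_ack(packet: bytes):
    """Return the acknowledged block number, or None if this is not an ACK."""
    if len(packet) < 4:
        return None
    opcode, block = struct.unpack("!HH", packet[:4])
    return block if opcode == OP_ACK else None


def split_blocks(image: bytes) -> list:
    """Cut the image into blocks; a short (possibly empty) block ends the transfer."""
    blocks = [image[i:i + BLOCK_SIZE] for i in range(0, len(image), BLOCK_SIZE)]
    if not blocks or len(blocks[-1]) == BLOCK_SIZE:
        blocks.append(b"")
    return blocks


def tftp_put(host: str, filename: str, image: bytes,
             rexmt: float = 1.0, retries: int = 300) -> None:
    """Send image to host, resending every packet until it is acknowledged."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(rexmt)
        peer, expected, packet = (host, 69), 0, build_wrq(filename)
        for payload in [None] + split_blocks(image):
            if payload is not None:          # None stands for the initial request
                expected += 1
                packet = build_data(expected, payload)
            for _ in range(retries):
                sock.sendto(packet, peer)
                try:
                    reply, sender = sock.recvfrom(516)
                except socket.timeout:
                    continue                 # router not listening yet, try again
                if parse_ack(reply) == expected:
                    peer = sender            # later packets go to the replying port
                    break
            else:
                raise TimeoutError(f"no ACK for block {expected}")


if __name__ == "__main__":
    with open("code.bin", "rb") as firmware:
        tftp_put("192.168.1.1", "code.bin", firmware.read())
```

The retry loop is the interesting part: the router only listens for a short moment after power-up, so the request has to be repeated until it happens to arrive inside that window.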

Nasty McAfee bug

January 20th, 2007

Both Tim and Michelle (XP SP2) have the McAfee firewall and virus scanner installed on their machines. Today both machines came up with a little dialog box at boot asking to please connect the machine to the Internet right now to verify the subscription. Clicking cancel results in an “are you sure” question, and upon confirmation (i.e. “yes, verify the subscription some other time”) the software disables the firewall and the virus scanner (the little M icon in the tray turns black). I didn’t notice it at first. You have got to be kidding me! Just because the software can’t check for newer virus signatures it shouldn’t be disabling the virus scanner or the firewall. Especially not the firewall, as that probably doesn’t have to be kept up to date. You can re-enable both by clicking your way through the security center, but I wonder how many machines on the Internet right now are left without protection….

Sequential Sampling and Machine Learning

January 8th, 2007

To estimate an unknown quantity μ, a common approach is to design an experiment that results in a random variable Z distributed in the interval [0,1] with expectation E[Z]=μ. μ can then be estimated Monte-Carlo style by running the experiment independently many times and averaging the outcomes. In (Dagum, Karp, Luby and Ross, SIAM Computing, 1995) the AA algorithm (“Approximation Algorithm”) is introduced which, given ε and δ and independent experiments for the random variable Z, produces an estimate of the mean (the true expectation) that is within a factor of 1+ε of μ with probability at least 1-δ. Note that the algorithm makes no distributional assumptions. This has a couple of applications in machine learning, for example in Bayesian inference, Bayesian networks and boosting (Domingo and Watanabe, PAC-KDD, 2000).

The AA algorithm works in three steps. First, the stopping rule computes an initial estimate of the mean. Then the variance is estimated and, in the third step, additional samples are taken to refine the estimate of the expectation. A small improvement for the stopping rule in step one can be made as follows. The algorithm assumes a non-zero expectation and keeps sampling until the sum of the samples is larger than a constant determined by ε and δ (read the paper to see why that works). The problem is that the closer the samples are to zero, the more samples are needed.
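The plain stopping rule from step one (before any improvement) can be sketched in a few lines of Python. The constants are written down as I remember them from the paper, so please check them against the original before relying on this; the function and variable names are mine.

```python
import math
import random
from typing import Callable, Tuple


def stopping_rule_estimate(sample: Callable[[], float],
                           epsilon: float, delta: float) -> Tuple[float, int]:
    """Estimate the mean of a [0,1]-valued random variable.

    Draws samples until their running sum reaches a threshold that depends
    only on epsilon and delta, then returns (estimate, samples_used).
    Assumes a strictly positive mean; otherwise the loop never ends.
    """
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    # Threshold constants as recalled from Dagum et al. (an assumption here).
    upsilon = 4.0 * (math.e - 2.0) * math.log(2.0 / delta) / epsilon ** 2
    threshold = 1.0 + (1.0 + epsilon) * upsilon
    total, n = 0.0, 0
    while total < threshold:
        total += sample()
        n += 1
    # A small mean needs many samples to reach the fixed threshold.
    return threshold / n, n


if __name__ == "__main__":
    rng = random.Random(0)
    # Bernoulli experiment with success probability 0.3 (illustrative only).
    estimate, used = stopping_rule_estimate(
        lambda: 1.0 if rng.random() < 0.3 else 0.0, epsilon=0.1, delta=0.05)
    print(estimate, used)
```

Since the threshold is fixed, the number of samples grows roughly like 1/μ, which is exactly the problem described above for samples close to zero.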

Observe that the following holds for the mean:

With that one can improve the stopping rule as follows:

P.S.: To type Greek letters into WordPress use the HTML named entities such as &epsilon; for ε. That took me forever …

Online Dating

January 3rd, 2007

As I’m currently visiting Germany over the winter break, I couldn’t help but notice the advertising for an online dating website here. They spend a lot of money to get this stuff into people’s heads. I’ve seen some of that stuff advertised in the US (such as match.com and TRUE), so for the hell of it I went and checked out the website. First thing I noticed is that they require you to create an account to see people’s pictures or browse more than a couple of pages in the search results. That, of course, leads to many, many stale profiles from people who just want to window-shop and are not really interested in giving it a serious try. To interested parties (i.e. people who pay) this of course might look like there are so many members on the website that it might be worth paying for.

It just adds to the impression I got from reading about Bad Experiences with canceling accounts, which gives a not-so-honorable mention to certain US-based dating websites. Apparently you can’t just cancel your membership using the website, but have to take an exit interview by phone. Otherwise, your profile will be kept and your credit card will continue to be charged. It seems that dating websites are forced to keep people active as long as possible (or at least keep up the illusion). The reason for this might be less mean-spirited than one would at first assume. For example, just to have a couple of thousand people in each major city of the US, a dating website would have to have roughly 100,000 active members. That is tough to accomplish, especially given that without the illusion of activity nobody else would join.

With all that said, a friend of mine found his girlfriend through the Denver Personals on Craigslist. It can work.

Lucky me

December 23rd, 2006

This is another example of how bad things can turn into good things. Sometimes … I’m spending Christmas in Germany with my parents and I had to move my original flight date from the 22nd to the 18th. Moving a flight by only 4 days cost me $100 in “penalty” charged by the airline – at first I was not happy about this at all. But then again, a blizzard struck two days later, US36/I25 closed, DIA closed down and a few thousand people are stuck in Denver (luckily things are clearing up a bit now, so maybe everybody can make it home in time). Not waiting around for a day was worth $100. Lucky me.

Merry Christmas, Happy Holidays to you all!

Just got back from NIPS 2006

December 12th, 2006

Just got back in town from the NIPS conference. I’ve been to a couple of machine learning conferences before, but this was my first time at NIPS. A couple of papers were very interesting (you can download them at books.nips.cc):

  • Manifold Denoising
    Matthias Hein, Markus Maier
  • Fundamental Limitations of Spectral Clustering Methods
    Boaz Nadler, Meirav Galun
  • Learning with Hypergraphs: Clustering, Classification, and Embedding
    Dengyong Zhou, Jiayuan Huang, Bernhard Schoelkopf
  • Recursive Attribute Factoring
    David Cohn, Deepak Verma, Karl Pfleger

However, I found the single-track style of the conference boring at times. My interest in the latest results from fMRIs etc. is low right now, so at times there was nothing to do but mingle or just do nothing. At ICML there is always at least one conference track that is interesting to me. The poster sessions at NIPS were very interesting, though.

The workshops were more interesting than the conference. Only the room sizes were misallocated. Some workshops (the ones with the big rooms) were rather empty, and the ones I attended were overcrowded. And, of course, the traditional workshop summaries at the end of the workshops were funny. The ones that stuck out in my mind the most were Man vs. Bird from the Acoustic Processing Workshop and the novel applications for non-linear dimensionality reduction with their swiss-roll video. I got a few new ideas from the workshops that maybe will work out.

Also there were no T-shirts. At this year’s ICML plenty of free T-shirts were given out – unfortunately during the reception, which forced everybody to carry their T-shirts around during the entire reception (it looked very amusing, though) – yet at NIPS all we got was a mug… 🙂

Last, but not least, I’ve heard about the legendary NIPS parties from my friends and had some high expectations :-). Friday night I attended the GURU party from Garry’s Unbelievable Research Unit, Saturday was the legendary Gatsby party. Both parties were rather disappointing, so I actually went to check out the nightlife in Whistler instead. The bars and clubs I found were pretty quiet as well. Oh well… I’ve heard from people that Whistler had fewer people than last year around that time.

Making the Cisco VPN Client work (Error 51)

November 22nd, 2006

I just helped Michelle get her Cisco VPN Client to work after she got an “Error 51” asking her to ensure that she had at least one network adapter enabled (which was the case). The client software wouldn’t even start up to let us configure anything. After a couple of calls to tech support, finding out that the error isn’t explained in the manual, and a few reinstallations, we found the following to work: disable the firewall and virus software (McAfee in that case; make sure your machine is still behind another firewall, e.g. your router’s firewall), then go to Control Panel > Administrative Tools > Services and stop and restart the “Cisco Systems, Inc. VPN Service”. The startup setting should be set to automatic, BTW.

I still don’t quite understand why this works (Shouldn’t the client communicate with the service using named pipes? Shouldn’t the firewall be irrelevant for the startup of the client?), but hey…

Please leave a comment if that worked for you; or whatever workaround you found. Thanks.

Ensemble Predictors and Democracy

November 15th, 2006

I just read an interesting article about how society is usually described in science fiction. Turns out that in almost all cases it is a very hierarchical, military-like structure. There are no suggestions as to how a civilian society might work in the future. Consider things like Star Trek, where a bunch of officers command a starship and the rest of the people just function. The captain is smart and benevolent, and there is rarely an abuse of power. No democracy, no voting, little about how the civilian society of the future might function. There are things like Futarchy, but that’s pretty much all I could find in a quick search (and it wasn’t proposed in a SciFi novel so it can’t be any good 🙂 ). One of the problems with democracy might be that people don’t always make the right decision, as they don’t have access to all the information or are easily swayed by bad arguments (e.g. negative ads – some of them are just factually wrong). My point is that there haven’t been that many viable alternatives proposed, not even some crazy, outlandish suggestions (think teleportation as a means of transport) to give people some new ways to think about this.

There is an interesting book out there called The Wisdom of Crowds. It proposes that large crowds of people can be capable of making better decisions than individuals. Long story short, according to the book four key qualities are necessary to make a crowd smart. The crowd needs to be diverse, so that people are bringing different pieces of information to the table. It needs to be decentralized, so that nobody at the top is dictating the crowd’s answer. It needs a way of summarizing people’s opinions into one collective verdict. And the people in the crowd need to be independent, so that they pay attention mostly to their own information and don’t worry about what everyone around them thinks (i.e. they are immune to persuasion concepts like social proof).

Random Forests in machine learning are an ensemble method with very good classification performance. The way it works is that hundreds of decision trees are built, each on a different training set and with a different choice of features. If all the classifiers are reasonably strong (i.e. not able to make perfect predictions, but they tend to do the right thing – they take the information they have and make independent decisions), then the overall vote of all the trees in the ensemble will tend to have a low misclassification error. Breiman showed mathematically that the error of the ensemble (i.e. its bad decisions) is bounded in terms of the strength of the individual trees and the correlation between them.
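To get a feeling for why voting helps, here is a small Python calculation under an idealized assumption that is mine, not Breiman’s: every voter (tree, senator, juror) is right with the same probability p and decides completely independently of the others. Real trees in a forest are correlated, so this is an optimistic picture, but it shows the basic effect.

```python
from math import comb


def majority_vote_accuracy(p: float, n_voters: int) -> float:
    """Probability that a simple majority of independent voters is right.

    Each of the n_voters is assumed to be correct with probability p,
    independently of the others; n_voters must be odd to rule out ties.
    """
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n_voters // 2 + 1
    # Sum the binomial probabilities of all outcomes with a correct majority.
    return sum(comb(n_voters, k) * p ** k * (1 - p) ** (n_voters - k)
               for k in range(needed, n_voters + 1))


if __name__ == "__main__":
    for n in (1, 11, 101, 1001):
        print(n, majority_vote_accuracy(0.6, n))
```

With voters who are only slightly better than a coin flip the majority becomes very reliable as the crowd grows, and with voters slightly worse than a coin flip the opposite happens. Independence is doing all the work here.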

I wonder if something like this might work for political decision making. Leaving problems like corruption and other human fallacies (e.g. looking at what others are doing) aside for a moment, and assuming that for all things there are good arguments to be made for and against a bill, a senator’s vote would depend on how he or she weighs the particular arguments for and against the bill. If we assume that senators tend to vote for what they perceive to be the right thing, would giving each senator a random subset of information make the overall senate vote for the “right thing”? Another idea would be to make a political decision, similar to jury duty, by picking a large number of people from the general population at random and having them decide on a particular issue.
Edit: I found some criticism of the Wisdom-of-Crowds theory, such as Wikipedia not being accurate enough or a democracy electing people like Hitler. A good question in both cases would be whether people made their decisions independently or not. I think that independent decisions are difficult to achieve in practice. Also one has to wonder how robust this system is, given the assumption that everybody makes the best decision they can.