ISC on the Future of Anti-Virus Protection

Artificial Intelligence (AI), Classification, Machine Learning, Security

An article on the Internet Storm Center discusses whether anti-virus software in its current state is a dead end. In my opinion it has been dead for quite a while now. Apart from the absolutely unusable state that anti-virus software is in, I think it’s protecting the wrong things. Most attacks (trojans, spyware) nowadays come through web-browser exploits and perhaps instant messengers (see the reports on ISC). So instead of scanning incoming emails, how about a behavior blocker for the web browser and the instant messenger? There are a couple of freeware programs out there (e.g. IEController [German]) that successfully put Internet Explorer and similar programs into a sandbox; whatever the JavaScript exploit, known or unknown, the browser won’t be able to execute arbitrary files or write outside its cache directory. Why is there nothing like that in the commercial AV packages?

However, a few possibilities suggested in the article might be worth exploring. For example, they suggest Bayesian heuristics to identify threats, and machine learning techniques in general seem like a promising direction. IBM AntiVirus (maybe not the current version anymore) has been using neural networks on 4-byte sequences (n-grams) for boot-sector virus detection.
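As a rough illustration of what 4-byte n-gram features look like, here is a minimal Python sketch. It is not IBM's method: the function names, the vocabulary, and the sample bytes are assumptions made up for this example, and a real detector would feed such features into a trained classifier.

```python
from collections import Counter
from typing import List, Sequence


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every overlapping n-byte sequence in `data`."""
    if n <= 0:
        raise ValueError("n must be positive")
    # If the input is shorter than n, the range is empty and so is the result.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def ngram_features(data: bytes, vocabulary: Sequence[bytes], n: int = 4) -> List[int]:
    """Binary feature vector: 1 if the vocabulary n-gram occurs in `data`, else 0."""
    present = byte_ngrams(data, n)
    return [1 if gram in present else 0 for gram in vocabulary]


if __name__ == "__main__":
    # Hypothetical 8-byte snippet, not taken from any real boot sector.
    sample = bytes.fromhex("eb3c904d53444f53")
    # Hypothetical vocabulary of n-grams a classifier might have been trained on.
    vocabulary = [bytes.fromhex("eb3c904d"), bytes.fromhex("deadbeef")]
    print(byte_ngrams(sample))
    print(ngram_features(sample, vocabulary))
```

The binary presence vector keeps the input to the classifier at a fixed size no matter how long the scanned data is; which n-grams go into the vocabulary is the part that actually needs training data.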

A couple of things to keep in mind, though:

The quality of the classifier (detection rate) should be measured with the area under the ROC curve (AUC), not the error rate that most people tend to use in spam-filter comparisons. The base rate of the “non-virus” class is very high: I have over 10,000 executables and libraries on my Windows machine, and all (most?) of them are non-malicious. With a base rate like that, a classifier that flags nothing at all still achieves a very low error rate.
The tricky part is the feature extraction. While sequences of bytes or strings extracted from a binary might be a good start, advanced features like call graphs or imported API calls should be used as well. Extracting these is time-consuming, especially when it has to be done for different types of executables (Windows scripts, x86 EXE files, .NET files etc.). De-obfuscation techniques, just like in the signature-based scanners, will probably be necessary before the features can be extracted.
Behavior blocking and sandboxes are probably easier, a better short-term fix, and more proactive. This has been my experience with email-based attacks as well, back in the Mydoom days when a special MIME type auto-executed an attachment in Outlook. Interestingly, there are only two programs out there that sanitize emails (check MIME types and headers, rename executable attachments etc.) at the gateway level, a much better proactive approach than simply detecting known threats. The first is MIMEDefang, a sendmail plugin. The other is impsec, based on procmail. CU Boulder was using impsec to help keep students’ machines clean (there were scalability issues with the procmail solution, though).
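To illustrate the first point above (AUC versus error rate), here is a minimal Python sketch. The file counts and scores are made up for illustration and do not come from a real scanner; `auc` and `error_rate` are hypothetical helper names, and the 0.5 threshold is an arbitrary assumption.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive (malicious) item
    scores higher than a randomly chosen negative (benign) one; ties count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def error_rate(labels, scores, threshold=0.5):
    """Fraction of items misclassified when flagging scores >= threshold."""
    wrong = sum(1 for y, s in zip(labels, scores) if (s >= threshold) != (y == 1))
    return wrong / len(labels)


if __name__ == "__main__":
    # Made-up test set: 10,000 benign files (label 0), 10 malicious (label 1).
    labels = [0] * 10000 + [1] * 10

    # Classifier A never flags anything: every file gets score 0.
    scores_a = [0.0] * 10010

    # Classifier B raises 100 false alarms but ranks malware near the top.
    scores_b = [0.1] * 9900 + [0.6] * 100 + [0.9] * 9 + [0.6]

    for name, scores in (("A", scores_a), ("B", scores_b)):
        print(name, "error rate:", error_rate(labels, scores), "AUC:", auc(labels, scores))
```

With these made-up numbers, the classifier that never flags anything has the lower error rate (10 mistakes in 10,010 files) but an AUC of only 0.5, while the scorer that raises 100 false alarms ranks almost every malicious file above the benign ones.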

Comments

2 responses to “ISC on the Future of Anti-Virus Protection”

August 5, 2008 5:08 am

RB

As an aside: Would you suggest AUC for evaluating performance on ill-balanced test sets in general? How does this information differ from the common 2×2 matrix of classified vs. true labels?

August 6, 2008 12:11 am

Markus

Yes, in my opinion the AUC best represents the true performance on ill-balanced datasets. It summarizes it all in one number. How do you decide which model is better when comparing 2×2 matrices?

A nice summary of all sorts of measures (AUC, Brier scores etc.) and of how to determine how well a model actually works can be found here: http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.html#Methods%20for%20dichotomous%20forecasts
