Tuesday, June 27, 2006


Over the past few weeks we have created a large number of signatures:

Current stats:
Spam signatures - 2022
Spam image signatures - 3417



At July 10, 2006 2:48 AM, Blogger Rob McEwen said...

Hi. I appreciate what you are trying to do... and I did fill out the sign-up form to participate.

But in my testing, I've discovered that out of your 3,000+ signatures, only about 5 are effective... but don't get me wrong... those 5 MD5 sigs are VERY effective.

I don't mean to "pour cold water" on your idea... but the basic problem is that just about every single stock image spam is now sent by bots which manipulate the images for EACH INDIVIDUAL e-mail... rendering each with its own unique MD5 checksum.

Therefore, running MD5s from these is a complete waste of time and resources because you'll never see that same image in a subsequent spam.
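
To illustrate why per-message image changes defeat exact-hash signatures, here is a minimal Python sketch. The byte strings are made-up stand-ins for image data, not real spam samples, and `md5_hex` is just an illustrative helper name.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 checksum of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()

# Hypothetical stand-in for the bytes of a spam image.
original = b"GIF89a" + bytes(range(256)) * 4

# Simulate a bot that flips a single bit (say, one pixel) for each e-mail.
tweaked = bytearray(original)
tweaked[100] ^= 0x01
altered = bytes(tweaked)

print(md5_hex(original))
print(md5_hex(altered))
print(md5_hex(original) == md5_hex(altered))  # False
```

A one-bit change yields a completely different checksum, so a signature built from the first image will not match the second.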

However, in contrast, there are several series of "pill" spams out there which DO use the SAME exact image over the course of several days or weeks... and, yes, your system **is** VERY effective against these.

The upside to all of this is simply that if one were to run an instance of ClamAV with ONLY a handful of MD5 definitions (and with its regular virus defs purposely omitted), the load time and scan time are practically zero... making this a very effective and efficient filter against these series of pill spams I mentioned.
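
As a rough sketch of what such a small MD5-only database might contain, the Python below builds one entry, assuming ClamAV's hash-based `.hdb` layout of one `MD5:FileSize:Name` entry per line. The helper `hdb_entry`, the sample bytes, and the signature name are all made up for illustration.

```python
import hashlib

def hdb_entry(data: bytes, name: str) -> str:
    """Build one hash-signature line in the assumed MD5:FileSize:Name layout."""
    digest = hashlib.md5(data).hexdigest()
    return f"{digest}:{len(data)}:{name}"

# Hypothetical bytes standing in for a pill-spam image that is reused unchanged.
sample = b"hypothetical pill spam image bytes"

print(hdb_entry(sample, "Example.Spam.Img.Pill-1"))
```

A file holding only a few such lines could then be given to the scanner as its sole database (for example via `clamscan`'s `--database` option), which is what would keep load time low.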

To double-check all of this, I suggest that you find some way of discovering **which** rules "hit" subsequent spams and prune out the ones that don't "hit" a single spam within X number of days. I think you are going to be surprised how few rules survive.
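
One possible way to do that pruning, sketched in Python under the assumption that a last-hit timestamp is recorded per signature. The function name, the signature names, and the dates are hypothetical.

```python
from datetime import datetime, timedelta

def prune_signatures(last_hit, now, max_idle_days):
    """Split signature names into (keep, prune).

    last_hit maps a signature name to the datetime of its most recent hit,
    or None if it has never hit. A signature is kept only if it hit within
    the last max_idle_days days.
    """
    cutoff = now - timedelta(days=max_idle_days)
    keep, prune = [], []
    for name, hit_time in last_hit.items():
        if hit_time is not None and hit_time >= cutoff:
            keep.append(name)
        else:
            prune.append(name)
    return keep, prune

# Hypothetical hit log.
now = datetime(2006, 7, 10)
last_hit = {
    "pill-series-a": datetime(2006, 7, 8),
    "stock-random-1": None,
    "stock-random-2": datetime(2006, 5, 1),
}

keep, prune = prune_signatures(last_hit, now, max_idle_days=30)
print(keep)   # ['pill-series-a']
print(prune)  # ['stock-random-1', 'stock-random-2']
```

This sketch prunes never-hit signatures immediately; a real system would probably also track each signature's creation date so that new entries get a full X days before being dropped.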

Rob McEwen

At September 12, 2006 11:36 AM, Blogger MSRBL said...

We get feedback through people submitting "virus notifications" on the MSRBL site, but currently we don’t have enough people submitting to cut down the numbers reliably without losing signatures that are working for other people.

I do understand that a large number of spam images contain random content and these create "wasted" md5 entries in the database, but I prefer to have these entries on the off chance they are used again (which I have seen).



