JAMBE Slogan :- Get Wisdom Get Understanding Proverbs 4:5

What can we do About Spam?

Unsubscribe functions don't work (in fact, spammers just use them to verify your email address), and blocking senders just means they send from a different address next time. We need something better than these tools: we need our email to be filtered. The problem with filters is that they make mistakes. Spammers don't add a special "SPAM" tag to the beginning of their emails, so filters have to work on the assumption that there are differences between your spam emails and your non-spam emails.

A Short and less than complete history of Anti-Spam methods

The first method was blacklists. This was a variation on the block-sender idea: basically, it sets up a blocked-sender list that everyone can use, so instead of everyone having to block the spammer's address, only one person has to block it. This has some major shortcomings. Firstly, it doesn't catch any new spam. If the spammer changes his address, then until someone blocks the new address (through the blacklist) you will still receive emails from the spammer. The second shortcoming is that it fails to take into account the fact that spam is individual. While you might like to receive the "10000 Ways to Make Money" weekly, I might not. So what happens is that a valid newsletter gets blocked by someone, and everyone else has to unblock it to get their mail. Blacklists can be useful as a supplement to other techniques, though, and are by no means dead.

An example of a blacklist is the Open Relay Database :- http://www.ordb.org/
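
The shared blocked-sender idea can be sketched in a few lines of Python. This is only an illustration: the addresses, the domain, and the function name are made up, and a real blacklist service would be queried over the network rather than held in a local set.

```python
# Hypothetical shared blacklist: these entries are invented for illustration.
SHARED_BLACKLIST = {"offers@bulk-mailer.example", "spam-domain.example"}


def is_blacklisted(sender, blacklist=SHARED_BLACKLIST):
    """Return True if the sender's address or its domain is on the blacklist."""
    sender = sender.strip().lower()
    # Everything after the last "@" is treated as the domain.
    domain = sender.rpartition("@")[2]
    return sender in blacklist or domain in blacklist


if __name__ == "__main__":
    for address in ("offers@bulk-mailer.example",
                    "offers2@bulk-mailer.example",
                    "anyone@spam-domain.example"):
        print(address, is_blacklisted(address))
```

Note that the second address is not caught: the spammer only had to change one character, which is the first shortcoming described above.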

The next idea was to use rules to detect spam. For example, if an email contains the words "Get Money Quick", then it is likely to be a spam email. Rule-based filters make a big list of rules for detecting spam. Once again, the problem is that everyone's email is different. If you are the president of Get Money Quick International, you don't want your filter to mark all mail containing "Get Money Quick" as spam. Of course, cases as simple as this are easy to deal with, but the concept remains.
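
A rule-based filter might look roughly like the sketch below. The rule names, patterns, scores, and threshold are all assumptions chosen for the example, not taken from any real product. The `disabled` parameter is one hypothetical way to let an individual switch off a rule that doesn't suit their mail.

```python
import re

# Hypothetical rules: name -> (pattern, score). Real filters use far longer lists.
RULES = {
    "get_money_quick": (re.compile(r"get money quick", re.IGNORECASE), 3),
    "click_here": (re.compile(r"click here", re.IGNORECASE), 1),
    "many_exclamations": (re.compile(r"!{3,}"), 1),
}
THRESHOLD = 3  # assumed cut-off: a total score at or above this counts as spam


def rule_score(text, disabled=()):
    """Add up the scores of every enabled rule whose pattern appears in the text."""
    return sum(score
               for name, (pattern, score) in RULES.items()
               if name not in disabled and pattern.search(text))


def looks_like_spam(text, disabled=()):
    """Return True if the combined rule score reaches the threshold."""
    return rule_score(text, disabled) >= THRESHOLD


if __name__ == "__main__":
    message = "Get Money Quick!!! Click here"
    print(looks_like_spam(message))
    # The president of Get Money Quick International turns that rule off.
    print(looks_like_spam(message, disabled={"get_money_quick"}))
```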

These all still have merits, but the latest (and currently best) thing is Bayesian filters. These filters work by learning which types of email you want to receive and which you don't. They assign two probabilities to each word: the probability that it will be in a spam email and the probability that it will be in a good email. The filter then looks at all the words in an email, combines their probabilities, and if the result reaches a certain threshold of spamminess, marks the email as spam. The main problem with this technique is that it requires you to train your filter. When it marks an email wrongly, you have to tell the program that it got it wrong. It then readjusts itself to the new email and becomes better at catching emails like it. The best example of this currently is POPFile :- http://popfile.sourceforge.net/ Another up-and-coming Bayesian filter is K9 :- http://keir.net/k9.html K9 is still being developed, but it has great prospects as an excellent Windows-based Bayesian filter.
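
To make the learning step concrete, here is a minimal naive Bayes sketch in Python. It is not how POPFile or K9 are actually implemented. The class name, the tokenizer, the add-one smoothing, and the 0.9 threshold are all assumptions made for the example. Word probabilities are combined by summing their logarithms, which is a common way to avoid multiplying many tiny numbers together.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy word-based Bayesian filter (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "good": Counter()}
        self.message_counts = {"spam": 0, "good": 0}

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lower-case runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record the words of one message under the label 'spam' or 'good'."""
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that the text is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["good"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "good"):
            # Smoothed share of messages seen with this label.
            log_score = math.log(
                (self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing so unseen words never give probability 0.
                log_score += math.log(
                    (count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = log_score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = log_scores["good"] - log_scores["spam"]
        diff = max(-700.0, min(700.0, diff))
        return 1.0 / (1.0 + math.exp(diff))

    def is_spam(self, text, threshold=0.9):
        return self.spam_probability(text) >= threshold


if __name__ == "__main__":
    f = TinyBayesFilter()
    f.train("get money quick win money now", "spam")
    f.train("cheap pills get rich quick", "spam")
    f.train("meeting notes for monday lunch", "good")
    f.train("project report attached see notes", "good")
    print(f.is_spam("get money now"))
    print(f.spam_probability("quick lunch"))
    f.train("quick lunch", "good")  # correct the filter, as described above
    print(f.spam_probability("quick lunch"))
```

Calling `train` again on a wrongly marked message is the "tell the program it got it wrong" step: the word counts shift, so similar messages score differently next time.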

Check out my other spam article "Spam What Can We Do?"

The contents of this page, and linked documents are Copyright to James Brunskill and may not be used without permission