Spam
What Can We Do About Spam?
Unsubscribe functions don't work (in fact, spammers just use them to verify
your email address), and blocking senders just means they send from a different
address next time. We need something better than these tools: we need our email
to be filtered. The problem with filters is that they make mistakes. Spammers
don't add a special "SPAM" tag to the beginning of their emails, so filters
have to work on the assumption that there are differences between your spam
emails and your non-spam emails.
A Short and Less Than Complete History of Anti-Spam Methods
The first method was blacklists. This was a variation on the block-sender idea:
basically, it sets up a blocked-sender list that everyone can use, so instead of
everyone having to block the spammer's address, only one person has to block it.
This has some major shortcomings. Firstly, it doesn't catch any new spam. If the
spammer changes his address, you will still receive his emails until someone
blocks the new address (through the blacklist). The second shortcoming is that
it fails to take into account the fact that spam is individual. While you might
like to receive the 10000 Ways to Make Money weekly, I might not. So what
happens is that a valid newsletter gets blocked by someone, and everyone else
has to unblock it to get their mail. Blacklists can be useful as a supplement to
other techniques, though, and are by no means dead.
An example of a blacklist is the Open Relay Database :-
http://www.ordb.org/
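
The shared block list described above can be sketched in a few lines of Python.
This is only an illustration of the idea as the article describes it: the
addresses and the SHARED_BLACKLIST name are made up, and a real blacklist
service is normally queried over the network and may list mail servers rather
than sender addresses.

```python
# Hypothetical shared blacklist. The addresses are invented for illustration.
SHARED_BLACKLIST = {"offers@spam.example", "deals@junk.example"}


def is_blacklisted(sender, blacklist=SHARED_BLACKLIST):
    """Return True if the sender's address appears on the shared list."""
    return sender.strip().lower() in blacklist


# An address that somebody has already reported is blocked for everyone.
print(is_blacklisted("Offers@spam.example"))

# The same spammer using a new address is not caught until someone reports it.
print(is_blacklisted("offers2@spam.example"))
```

The second call shows the first shortcoming: the list only knows about
addresses that have already been reported.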
The next idea was to use rules to detect spam. For example, if an email contains
the words "Get Money Quick" then it is likely to be a spam email. Rule-based
filters make a big list of rules for detecting spam. Once again, the problem is
that everyone's email is different. If you are the president of Get Money Quick
International, you don't want your filter to mark all mail containing "Get Money
Quick" as spam. Of course, cases as simple as these are easy to deal with, but
the concept remains.
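
A rule-based filter along these lines might look like the sketch below. The
rules, scores and threshold are assumptions chosen for illustration and are not
taken from any real filter. Each rule that matches adds to a score, and the
email is marked as spam once the score reaches the threshold.

```python
import re

# Hypothetical rules as (pattern, score) pairs. All values are illustrative.
RULES = [
    (re.compile(r"get money quick", re.IGNORECASE), 3),
    (re.compile(r"click here", re.IGNORECASE), 1),
]
THRESHOLD = 3


def rule_score(text, rules=RULES):
    """Add up the scores of every rule whose pattern appears in the text."""
    return sum(score for pattern, score in rules if pattern.search(text))


def is_spam_by_rules(text, rules=RULES, threshold=THRESHOLD):
    """Mark the text as spam when its total rule score reaches the threshold."""
    return rule_score(text, rules) >= threshold


print(is_spam_by_rules("Get Money Quick today!"))

# The president of Get Money Quick International needs a personal rule set
# without the rule that would flag all of the company's own mail.
my_rules = [rule for rule in RULES if rule[0].pattern != "get money quick"]
print(is_spam_by_rules("Get Money Quick International board meeting", rules=my_rules))
```

Dropping one rule fixes this simple case, but every user ends up needing their
own adjustments, which is the weakness the paragraph above points to.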
These methods all still have their merits, but the latest (and currently best)
thing is Bayesian filters. These filters work by learning which types of email
you want to receive and which you don't. They assign two probabilities to each
word: the probability that it will be in a spam email and the probability that
it will be in a good email. The filter then looks at all the words in an email,
combines their probabilities, and if the result reaches a certain threshold of
spamminess, it marks the email as spam. The main problem with this technique is
that it requires you to train your filter. When it marks an email wrongly, you
have to tell the program it got it wrong. It then readjusts itself using the new
email and becomes more likely to catch emails like it. The best example of this
currently is POPFile :- http://popfile.sourceforge.net/ Another up-and-coming
Bayesian filter is K9 :- http://keir.net/k9.html K9 is still being developed,
but it has great prospects as an excellent Windows-based Bayesian filter.
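
To make the word-probability idea concrete, here is a minimal sketch of a naive
Bayes text classifier. It is not how POPFile or K9 are implemented. The class
name TinyBayesFilter, the add-one smoothing, the tokenizer and the 0.9 threshold
are all assumptions made for this example. It combines the per-word
probabilities by summing their logarithms, which is a common way to avoid
multiplying many very small numbers.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Illustrative naive Bayes spam filter with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "good": Counter()}
        self.message_counts = {"spam": 0, "good": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase the text and split it into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Teach the filter that this text is 'spam' or 'good'."""
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that the text is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["good"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "good"):
            # Start from how common this kind of email has been so far.
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from giving zero.
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = log_p
        # Turn the two log scores back into a single probability.
        diff = log_score["good"] - log_score["spam"]
        if diff > 700:  # math.exp would overflow, so the text is clearly good
            return 0.0
        return 1.0 / (1.0 + math.exp(diff))

    def is_spam(self, text, threshold=0.9):
        """Mark the text as spam once it reaches the spamminess threshold."""
        return self.spam_probability(text) >= threshold


spam_filter = TinyBayesFilter()
spam_filter.train("get money quick with this offer", "spam")
spam_filter.train("win money now", "spam")
spam_filter.train("lunch meeting moved to noon", "good")
spam_filter.train("notes from the project meeting", "good")

print(spam_filter.spam_probability("quick money offer"))
print(spam_filter.spam_probability("project meeting notes"))

# Correcting a mistake: tell the filter that this email was actually good.
before = spam_filter.spam_probability("money newsletter")
spam_filter.train("money newsletter", "good")
after = spam_filter.spam_probability("money newsletter")
print(before, after)
```

The last few lines show the training step: after the correction, the same email
gets a lower spam probability than it did before.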
Check out my other spam article "Spam What Can We
Do?"