By E. E. Lester
Virtually anyone with an email address knows what Spam is, and has, perhaps, considered giving up the speed, convenience, and simplicity of email because of it. Those who have their own websites are more vulnerable than the average person with a single work or home email address from their company or Internet Service Provider. Email addresses visible on a website can quickly become Spam magnets, as automated programs, similar in form to search engine spiders, roam the web, looking for addresses to which new broadsides of Spam may be fired. Website hosting companies generally provide their clients email accounts for use with their domain, but are you with a host that provides those email accounts with Spam and virus protection?
Local Blacklist Filters
Webmasters shouldn’t need to seek out local filters for their site’s contact email addresses. There are a variety of server-level solutions a hosting company can offer to protect its users from unwanted email. The most basic step is to provide rudimentary “blacklist” functionality, which lets users block future email from an address that has already sent them Spam. This type of filter is virtually worthless in today’s Spam environment, though, as it is quite rare to see unsolicited email arriving from the same address multiple times. Spammers have grown far more sophisticated than that. Blacklisting is only really useful for avoiding email from real people you no longer wish to hear from.
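To see why a blacklist offers so little protection, consider a minimal sketch in Python. The function name and the addresses are hypothetical and chosen only for illustration; a real mail server would apply this kind of check at delivery time.

```python
def is_blacklisted(sender: str, blacklist: set[str]) -> bool:
    """Return True if the sender's address is on the user's blacklist."""
    # Email addresses are compared case-insensitively here for simplicity.
    return sender.strip().lower() in blacklist


# Hypothetical addresses, used only for illustration.
blacklist = {"pest@example.com"}

# The exact address that was blocked is caught.
print(is_blacklisted("Pest@Example.com", blacklist))

# The same sender using a fresh address slips straight through.
print(is_blacklisted("pest2@example.net", blacklist))
```

The check is an exact match on the address, so a Spammer who changes the sending address with every message is never caught.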
Keywords and Regular Expressions
More advanced server-level Spam filters are available. A small advance is accomplished using keyword filters. Keyword filters merely check for instances of a certain string of characters and reject the message if that string is found. The core problem with keyword-only filters is that they can “over filter”. Someone who puts “sex” on their keyword filter will find receiving local news and event announcements difficult if they live in a town named “Essex”. Some filters attempt to address this deficiency by using “regular expressions” to build a more sophisticated rule set. Briefly, regular expressions are a pattern syntax used to identify certain strings of text or numbers. These rules can be set up to identify text patterns that are commonly used in Spam. They can become quite complex but, as with almost any filtering method, are not 100% bulletproof. Some filters that use regular expressions come with a basic rule set that the user can extend. Obviously this kind of feature is of little use to someone not familiar with regular expressions.
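The “Essex” problem can be reproduced in a few lines of Python. This is only a sketch: the function names and sample messages are made up for illustration, and a production filter would use a far larger rule set. The first function is a plain keyword filter; the second uses a regular expression with word boundaries so that the keyword matches only as a whole word.

```python
import re


def keyword_match(message: str, keyword: str) -> bool:
    """Plain keyword filter: flags the message if the string appears anywhere."""
    return keyword.lower() in message.lower()


def whole_word_match(message: str, keyword: str) -> bool:
    """Regular expression filter: flags the keyword only as a standalone word."""
    # \b marks a word boundary; re.escape treats the keyword as literal text.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, message, flags=re.IGNORECASE) is not None


# Hypothetical sample messages.
notice = "Essex town council meets on Tuesday"
junk = "Hot sex pics inside"

print(keyword_match(notice, "sex"))      # True: a legitimate notice is over-filtered
print(whole_word_match(notice, "sex"))   # False: "Essex" is no longer caught
print(whole_word_match(junk, "sex"))     # True: the standalone word is still caught
```

Even the improved rule is easy to evade with a deliberate misspelling, which is one reason regular expression rule sets tend to grow complex.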
Bayesian Filters
Currently the most sophisticated filtering methods use Bayesian inference. Bayesian filters take a large data set and determine the probability that a message is Spam based on its similarity to previous Spam messages. In theory, the more emails that are processed and flagged, the more accurate the filter becomes. Services that provide filtering at the ISP or host level, like Postini’s “SpamAway”, filter billions of emails and provide the highest level of success and fewest “false positives”. SpamAway is already highly intelligent about identifying Spam and doesn’t require that any “learning” commands or examples be provided. The online, browser-based interface keeps flagged messages in an easily accessible “quarantine” and allows the user to check for any false positives. Whitelist functionality is provided to aid in the prevention of future false positives. A hosting company offering such an advanced service takes Spam and virus filtering for its customers seriously.
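The general idea behind a Bayesian filter can be sketched with a small naive Bayes classifier in Python. This is a toy illustration of the technique, not a description of how SpamAway or any other commercial service works internally; the class name, the training messages, and the smoothing choices are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Split a message into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSpamFilter:
    """Toy naive Bayes filter (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record a message that a user has flagged as 'spam' or 'ham'."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        """Estimate the probability that a message is Spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        words = tokenize(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior: how common this class is among flagged messages.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            # Add-one smoothing, with one extra slot reserved for unseen words.
            denominator = total_words + len(vocabulary) + 1
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denominator)
            log_scores[label] = score
        # Convert the two log scores into a probability between 0 and 1.
        highest = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - highest)
        ham = math.exp(log_scores["ham"] - highest)
        return spam / (spam + ham)


# Hypothetical training messages.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("Cheap pills, buy now", "spam")
spam_filter.train("Win money now, click here", "spam")
spam_filter.train("Council meeting moved to Tuesday", "ham")
spam_filter.train("Lunch on Tuesday at noon", "ham")

print(spam_filter.spam_probability("buy cheap pills now"))  # above 0.5
print(spam_filter.spam_probability("meeting on Tuesday"))   # below 0.5
```

With only four training messages the estimates are crude. The same arithmetic applied to a very large volume of flagged mail is what lets a filter of this kind improve as more email is processed.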
About the Author
Mr. Lester served for 4 years as webmaster for ApolloHosting.com and previously worked in the IT industry an additional 5 years. Apollo Hosting provides website hosting, ecommerce hosting, vps hosting, and web design services to a wide range of customers.