Apr 23, 2008

Automated Anonymous Spam Filtering and Validation


Filtering spam into a queue to check or authorize later is one way to deal with it. However, there is another way to handle entries like that, and it is a little more automated.

Spammers tend to use fake email addresses, which is what this tactic plays on. My thought was to have form output for something like anonymous comments log into a DB (as normal) and then fire off an email to the address the user provided at the time of entry. The stored row in the database would be marked as inactive until the user clicks a link in that email to “approve” their own comment. Clicking that link flicks an “OK” switch (a simple DB query) and marks the comment as active on the site. It also puts the comment in a queue to be reviewed by an admin at a later date. This way you can still prune through the posts and snag the rare comment or two that you don’t want displayed on your site. For the most part it’s an extra line of defense that reduces the need for your constant attention while still providing the functionality you desire. This blocks the automated and fake email address types.
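The store, email, and approve flow described above can be sketched roughly as follows. This is a minimal Python sketch using an SQLite table, not the setup from the original site: the table layout, the column names (`active`, `needs_review`, `token`) and the function names are all assumptions made for illustration, and actually sending the email is left out.

```python
import secrets
import sqlite3


def create_schema(conn):
    # Hypothetical table layout; adapt the columns to your own comments table.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS comments (
               id INTEGER PRIMARY KEY,
               email TEXT NOT NULL,
               body TEXT NOT NULL,
               token TEXT NOT NULL UNIQUE,
               active INTEGER NOT NULL DEFAULT 0,
               needs_review INTEGER NOT NULL DEFAULT 0
           )"""
    )


def submit_comment(conn, email, body):
    """Store the comment as inactive and return the token for the email link."""
    token = secrets.token_urlsafe(32)  # hard-to-guess value to put in the link
    conn.execute(
        "INSERT INTO comments (email, body, token) VALUES (?, ?, ?)",
        (email, body, token),
    )
    conn.commit()
    # Mailing the link (containing this token) to `email` would happen here.
    return token


def approve_comment(conn, token):
    """Flip the "OK" switch and queue the comment for a later admin review."""
    cur = conn.execute(
        "UPDATE comments SET active = 1, needs_review = 1 "
        "WHERE token = ? AND active = 0",
        (token,),
    )
    conn.commit()
    # True only if exactly one inactive comment matched the token.
    return cur.rowcount == 1
```

A comment submitted with a fake address simply stays inactive, because nobody ever receives the link and `approve_comment` is never called with its token.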

Doing this might be a waste of time if you already have a signup process that requires an email validation check. However, many custom applications allow users to post comments on content anonymously, which sometimes leaves them wide open for spammers to attack. Your best bet would be to combine this with a CAPTCHA script: Enhanced Number Equation CAPTCHA. Even so, you’ll still get valid visitors who insist on wasting your time with random nonsense. You may still have to sort through those. Or…

Another good thing to combine with this is a check for whether the user entered any kind of “http” or “a href” text. You can easily add to or adjust a preg_match call to check for this; for more on that, look over this guide: Automating PHP Forms – Spam Filtering and Data Cleansing. Make sure you check your field lengths too, especially on comment fields.
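To make the link and length checks concrete, here is a small sketch in Python rather than PHP; the regular expression plays the role a preg_match pattern would. The pattern, the 2000-character limit and the function names are assumptions for illustration, not values from the guide mentioned above.

```python
import re

# Matches "http" or "a href" anywhere in the text, ignoring case.
LINK_PATTERN = re.compile(r"http|a\s+href", re.IGNORECASE)

# Assumed limit; pick whatever fits your comment field.
MAX_COMMENT_LENGTH = 2000


def contains_link(text):
    """Return True if the text looks like it carries a URL or an HTML link."""
    return LINK_PATTERN.search(text) is not None


def too_long(text, limit=MAX_COMMENT_LENGTH):
    """Return True if the text exceeds the allowed field length."""
    return len(text) > limit
```

A pattern this blunt will also flag someone innocently mentioning "http", which is one reason to send matches to a review queue rather than deleting them outright.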

Telling the user that you are validating their content is helpful:

  1. It lets your visitors know how to enter their information so that they can do so quickly and successfully.
  2. It shows spammers and nonsense-type users that they won’t be able to EASILY post their junk on your site.

Along those lines, displaying the user’s current IP address next to the submit button (as I have done on the content comments over at mmorpgexposed.com) can also serve as a deterrent to non-web-savvy users who are afraid of getting in trouble for their illicit behavior. Checking submissions against an array of “bad words” can also solve a lot of spam problems when dealing with rowdy visitors.
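A “bad words” check can be as simple as the Python sketch below. The word list is a made-up placeholder and `find_bad_words` is a hypothetical helper name; the sketch matches whole words only, which is one possible design choice among several.

```python
import re

# Placeholder list; a real one would be tuned to your own site.
BAD_WORDS = ["viagra", "casino", "lottery"]


def find_bad_words(text, bad_words=BAD_WORDS):
    """Return the listed words that appear as whole words in the text."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return sorted(w for w in bad_words if w in words)
```

Whole-word matching avoids flagging harmless words that merely contain a listed term, at the cost of missing deliberate misspellings.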

Conclusions
The overall goal is to sort incoming posts into three categories:

  1. Acceptable posts that should always be automated (but still go through the user-approved email kickback check).
  2. Questionable posts – the ones in the scenarios I listed above that get flagged.
  3. Posts that are quite obviously spam and can be immediately discarded. (If you do this, tell the user, so that in the rare case of a valid user they don’t get frustrated when their comment doesn’t show up.)
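One way to turn the three categories above into code is a small triage function. The Python sketch below is only an illustration: the rule that two or more warning signs mean “obvious spam” is an assumption, not something this post prescribes, and the names `Verdict` and `triage` are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"    # category 1: automated, still needs email approval
    FLAG = "flag"        # category 2: questionable, goes to the admin
    DISCARD = "discard"  # category 3: obvious spam, dropped right away


def triage(has_link, too_long, bad_word_count, discard_threshold=2):
    """Count the warning signs and map them onto the three categories."""
    signals = int(has_link) + int(too_long) + int(bad_word_count > 0)
    if signals >= discard_threshold:  # assumed cutoff for "obviously spam"
        return Verdict.DISCARD
    if signals >= 1:
        return Verdict.FLAG
    return Verdict.ACCEPT
```

Whenever the result is `Verdict.DISCARD`, the form should tell the submitter so, as noted in the third item of the list.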

If you work this out in advance, you’ll have a much easier time implementing it, and you’ll get the best possible time efficiency and functionality when dealing with spam.


Custom Theme by Rob Malon | Content & Design © 2010 - RobMalon.Com - Chicago, Illinois