Mar 5, 2008

5 Ways To Catch And Prevent Website Form Spam – Part 1

I used to get so much spam in my mailbox every day that I was spending more time deleting it than I was adding new content, maintaining sites, and starting new sites combined (typical story, right?). Some website spam is not preventable, or it is preventable only at the cost of usability or of good information getting caught with the “bad stuff” as well.

No doubt, if you’re serious about web development you’re going to be creating and interacting with forms on a regular basis, especially with this new web 2.0 attitude everyone has…gee whiz, what’s that?

CAPTCHA
CAPTCHA is the number one method most people are using to prevent spam today. What is it?

“A CAPTCHA is a challenge response test used on computers to check if the user is human. A common kind of CAPTCHA that is used on websites requires that the visitor type the letters and numbers of a distorted image. This method is based on the fact that is difficult for computers to extract the text from the image while it is very easy for humans.” – captchacreator.com
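The challenge-response idea can be sketched without an image library. The example below is a plain-text arithmetic challenge, not a distorted-image CAPTCHA, so it would only stop the simplest bots. The function names makeChallenge() and checkChallenge() are made up for illustration, and in a real form the expected answer would be kept in $_SESSION between requests rather than in a local variable.

```php
<?php
// Hypothetical helpers illustrating a challenge-response test.

/** Build a question for the visitor and the answer the server should keep. */
function makeChallenge(): array
{
    $a = random_int(1, 9);
    $b = random_int(1, 9);
    return [
        'question' => "What is $a + $b?",
        'answer'   => (string) ($a + $b),
    ];
}

/** Compare the submitted answer with the stored one. */
function checkChallenge(string $expected, string $submitted): bool
{
    return hash_equals($expected, trim($submitted));
}

$challenge = makeChallenge();
// Show $challenge['question'] in the form; keep $challenge['answer'] server-side.
var_dump(checkChallenge($challenge['answer'], $challenge['answer'])); // bool(true)
var_dump(checkChallenge($challenge['answer'], 'not a number'));       // bool(false)
```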

Here are a few resources for scripts that will let you implement that functionality in your own forms:

Escape Data – Prevent SQL Injections
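CAPTCHA is useless if you’re leaving your form unprotected from attack. Never put raw POST strings into a query. addslashes() is not a reliable defense against SQL injection, and mysql_real_escape_string() belongs to the old mysql extension that was removed in PHP 7; on current PHP, use prepared statements (PDO or mysqli) or, if you must build the query string by hand, mysqli_real_escape_string().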
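On current PHP, a prepared statement is a common way to keep user input out of the SQL text altogether. The sketch below assumes the pdo_sqlite extension is available and uses an in-memory database so it can run on its own; the comments table and its columns are invented for the example.

```php
<?php
// Assumption: pdo_sqlite is installed; table and column names are invented.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, name TEXT, body TEXT)');

// Input a spammer might post in the hope of breaking out of the query.
$name = "Robert'); DROP TABLE comments; --";
$body = 'Nice post!';

// The SQL text contains only placeholders; the values travel separately,
// so the quote in $name is never interpreted as SQL.
$stmt = $pdo->prepare('INSERT INTO comments (name, body) VALUES (:name, :body)');
$stmt->execute([':name' => $name, ':body' => $body]);

$stored = $pdo->query('SELECT name FROM comments')->fetchColumn();
var_dump($stored === $name); // bool(true): stored literally, table intact
```

With a MySQL database only the connection string and credentials passed to the PDO constructor would change; the prepare() and execute() calls stay the same.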

Also take a look at these functions for other similar kind of checks that might be useful:

  • htmlspecialchars() – Escapes the characters &, ", <, and > (ampersand, double quote, less than, and greater than). Single quotes are escaped too when the ENT_QUOTES flag is set, which is the default as of PHP 8.1.
  • strip_tags() – Strips HTML and PHP tags from the given string.
  • htmlentities() – Converts every character that has an HTML entity equivalent, not just the few above (a more catch-all version of htmlspecialchars()).
  • urlencode() – Encodes a string so it can be passed safely in a URL, such as a GET query string. As I mention below, don’t use GET with forms, but this is useful when you’re passing user input in a URL for other reasons.

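Remember, once you convert data using one of the encoding functions above, you can undo it for readability and output by using its reverse function. urlencode(), for example, has urldecode() to undo its actions so you can use the string as you would have before the encode; htmlspecialchars() and htmlentities() have htmlspecialchars_decode() and html_entity_decode(). The exception is strip_tags(), which throws the tags away and cannot be reversed.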
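Here is a short round trip through those functions. The sample strings are made up; the flags and charset passed to htmlspecialchars() are my own choice so the result is the same across PHP versions.

```php
<?php
$input = '<a href="x">Tom & Jerry\'s</a>';

// Escape for HTML output. ENT_QUOTES makes sure single quotes are covered too.
$escaped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
echo $escaped, "\n";
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#039;s&lt;/a&gt;

// The reverse function restores the original string.
var_dump(htmlspecialchars_decode($escaped, ENT_QUOTES) === $input); // bool(true)

// strip_tags() removes markup outright, so there is nothing to reverse.
$plain = strip_tags($input);
echo $plain, "\n"; // Tom & Jerry's

// urlencode() and urldecode() are a matching pair.
$query = urlencode('Rob Malon & co');
echo $query, "\n"; // Rob+Malon+%26+co
var_dump(urldecode($query) === 'Rob Malon & co'); // bool(true)
```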

Check Request Method
Wrap your existing form processor in an if statement that checks whether the request came in via POST. If not, the user is not accessing your form the way you designed it to be used.

  <?php
  if ($_SERVER['REQUEST_METHOD'] === 'POST') {
      // typical form processes
  } else {
      echo "The form can not be used like that";
  }
  ?>

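Change this to GET if you’re using that instead. However, I would highly recommend you never use GET for form submissions: the data ends up in the URL, where it is saved in browser history and server logs and is trivial to tamper with.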
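If you would rather keep the check in one place, it can be wrapped in a small function, which also makes it easy to test. The name isPostRequest() is hypothetical, and the 405 status code in the usage comment is my own addition rather than part of the snippet above.

```php
<?php
/** Hypothetical helper: true only when the request was sent with POST. */
function isPostRequest(array $server): bool
{
    $method = $server['REQUEST_METHOD'] ?? '';
    return is_string($method) && strtoupper($method) === 'POST';
}

// In a form processor you might use it like this:
//   if (!isPostRequest($_SERVER)) {
//       http_response_code(405); // 405 = Method Not Allowed
//       exit('The form can not be used like that');
//   }

var_dump(isPostRequest(['REQUEST_METHOD' => 'POST'])); // bool(true)
var_dump(isPostRequest(['REQUEST_METHOD' => 'GET']));  // bool(false)
var_dump(isPostRequest([]));                           // bool(false)
```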

Check Request Source
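You should also check that the request is originating from your own server. Posting to your script from somewhere else is a very common method of form abuse, and it doesn’t necessarily mean you’ll be receiving spam yourself. If your server is getting a lot of bounced emails at its default address, you may have someone abusing your site in this manner. Keep in mind that both headers checked below are supplied by the client and can be forged, so this filters out lazy bots rather than determined attackers.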

  <?php
  // Reduce the host to just the domain name: strip an optional "www." and
  // the top-level domain, so "www.robmalon.com" becomes "robmalon".
  // (ereg_replace() was removed in PHP 7; preg_replace() is its replacement.)
  $source = preg_replace('/^(www\.)?([^.]+)\.[^.]+$/i', '$2', $_SERVER['HTTP_HOST']);
  if ($source !== "robmalon") {
      echo "you are illegally accessing this script";
  } else {
      // typical form processes
  }

  // The referrer should also be from your own domain. Likewise, if there is
  // no referrer then the user obviously isn't using your form correctly;
  // the empty string fails the same test, so no separate check is needed.
  // Note: stristr() searches for the first occurrence of a string inside another string.
  $referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
  if (stristr($referer, $source)) {
      // typical form processes
  } else {
      echo "you are illegally accessing this script";
  }
  ?>
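Because stristr() matches the domain anywhere in the referrer, a URL that merely mentions your domain in its query string would pass. A stricter variant, sketched below, compares only the host part of the referrer against a list of allowed hosts. The function name refererIsLocal() and the host list are assumptions for illustration, and the header can still be forged by the client.

```php
<?php
/**
 * Hypothetical helper: does the Referer header point at one of our hosts?
 * $allowedHosts is an assumed list such as ['robmalon.com', 'www.robmalon.com'].
 */
function refererIsLocal(string $referer, array $allowedHosts): bool
{
    $host = parse_url($referer, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // missing or malformed Referer
    }
    return in_array(strtolower($host), $allowedHosts, true);
}

$allowed = ['robmalon.com', 'www.robmalon.com'];
var_dump(refererIsLocal('http://www.robmalon.com/contact.php', $allowed)); // bool(true)
var_dump(refererIsLocal('http://evil.example/?robmalon.com', $allowed));   // bool(false)
var_dump(refererIsLocal('', $allowed));                                    // bool(false)
```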

Using Regular Expressions For Data Validation
I like to check data using preg_match (or any of the regular expression functions). This method kills a lot of birds with one stone. Why write one function to check if a string is empty, another to check if it allows numbers, another for alpha characters, another for field length, and another…you get the point.

if (!preg_match("/^[A-Za-z0-9]{5,15}$/", $name)) $error .= "<li class=\"errors\">The Name field can only contain letters and numbers (no spaces) and must be 5 to 15 characters long.</li>";

In the preg_match I am checking that the name field contains only letters (upper or lower case) and digits and is between 5 and 15 characters long. If the input doesn’t meet that specification, the appropriate text is appended to $error.

Using “$variable .= …” in this fashion lets me keep a running variable that I append to (initialize it to an empty string first). I can then check whether $error contains any data before my script does any significant queries. If there are errors, you can spit them out by echoing $error and ask the user to correct them. I then use CSS to style my errors, which you can see with class="errors".
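Put together, the pattern might look like the sketch below. The function name validateForm() and the second field (age) are invented for illustration; only the name rule comes from the example above.

```php
<?php
/**
 * Hypothetical helper: validate two fields and return the accumulated
 * error markup (an empty string means everything passed).
 */
function validateForm(array $post): string
{
    $error = ''; // start empty so .= has something to append to
    $name  = is_string($post['name'] ?? null) ? $post['name'] : '';
    $age   = is_string($post['age'] ?? null) ? $post['age'] : '';

    if (!preg_match('/^[A-Za-z0-9]{5,15}$/', $name)) {
        $error .= '<li class="errors">The Name field must be 5 to 15 letters or numbers.</li>';
    }
    if (!preg_match('/^[0-9]{1,3}$/', $age)) {
        $error .= '<li class="errors">The Age field must be a number with 1 to 3 digits.</li>';
    }
    return $error;
}

// Valid input produces no errors.
var_dump(validateForm(['name' => 'RobMalon', 'age' => '30']) === ''); // bool(true)

// Invalid input accumulates one <li> per failed field.
$errors = validateForm(['name' => 'Rob', 'age' => 'thirty']);
if ($errors !== '') {
    echo "<ul>$errors</ul>\n";
}
```

Only run your queries when the returned string is empty; otherwise echo it back above the form.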

Regular expressions definitely have a learning curve, but they are one of the best tools for simplifying your life in a variety of situations. I’ll go into more detail about them down the road, but for now you may want to buy a book or do some Googling.

The Non-Technical Recap
-Implement a CAPTCHA script.
-Escape quotes (and other dangerous characters).
-Check to see if data is coming to your form using the correct method.
-Check to see if the request is originating from your own server.
-Check data using a regular expression.

This is just the tip of the iceberg for what I can tell you about spam. That’s why I’ve decided to make a mini-series of it. Over the next couple of weeks I will be bringing you more detailed ways I deal with spam. Comment below or email me about your own spam prevention methods. If it is a new or unique technique, I will post it in a future blog along with a link back to your blog. Just write in the comments below and include your URL in the website field.

One Response to 5 Ways To Catch And Prevent Website Form Spam – Part 1

  1. It just occurred to me…setting up those if statements with a die() statement might be easier as a quick and dirty method, and I’m sure most of you, at least for now, would want a quick fix to just toss in. Keep that in mind. Something like this:

    if ($_SERVER['REQUEST_METHOD'] !== 'POST') { die("The form can not be used like that"); }

    You can put that at the beginning of a script for it to take effect without worrying about existing code too much.

Custom Theme by Rob Malon | Content & Design © 2010 - RobMalon.Com - Chicago, Illinois