Exterminating Form Spam
In 2005, we launched a web application for our campus that allows our users, especially those with no technical knowledge, to produce web forms.
Why did we do this? Mostly, we did it because everyone always wanted a form and my group had to build them all. We had been using the ancient FormMail.pl, but each recipient had to be approved and each form hand-coded with required fields. I wanted users to be able to create forms, have the results emailed to them as well as saved in a database, and manage those forms, all without having to get the web team involved.
I know, web forms aren’t sexy. Not in the least, but they’re a critical part of how people communicate with us on our sites. Since its launch, FormBuilder (original name, I know) has really made an impact across campus. Forms are now all uniform in style and layout. Inconsistency used to be a huge problem, as everyone, myself included, was building forms differently. Offices on campus can create a form in just a few minutes, email out the address or post it on the web, and start getting responses in minutes. These offices have seen a dramatic improvement in student responses and program attendance.
So FormBuilder’s been chugging along with no problems, until recently, when it started getting hammered with spam. Not all forms are getting hit, just a lucky few. They are receiving, seriously, hundreds of submissions a day. Luckily, it’s mostly gibberish and not pr0n spam, but still, it’s annoying for my users and it’s using my resources up. Not cool.
I wrestled for a long time with how to stop the spam. I thought about adding some kind of question that would be appended to each form, such as “What is 2+2,” or something to that effect. I thought about using code like Bad Behavior, but I don’t know if that would be easily defeated.
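For the curious, the challenge-question idea could look something like the sketch below. It is a minimal Python illustration, not what FormBuilder does; the function names `make_challenge` and `check_answer` are made up for this example, and it assumes the expected answer is kept on the server (in the session, say) between showing the form and receiving the submission.

```python
import random


def make_challenge(rng=random):
    """Return (question_text, expected_answer) for a small addition problem."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a}+{b}?", str(a + b)


def check_answer(expected, submitted):
    """Compare the stored answer with what the visitor typed."""
    # Treat a missing field as an empty answer and ignore stray whitespace.
    return expected == (submitted or "").strip()
```

A form handler would call `make_challenge()` when it renders the form, remember the answer, and reject any submission where `check_answer()` comes back false. A script written for one specific site could still solve questions like this, which is part of why I kept looking.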
In the end, I decided to implement the dreaded CAPTCHA.
I looked at code to generate my own and do all the processing on my server. I struggled with getting them to be readable and getting them to fit in with the look and feel of our forms. After running into so many problems, I decided to use the reCAPTCHA service.
reCAPTCHA was developed by Carnegie Mellon University, and, in addition to reducing spam, the project helps digitize books from the Internet Archive. In my eyes, that’s a win-win. reCAPTCHA lets users reload the images if they are tough to read, and it also lets them hear a series of numbers to enter instead of words. Listen to the numbers sometime; it’s a little creepy.
reCAPTCHA is being used on a great many large websites, including Twitter, StumbleUpon and Ticketmaster, to name but a few. I’m sure you’ve seen the red reCAPTCHA boxes as you’ve surfed the web.
Implementing reCAPTCHA was painless. They offer libraries in a variety of languages and detailed instructions. I used the PHP code and it’s worked perfectly. What really drew me to the service is that you can customize the look and feel of the CAPTCHA to match your color scheme.
Here’s a standard reCAPTCHA box:
[Image: standard reCAPTCHA box]
Here’s an example from one of our FormBuilder-powered forms:
[Image: reCAPTCHA box on a FormBuilder-powered form]
Earlier this week, we rolled this out on all FormBuilder-powered forms. It was smooth and, other than a call to our computing help desk by a user who feared we’d been hacked, we haven’t heard of any issues from people filling out forms or from our campus users.
Thus far, the spamming has stopped and only legitimate form entries are getting through. Of course, it will only be a matter of time until hackers beat reCAPTCHA, and the whole cat-and-mouse game will start again.
Comments
4 Responses to “Exterminating Form Spam”
Thanks for the article Mike - glad you’ve had some success! Is that FormBuilder as in VeerWest formbuilder?
I’ve seen a lot of discussion recently about the effectiveness & accessibility of CAPTCHA tests. One article definitely worth reading is at http://www.sitepoint.com/article/captcha-problems-alternatives, where they argue that determined hackers will always find a way around this kind of authentication test, & that heuristics testing (e.g. Akismet) may be the way forward.
Any thoughts on this?
In my opinion, CAPTCHAs are a “rock and a hard place” kind of idea. They can be effective against spammers, but are increasingly frustrating for real users. But as it happens, real users are the ones you care the most about.
Isn’t it more important to take care of these users, even at the risk of a bit of spam?
Meanwhile, there are lots of little tricks that knock out the majority of spam submissions, such as hidden (or display:none) fields: scripts don’t know any better and fill them in, so you can discard any submission that has that field filled. Another option is relying on JavaScript for submission, which most scripts will not support or parse.
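The hidden-field trick described above can be sketched roughly like this in Python. The field name `website` and both function names are assumptions made up for this example; it also assumes the form data arrives as a plain dictionary of strings and that the field is hidden from real visitors with CSS.

```python
HONEYPOT_FIELD = "website"  # hypothetical field name, hidden from people with CSS


def is_probably_spam(form_data):
    """Return True when the hidden honeypot field was filled in."""
    # Real visitors never see the field, so it should be missing or blank.
    value = form_data.get(HONEYPOT_FIELD, "")
    return bool(value.strip())


def clean_submission(form_data):
    """Drop the honeypot field before the entry is emailed or saved."""
    return {k: v for k, v in form_data.items() if k != HONEYPOT_FIELD}
```

A handler would silently discard anything where `is_probably_spam()` is true and pass the rest through `clean_submission()`. This only stops scripts that fill in every field blindly; a bot that honors `display:none` would get past it.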
[...] left this comment in response to HighEdWebTech’s post, Exterminating Form Spam and decided it was worth sharing here. In my opinion, CAPTCHAs are a “rock and a hard [...]
[...] in June, I blogged about using ReCaptcha on the majority of the forms at my school. We’ve got a centralized [...]