Captcha and the Alternatives
By Alan Wagstaff
- Introduction -
Most of us have probably encountered a Captcha system on the web in the past few years, sometimes without even knowing it.
Captcha systems are designed to stop automated "spam bots" from posting to websites that accept submissions, such as forums and blogs, by asking the user to solve a problem. These problems range from typing in the letters and numbers displayed in an image to answering a simple math question or selecting the correct image.
- Traditional Captcha -
Traditional captcha systems generate an image containing a random selection of letters and numbers, which the user is then asked to type into a box.
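As a rough illustration of the generate-and-check step, here is a minimal Python sketch. It only produces the random text and verifies the typed answer; drawing the text into an image is left out. The names `ALPHABET`, `new_challenge` and `check_answer` are made up for this example, and leaving out look-alike characters is an assumption rather than something every captcha system does.

```python
import hmac
import secrets
import string

# Assumption: look-alike characters (O/0, I/1) are left out to spare the user.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "O0I1"
)


def new_challenge(length=6):
    """Return a random string of letters and digits to draw into the image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_answer(expected, typed):
    """Compare the typed text with the expected text, ignoring case and outer spaces."""
    return hmac.compare_digest(
        expected.upper().encode(), typed.strip().upper().encode()
    )


challenge = new_challenge()
print(challenge)                                    # text to render as an image
print(check_answer(challenge, challenge.lower()))   # True
```

In a real site the expected text would be kept on the server (for example in the user's session) and never sent to the browser in readable form.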
These images range from simple ones to more advanced images with multicolor backgrounds and text.
These captcha systems developed over time, leading to projects such as reCaptcha, which takes words from old books that are being digitised. When the OCR system is unable to read a word, the word is placed into a database which reCaptcha then uses to generate its captcha images. Each image contains one known word and one unknown word, so if the user gets the known word right, reCaptcha assumes that they probably got the unknown word right as well and updates its database.
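The known-word/unknown-word idea can be sketched in a few lines of Python. This is only an illustration of the logic described above, not reCaptcha's actual code or API: the in-memory `votes` store, the function names and the `min_votes` threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical store: for each unknown word, count the readings users typed.
votes = defaultdict(Counter)


def normalise(word):
    return word.strip().lower()


def submit(known_answer, typed_known, unknown_id, typed_unknown):
    """Pass the user if the known word matches; only then record the unknown word."""
    if normalise(typed_known) != normalise(known_answer):
        return False
    votes[unknown_id][normalise(typed_unknown)] += 1
    return True


def best_guess(unknown_id, min_votes=2):
    """Return the most common reading once enough users agree (threshold assumed)."""
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= min_votes else None


submit("morning", "Morning", "scan-17", "overlooks")   # passes, vote recorded
submit("morning", "mourning", "scan-17", "xyz")        # fails, nothing recorded
submit("morning", "morning", "scan-17", "Overlooks")   # passes, second vote
print(best_guess("scan-17"))                           # overlooks
```

Requiring more than one matching answer before trusting a reading is a design choice of this sketch; it guards against a single user typing the known word correctly and the unknown word carelessly.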
Whilst these captcha systems worked well, the downsides soon became obvious. It became fairly simple for spam bots to guess the correct letters and numbers in the simple images using an OCR-type system. As the images became more advanced and began to make use of colors and distortions, visually impaired users were increasingly affected.
Although not widely used, some sites have also made use of animated images as captchas.
- Audio Captcha -
Audio captchas began to appear to solve the problem that visually impaired users were unable to read the captcha images. Audio captchas are rarely seen on their own and are usually combined with a traditional image captcha system.
Audio captchas will often distort the voices, use a number of different voices for each letter, and add a lot of background noise in an attempt to fool the new generation of spam bots that have voice recognition features.
- Image Selection Captcha -
A recent and increasingly popular captcha system asks users to select a few images from a larger selection. For example, the user may be presented with nine images of various animals and asked to select the three rabbits.
A good example of this is the KittenAuth system.
These systems also present problems to visually impaired users, just as the traditional captcha systems did. Many visually impaired users rely on the alt text of images to describe them. Systems like KittenAuth are not able to use alt text, because a description a screen reader can read is one a spam bot can read too.
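The "pick the three rabbits out of nine images" check described above might look something like the following Python sketch. The file names and the functions `build_challenge` and `check_selection` are hypothetical; this is not how KittenAuth itself is implemented.

```python
import random


def build_challenge(target_images, other_images, grid_size=9, targets=3, rng=random):
    """Pick some target images, fill the rest of the grid with others, then shuffle."""
    grid = rng.sample(target_images, targets) + rng.sample(
        other_images, grid_size - targets
    )
    rng.shuffle(grid)
    target_set = set(target_images)
    # Positions of the target images; this stays on the server.
    correct = {i for i, name in enumerate(grid) if name in target_set}
    return grid, correct


def check_selection(correct, selected):
    """The user passes only if they picked exactly the target positions."""
    return set(selected) == set(correct)


# Hypothetical image pools.
rabbits = ["rabbit1.jpg", "rabbit2.jpg", "rabbit3.jpg", "rabbit4.jpg"]
others = ["cat1.jpg", "dog1.jpg", "fox1.jpg", "owl1.jpg",
          "cow1.jpg", "pig1.jpg", "hen1.jpg", "bat1.jpg"]

grid, correct = build_challenge(rabbits, others)
print(grid)
print(check_selection(correct, correct))   # True
```

For the same reason that alt text cannot be used, descriptive file names like the ones in this sketch would have to be replaced with opaque ones before being sent to the browser.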
- Human Input and Questions Captcha -
Another increasingly popular captcha system involves asking the user to answer a simple math question, or to fill in the missing word in a sentence.
[I fill my bath with hot ______]
[What is the answer to five plus two?]
The main problem with this system is the limited number of questions. Whilst it is perfectly possible to add 10,000 questions to your captcha system, it is more likely that most website administrators will add far fewer. If the owner of a spam bot were determined enough, they could keep refreshing your webpage until they had collected all of your potential questions.
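A small Python sketch makes the size of the question pool concrete. Instead of storing questions by hand, it generates math questions in the style of the example above; `NUMBER_WORDS`, `make_question` and `check` are names invented for this illustration, and limiting the operands to zero through nine is an assumption.

```python
import random

NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


def make_question(rng=random):
    """Build a question such as 'What is the answer to five plus two?'."""
    a = rng.randrange(10)
    b = rng.randrange(10)
    question = f"What is the answer to {NUMBER_WORDS[a]} plus {NUMBER_WORDS[b]}?"
    return question, a + b


def check(expected, typed):
    """Accept the answer only if it is the right number written in digits."""
    try:
        return int(typed.strip()) == expected
    except ValueError:
        return False


# Every possible pair of operands: the whole pool a bot would need to collect.
POOL_SIZE = len(NUMBER_WORDS) ** 2

question, answer = make_question()
print(question)
print(check(answer, str(answer)))   # True
print(POOL_SIZE)                    # 100
```

Even generated this way, the pool holds only 100 distinct questions, so the weakness described above still applies: a bot that refreshes the page often enough will eventually have seen them all.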
- Conclusion -
As you have seen, there is no perfect captcha system at the moment. All of them have their strengths and weaknesses. Most website administrators (myself included) tend to opt for a traditional captcha system with an audio alternative. reCaptcha provides an excellent system which also has the added benefit of helping to digitise old books.
- References and Further Reading -
- PHP Resources -