The Problem with Google’s reCAPTCHA


Most people know that the main use of Google’s reCAPTCHA software is to stop bots from posting spam messages or brute-forcing passwords. What many people don’t know is that every time they solve one of these challenges, they are contributing to a massive, clandestine crowdsourcing campaign. The problem with reCAPTCHA lies in this ulterior motive, which, by its very nature, has the potential to display inappropriate and possibly even offensive content. The truth is that reCAPTCHA can display pretty much anything, and that is what concerns us the most.

So how is it possible that something owned by a company as reputable as Google can display offensive material? Let’s look at how it works. When Google purchased reCAPTCHA in 2009, it began using the service to convert scanned images of words from books into a digital format. It did this by presenting the user with two words: a control word and an unknown word. The control word is known to the system and serves the primary purpose of detecting human presence, by matching the known value against the value the user enters. The other word is unknown: a single word taken from a scanned page of a book or any other source Google chooses. The assumption is that if a person correctly identifies the control word, they probably also correctly identified the unknown word, thus providing Google with a means to digitize entire libraries of books one word at a time.
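The verify-and-crowdsource scheme described above can be sketched in a few lines. This is a simplified illustration, not Google's actual implementation; the function name, the separate control/unknown inputs, and the vote dictionary are all hypothetical:

```python
def verify_and_collect(expected_control: str, control_input: str,
                       unknown_input: str, votes: dict) -> bool:
    """Return True if the user passes the human check.

    On success, record the user's transcription of the unknown word as
    one crowdsourcing "vote"; a real system would accept the reading
    once enough independent votes agree.
    """
    # Human check: the control word must match the known answer.
    if control_input.strip().lower() != expected_control.lower():
        return False  # failed the check; discard the unknown-word guess

    # Human verified: count their reading of the unknown word as a vote.
    guess = unknown_input.strip().lower()
    votes[guess] = votes.get(guess, 0) + 1
    return True
```

Note that a wrong control word causes the unknown-word guess to be thrown away entirely, which is what makes the crowdsourced transcriptions trustworthy: only answers from verified humans are counted.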

Herein lies the problem: if unknown content is being displayed on our website by reCAPTCHA, how can we know whether it is appropriate? As a business, we design our website to look professional, and the possibility of displaying offensive content to clients is a real problem for us. We didn’t have cause for concern until recently, when Google released a new version of reCAPTCHA that introduced a form of human verification prompting the user to match images to a phrase. This version uses the same principle of unknown and control data (in this case, images), and during our initial testing we came across an image with inappropriate content.

For this reason, we have decided to seek an alternative human verification method that cannot display inappropriate or offensive content.

If you have had any similar experiences, or just have some thoughts to share, let us know in the comments section below. We would love to hear from you. And, as always, thanks for reading.
