Seth Weintraub

Apple versus Google

Google acquires reCAPTCHA in two-for-one deal

Google ceremoniously announced today that it is acquiring a small academic company called reCAPTCHA, which builds software that tries to differentiate humans from automated programs on web form submissions.

CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) are those random letters you have to enter when submitting a form on a webpage, often a comment (Computerworld uses reCAPTCHA for our commenting system). Interestingly, reCAPTCHA's founder, Luis von Ahn, is one of the people who came up with the term in 2000. They gave up trying to trademark it in 2008.

Wikipedia defines CAPTCHAs as: a type of challenge-response test used in computing to ensure that the response is not generated by a computer. The process usually involves one computer (a server) asking a user to complete a simple test which the computer is able to generate and grade. Because other computers are unable to solve the CAPTCHA, any user entering a correct solution is presumed to be human. Thus, it is sometimes described as a reverse Turing test, because it is administered by a machine and targeted to a human, in contrast to the standard Turing test that is typically administered by a human and targeted to a machine. A common type of CAPTCHA requires that the user type letters or digits from a distorted image that appears on the screen.

There are now many types of CAPTCHAs on the Internet. What's interesting about reCAPTCHA is how it works (and why it is doubly valuable to Google). reCAPTCHA takes words from news clippings, articles and old books that can't be read by OCR software - the same OCR software that hackers are using to try to get through CAPTCHAs. It then shows each of these unreadable words to a human, paired with another word whose text it already knows. The user enters both words. Only the known word is actually tested - if the user gets it right, reCAPTCHA assumes the answer for the other word is right too, and it has now learned an additional word to use in other challenges.
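The two-word scheme described above can be sketched roughly as follows. This is only an illustration, not reCAPTCHA's actual code: the class name TwoWordCaptcha, the image ids, and the min_votes threshold are all assumptions made for the example. With min_votes=1 it behaves the way this article describes, trusting a single correct user.

```python
import secrets
from collections import Counter


def normalize(text):
    """Compare answers case-insensitively, ignoring surrounding spaces."""
    return text.strip().lower()


class TwoWordCaptcha:
    """Hypothetical sketch of a control-word / unknown-word CAPTCHA."""

    def __init__(self, known_words, unknown_ids, min_votes=1):
        self.known_words = dict(known_words)  # image id -> verified text
        self.unknown_ids = list(unknown_ids)  # image ids OCR could not read
        self.votes = {}                       # image id -> Counter of human answers
        self.min_votes = min_votes            # assumption: answers needed to trust a word

    def new_challenge(self):
        """Pair one known (control) word with one unknown word."""
        if not self.unknown_ids:
            raise RuntimeError("no unread words left to pair with a control word")
        control_id = secrets.choice(list(self.known_words))
        unknown_id = secrets.choice(self.unknown_ids)
        return control_id, unknown_id

    def grade(self, control_id, unknown_id, control_answer, unknown_answer):
        """Pass the user only if the control word is typed correctly."""
        expected = self.known_words[control_id]
        if normalize(control_answer) != normalize(expected):
            return False  # failed the test, so the other answer is discarded

        # The user passed; record their reading of the unknown word.
        if unknown_id in self.unknown_ids:
            tally = self.votes.setdefault(unknown_id, Counter())
            tally[normalize(unknown_answer)] += 1
            best_guess, count = tally.most_common(1)[0]
            if count >= self.min_votes:
                # The word is now "learned" and can serve as a control word.
                self.known_words[unknown_id] = best_guess
                self.unknown_ids.remove(unknown_id)
                del self.votes[unknown_id]
        return True


captcha = TwoWordCaptcha({"img1": "morning"}, ["img2"])
control_id, unknown_id = captcha.new_challenge()
passed = captcha.grade(control_id, unknown_id, "Morning", "overlooks")
print(passed, captcha.known_words)
```

A real deployment would keep the challenge state on the server and would not tell the user which of the two words is the control. Raising min_votes above 1 is one way to make it harder for a single prankster to teach the system a wrong word.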

This accomplishes two things, both of which would be useful to Google. 

One, it helps Google keep automated machines from signing up for its many services. It also keeps comment spam on Blogger and its other content management systems to a minimum.

Two, and perhaps more importantly, it provides a way for Google to harness the power of its users to help recognize passages in old or damaged works.

The software isn't perfect. Notorious/hilarious hacker group Anonymous (part of 4chan) broke reCAPTCHA's technology to rig Time's poll of the 100 most influential people of the year - the Marblecake incident. They did this with a combined brute force/guessing algorithm. Google will undoubtedly try to avoid this type of exploitation in the future.

Why not try the reCAPTCHA system by leaving a comment below?

What People Are Saying

Computerworld version of CAPTCHA sux!

It is ridiculous to allow one attempt and lock
out an IP address if it fails! The CAPTCHA is
quite often impossible to interpret because of
the mashing of the various characters and fonts
together. You often have to select another one
over and over until one is able to be read!

If you should happen to get it wrong, your
comment ends up in limbo land until a real person
has a chance to look at it, whenever that happens
to take place.

You should allow more than one attempt or find
a better way to prevent SPAM! I have seen
several web sites that ask for an answer to a
simple question, or to include a couple words
from the article in question, etc.

Please fix this or you are going to find less
people that have time for the hassle.

Thank you Computerworld!

My original comment was responded to promptly
and either the captcha has been changed recently,
or there was another reason replies were being
flagged. Either way, I tested the captcha and
it is now allowing more than one attempt.

Thanks to Computerworld for caring about their
loyal readers!

Captcha != Spam

Computerworld blogs allow multiple attempts to submit a correct CAPTCHA. An error will result in "The reCAPTCHA code you entered was incorrect" followed by another CAPTCHA.

If your comment was marked as spam, it is because of the content. It will be reviewed and published shortly. We apologize for the delay but find this preferable to a looser spam policy that would have our readers wading through dozens of illegitimate comments.

Ken Gagne
Associate Editor,
Community Content
Computerworld.com

Testing..... This is a test.

Testing.....

This is a test. Had this been actual spam.........

Use GNU/Linux and you won't

Use GNU/Linux and you won't need captchas.

still vlnrbl

ye its helpful bt nt 4 log time i think its stone 4 hackers n 4 many othr usrs its dfntly gonna blow. intelignt softwr cn easily figrout thrw this n trust me many inteligent egncis has figured out this its just to come out of the office. i think we need to think beyond of this rather than just praising....

testing

testing

Test

Test

Obviously people can assume

Obviously people can assume the worst of Google and they can subvert the purpose of reCaptcha, but entering invalid data as a prank or mischief making activity seems juvenile. We have millions of pages of paper-based information that could be invaluable if correctly rendered via OCR technology, and if I can advance that process then I will do my best to correctly render any characters submitted.

The Fate of reCAPTCHA is doomed.

The more people that find out they are helping the computer to learn the words it can't figure out, the more people are likely to deceive the computer by guessing the word it didn't know and feeding it false information. I just did so with this post.