Are You Sure You’re Not a Robot?
Down and out on the CAPTCHA farm.
I do not remember the first time I failed to solve a CAPTCHA, but I remember the most recent. An alert popped up on my screen – “There are errors on your account” – and, after successfully solving just five, I was shadowbanned from my new gig as a CAPTCHA solver. I had earned a grand total of one-fifth of one cent.
A team of researchers at Carnegie Mellon University developed CAPTCHA, which is a contrived acronym for “Completely Automated Public Turing Test to Tell Computers and Humans Apart,” in the early 2000s. CAPTCHAs meaningfully distinguished human and bot activity online for the first time. This advance curbed automated scourges of the internet such as the mass creation of spam email accounts and illegitimate answers in online polls, and CAPTCHAs remain today an essential tool that makes the modern internet usable.
But the computer scientist John Langford, who helped create CAPTCHA twenty years ago when he was a graduate student, is surprised they have lasted as long as they have. “I kind of expected machine learning to eventually succeed in making CAPTCHAs not a thing,” he told me in an interview. “But that hasn’t fully happened yet.”
Instead of disappearing, CAPTCHAs have gotten more complex and more prevalent. It seems one can’t go a day on the internet without needing to solve one. However, reading about the recent advances of AI – from making art to writing poetry – I have become increasingly irked (insulted, even) by these tests that purport to divide the human from the bot. I can foresee a near future where AI will be able to do everything that we can online – a feeling echoed by another creator of CAPTCHA, Luis von Ahn. This feeling has made me more protective of my sense of humanity, and less willing to prove it to Google every time I want to log in. We are surrounded by AI online. Why should I be singled out, when I’m sitting right here, flesh and blood?
So I decided to set a challenge for myself: stop completing CAPTCHAs, in a refusal to prove my humanity to a computer. I understood that the purpose of CAPTCHAs was noble and that they made the internet a better experience, by reducing spam and preventing bots from buying up all the concert tickets, for instance. But still, I wanted to try. In the meantime, I would try to figure out why CAPTCHAs were still around and what their evolution may suggest about the future of the internet. As I dove into the story, though, I came to see another wrinkle: how labor and AI may interact in the future.
In the early 2000s, the internet had a problem: as spammers got better, they were able to write programs that could create countless free email accounts on services like Yahoo! Mail in seconds. This led to an explosion of spam, and it left other online services, such as polls meant to gauge public opinion, inaccurate.
A team of graduate students at Carnegie Mellon – Luis von Ahn, John Langford, and Nicholas Hopper, along with their advisor Manuel Blum – tried to come up with a solution. The group knew they needed some way to differentiate humans and computers online. The test needed to be solvable by every human, however, while keeping the success rate of computers low.
Eventually, the team settled on text recognition. They would distort an image of a word and ask the user to identify what it said. This worked much better than previous experiments: computers were terrible at reading distorted text. Humans, meanwhile – even if they didn’t know what a specific word meant – were much better at identifying what letters were present. One didn’t even need to be literate to solve a CAPTCHA, because it just required a person to match the letters on the screen to the letters on a keyboard.
The test went into effect at Yahoo! Mail and was quickly being used millions of times a day. Over the next decade, however, a few things happened. First, Google bought an updated version of the technology called reCAPTCHA that served to digitize huge amounts of old text. By serving each user two words – one artificially distorted, and one from an old New York Times article – the computer could transcribe those articles using unwitting human participants. Second, though, computers got better at identifying distorted text – to such an extent that, according to a 2014 internal Google study, AI could read the most distorted CAPTCHAs with 99.8% accuracy. Humans, meanwhile? Only 33%.
This, according to John Langford, was always part of the plan. CAPTCHAs were designed as a win-win situation: either they kept computers out, or they helped computers break some heretofore unbreakable problem. Thanks to CAPTCHAs, computers could now read distorted text.
The next development, then, was to trade text for images. A Google reCAPTCHA may ask you to identify the boundaries of a motorcycle by clicking on the squares where that motorcycle exists. An hCAPTCHA (Google’s main independent competitor) may ask you to distinguish between boats moving left and boats moving right.
These new CAPTCHAs, it has seemed to me, are both more difficult and less accurate. I’ve run across tests that ask me to identify buses or cars in an image that has neither; and, of course, I’ve puzzled over whether I should click squares that have only small slices of the edge of a traffic light. I’m not alone in this feeling. Twitter is replete with internet users complaining about CAPTCHAs that are confusing or just plain wrong. “I always be overthinking these,” one meme reads, next to an image of a traffic light whose edge drifts barely into another square, with a profusely sweating Jordan Peele from Key and Peele. CAPTCHAs have two uses: to keep bots out, and to train AI systems to better accomplish certain tasks. (It has been reported that Google uses its reCAPTCHA to train self-driving cars, but Google denies this and claims that its CAPTCHAs are instead used to train Google Maps.) Those difficult tests, then, serve to crowdsource difficult problems.
Nonetheless, machine learning has not progressed enough to solve image-based CAPTCHAs at a high rate, which is why they’re still around. The reason, Langford tells me, is that there is just not that much value in hacking CAPTCHAs, and machine learning research is mostly focused elsewhere. “CAPTCHAs are kind of a stumbling block for completely automated usage” of the internet, he says. “With encryption, you crack the code and everything works. [But] with CAPTCHAs, the value of breaking it is much lower because a free email account is not that valuable. And maybe that’s why people haven’t tried.”
CAPTCHAs are not used for great security, and so spammers have not invested the Herculean effort to create machine learning systems that could defeat them at a high rate. But spam remains ever-present, accounting for 45% of global email traffic in December 2021. That’s because spammers have another, more cost-effective way to get around CAPTCHAs: cheap labor.
My self-imposed CAPTCHA ban stopped me from applying for a job – if they can’t even trust I’m a real human, that doesn’t sound like the kind of company I’d want to work for, I told myself. It also prevented me from checking my full astrological report, which was probably for the best. Is stubbornness rooted in my rising sign? I’ll never know.
I decided I didn’t need to log in to my Airbnb account after all. I dropped my desire to prepay for movie tickets. When watching a soccer game using my VPN, I was prevented from using Google search at all, which told me that it had “detected unusual traffic” from my IP address: perhaps not surprising, given what I assume are the main VPN uses of torrenting movies, downloading porn, and watching obscure location-blocked telenovelas.
But I gave up on my self-imposed ban when I decided it was time to dig into the cheap labor that keeps spammers operating.
Despite the “low value” of email addresses and other internet functions protected by CAPTCHAs, there remains a CAPTCHA-cracking industry of unclear size. A quick Google search turns up several websites that promise quick CAPTCHA solving at very low rates. These websites – “CAPTCHA farms” – offer to solve 1,000 reCAPTCHAs for about three dollars. Solving 1,000 text CAPTCHAs, meanwhile, will cost you only one dollar.
These companies – 2Captcha, Death by Captcha, Kolotibablo, and more – operate openly, with contact information, testimonials, and various accepted payment methods. But who is behind them, and who their clients are, is unclear.
So after a couple weeks of refusing to do CAPTCHAs, I figured I’d break my ban and try out this “guaranteed way to have additional income in Internet [sic],” according to the company 2Captcha. Plus, if this writing stuff didn’t work out, it might be nice to have a backup.
I signed up for an account on Kolotibablo, and within minutes I was able to solve basic text-recognition CAPTCHAs. (Solving reCAPTCHAs required downloading some root-access software to my computer, the idea of which didn’t thrill me.)
In total, I solved only those five CAPTCHAs in around ten minutes, because of low demand for the basic CAPTCHAs I was authorized for, before typing an extra “2” and getting banned. At least someone paid me a little bit to prove my humanity, I thought – even if I didn’t prove it consistently, or for very long. I blamed my failure on being out of practice.
But who is solving thousands of CAPTCHAs, spending hours typing numbers and identifying traffic lights, for pennies? It turns out that, perhaps unsurprisingly, these companies rely on labor from some of the most economically depressed regions of the world. About a quarter of workers on the site Anti-Captcha are from Venezuela, according to data from the site itself; Indonesia, Vietnam, India, Pakistan, the Philippines, and Ukraine round out the top of the list. Kolotibablo, meanwhile, has an entire website dedicated to sharing the “stories” of its workers. These companies say that workers make between twenty-five and eighty cents per hour.
On Kolotibablo’s stories site, people write of facing extreme financial hardship, of being thankful to Kolotibablo for providing a small stream of income in regions where there is otherwise little or no work. Marylu, an older woman from Mexico, says that she cannot find work because of her age, and therefore decided to sign up for the service. Others, meanwhile, talk of mental or physical disabilities that keep them at home. It is unclear how many know who they are helping, or that they are facilitating spam. “It’s a good and honest form of making money,” Marylu writes in Spanish, under photos of her dog, her home, her breakfast, and her computer.
These jobs allow people to work from anywhere with an internet connection, with only their smartphone or computer. And to be sure, two to four dollars a day can stretch further in many regions of the world than in the United States. But it is also clear that these companies care little for the workers’ well-being, despite their claims that they provide easy employment for virtually anyone. I was able to join three Facebook groups for CAPTCHA solvers worldwide, each with thousands of members. The groups’ posts are riddled with complaints from workers that they have been banned from their platform for unclear reasons. “Any advice for when a Kolotibablo account that despite not being banned stops giving you tasks?” one member wrote, in Spanish. Another advises that his account may be secretly banned, like mine was – he can create a new account, but he will lose whatever unwithdrawn earnings were in the other. Others in the group beg for someone to unban their account – which Kolotibablo allows users to do, for a fee of “2,000 reCAPTCHA points.”
Most of the CAPTCHA-solving companies did not reply to my request for comment, except for Death by Captcha. The company touted its ability to solve many popular types of CAPTCHAs, and the ability of people from all around the world to solve CAPTCHAs for it – “whoever wants to work can solve captchas for us” – before refusing further questions about the locations and earnings of its workers. Others, though, seem sensitive to the dissatisfaction of some workers about their low earnings. 2Captcha, in an email to a worker posted on Facebook, pleads for good online reviews and decries those who failed to “[understand] how hard the job is and that earnings are not really high.” That certainly seems true. The highest earner in the world on Kolotibablo over the past seven days, a user from Poland, had solved over 106,000 reCAPTCHAs in that time. They earned a grand total of $110.45.
Anti-Captcha, meanwhile, features an animation on their website that shows their attitude toward workers. “Our advanced quality control system monitors workers’ entries and quickly eliminates cheaters,” it says, above a video of a cartoon superhero with a laser rifle, shooting it into the back of a suited worker who has “Cheater Detected” written above his head. After the cheater is zapped, he disappears into the ground and a crane drops a new worker in his place by the neck. There will always be low-cost labor, it seems to say, somewhere in the world.
The future of CAPTCHAs may not look like the image or text questions to which we have grown so accustomed. John Langford told me that he thinks that the result of a manual CAPTCHA test is probably only one of several signals that providers use to determine humanity – and maybe not a very important one at that. Even though I had refused to answer CAPTCHAs for a few weeks, the reality was that some kind of CAPTCHA was most likely running in the background at all times. I was “proving” my humanity without even knowing it. But what does human activity look like? Who was determining what was “human” anyway, and what did those determinations look like?
Langford, for one, believes that the trade-off for CAPTCHAs is worth it. They’re annoying, yes. And he says that he has failed a CAPTCHA, just like the rest of us. However, “To me a CAPTCHA seems like a necessary evil,” he says. “You can either pay for your accounts or you can have some sort of barrier to complete automation … I think it’s desirable to be able to provide things to people [for free] to help them go about their day.”
But with the prevalence of CAPTCHA farms, and the likelihood of machine learning’s continued advancement, it is unclear how effective CAPTCHAs remain. Maybe their primary value, now, is the ability to get millions of answers to questions of a company’s choosing (Where is the traffic light? What does a spoon look like?) that will influence the makeup of some future AI product. They may well continue to exist as long as we continue to do them – or hire someone else to – whether or not they are effective at keeping bots out.
CAPTCHA farms have been around for at least a decade. But they are part of a broader trend to use poorly paid labor to solve tech problems. The Canadian fast-casual chain Freshii, for example, recently announced that it would start using remote cashiers at certain locations in Ontario – paying workers in Nicaragua $3.75 an hour instead of hiring in-store workers at the local minimum wage. As we venture into a brave new world of prevalent AI, I wonder if there will be a point where technology will facilitate finding cheap labor – instead of facilitating a fully autonomous future. AI and its development are expensive, with the cutting edge limited to Big Tech and certain universities. Much cheaper is finding ways to exploit the inequalities of the world. Until then, I’ll go back to doing CAPTCHAs – but just to prove my humanity, not for extra cash.