this post was submitted on 27 Sep 2024
777 points (98.4% liked)


Anyone who has been surfing the web for a while is probably used to clicking through a CAPTCHA grid of street images, identifying everyday objects to prove that they're a human and not an automated bot. Now, though, new research claims that locally run bots using specially trained image-recognition models can match human-level performance in this style of CAPTCHA, achieving a 100 percent success rate despite being decidedly not human.

ETH Zurich PhD student Andreas Plesner and his colleagues' new research, available as a pre-print paper, focuses on Google's reCAPTCHA v2, which challenges users to identify which street images in a grid contain items like bicycles, crosswalks, mountains, stairs, or traffic lights. Google began phasing that system out years ago in favor of an "invisible" reCAPTCHA v3 that analyzes user interactions rather than offering an explicit challenge.

Despite this, the older reCAPTCHA v2 is still used by millions of websites. And even sites that use the updated reCAPTCHA v3 will sometimes use reCAPTCHA v2 as a fallback when the updated system gives a user a low "human" confidence rating.
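The fallback behavior described above can be sketched roughly as follows. This is only an illustration, not Google's or any site's actual implementation: reCAPTCHA v3 reports a score between 0.0 and 1.0 (higher meaning more likely human), but the function name `decide`, the `Decision` type, the action labels, and the 0.5 cutoff used here are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed cutoff for illustration; each site picks its own threshold.
HUMAN_SCORE_THRESHOLD = 0.5


@dataclass
class Decision:
    action: str   # "allow" or "challenge_v2" (hypothetical labels)
    score: float  # the v3 score that led to this decision


def decide(score: float, threshold: float = HUMAN_SCORE_THRESHOLD) -> Decision:
    """Let high-confidence users through; fall back to a v2-style grid otherwise."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= threshold:
        return Decision("allow", score)
    # Low "human" confidence: show the explicit image challenge instead.
    return Decision("challenge_v2", score)
```

Under these assumptions, a visitor scored 0.9 would pass silently, while one scored 0.2 would be shown the image grid.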

[–] [email protected] 43 points 1 month ago (1 children)

Aren't these Captchas designed to get training data for AI models anyway?

"System does what it was designed to do" doesn't feel that surprising...

[–] [email protected] 5 points 1 month ago (1 children)

Aren’t these Captchas designed to get training data for AI models anyway?

Yes and no. The captchas are just meant to be hard for computers to solve but easy for humans. People saw that and thought, "if we're making people do this, we might as well have them do something useful." It's not meant to be malevolent; the purpose is still stopping bots, and training AI models is a side effect.

[–] [email protected] 3 points 1 month ago (1 children)

No, you're wrong, the Traffic Light examples ARE specifically to gather data to train models. Being a good Captcha was just a byproduct of that. If people just wanted a good captcha they wouldn't need hundreds of millions of photos of street lights and bicycles.

[–] [email protected] -2 points 1 month ago (1 children)

No, you’re wrong, the Traffic Light examples ARE specifically to gather data to train models.

No, you're wrong, because the sites that embed those captchas on their pages are not doing that to help Google.

If people just wanted a good captcha they wouldn’t need hundreds of millions of photos of street lights and bicycles.

Yes, they are getting something productive out of the human labor that would be done anyways. Trust me as a web developer, and web scraper, some kind of captcha is necessary for many free services to be useful/economically viable. The core of a good captcha is just making it marginally more expensive for the scraper/bot than it is for you.

[–] [email protected] 2 points 1 month ago (1 children)

The sites don't create the captcha, you yourself just said it was embedded there.

[–] [email protected] -2 points 1 month ago (1 children)

They embed for a reason... And the captchas wouldn't exist if they weren't embedded anywhere

[–] [email protected] 1 points 1 month ago (1 children)

Finitebanjo is right. Yes, they are used to fight spam and bots, but the way they do it is picked intentionally to train AI.

https://medium.com/@yennhi95zz/how-google-trains-ai-with-your-help-through-captcha-876cb4eb4d01

Also from the Wikipedia article "Google profits from reCAPTCHA users as free workers to improve its AI research." https://en.m.wikipedia.org/wiki/ReCAPTCHA

[–] [email protected] 0 points 1 month ago

the way they do it is picked intentionally to train AI.

Yes like I said, the challenges were picked to be useful. But some form of challenge would've been chosen regardless.