this post was submitted on 29 Aug 2023
66 points (95.8% liked)
Technology
TBF, I don't think the purpose of this watermark is to prevent bad actors from passing AI images off as real. That would be a welcome side effect, but it's not why Google wants this. Ultimately, this is supposed to prevent AI training data from being contaminated with other AI-generated content. Imagine a training set containing a million images generated by previous models, with mangled fingers and crooked eyes: it would be hard to train a good AI out of that. Garbage in, garbage out.
So theoretically, those of us who put original images online could add this invisible watermark to make AI models leave our stuff out of their "steal this" pile?
Yeah, actually, that has a good "taste of your own medicine" vibe.
AI-generated images are becoming increasingly realistic; AI can't tell them apart from real ones anymore.
IIRC, AI models getting worse after being trained on AI-generated data is an actual issue right now. Even if we (or the AI) can't distinguish them from real images, there are subtle differences that can compound into quite large ones if the AI is fed its own work over several generations, leading to degraded output.
I'm not sure that's the case. For instance, a lot of smaller local models leverage GPT-4 to generate synthetic training data, which drastically improves the smaller models' output quality. The issue arises when there is no QC on the generated output. The same applies to Stable Diffusion.
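For anyone who wants to poke at the feedback loop discussed above, here is a toy sketch, not a claim about how any real image model behaves. It assumes a hypothetical "model" (`generate`) that reproduces its training data but drops the rare tail samples, and a hypothetical QC rule (`qc_threshold`) that rejects a synthetic batch when its spread falls too far below that of the real data. All names and numbers are made up for illustration.

```python
import statistics


def generate(training_data, keep_within=1.5):
    """Toy 'model': reproduces its training data but loses rare tail samples.

    Assumption for illustration only: anything further than `keep_within`
    standard deviations from the mean is never generated.
    """
    mean = statistics.fmean(training_data)
    spread = statistics.pstdev(training_data)
    return [x for x in training_data if abs(x - mean) <= keep_within * spread]


def simulate(real_data, generations, qc_threshold=None):
    """Return the spread of the training set at each generation.

    Each generation trains on the previous generation's output. If
    `qc_threshold` is set, a synthetic batch is rejected (and the real data
    reused) whenever its spread drops below that fraction of the real spread.
    """
    real_spread = statistics.pstdev(real_data)
    data = list(real_data)
    spreads = [real_spread]
    for _ in range(generations):
        synthetic = generate(data)
        failed_qc = (
            qc_threshold is not None
            and statistics.pstdev(synthetic) < qc_threshold * real_spread
        )
        if failed_qc:
            synthetic = list(real_data)  # QC failed: fall back to real data
        data = synthetic
        spreads.append(statistics.pstdev(data))
    return spreads


# Stand-in for "real" data: evenly spaced values from -50 to 50.
real = [float(x) for x in range(-50, 51)]

print("no QC:  ", [round(s, 2) for s in simulate(real, 5)])
print("with QC:", [round(s, 2) for s in simulate(real, 5, qc_threshold=0.9)])
```

Without QC, the variety in the data shrinks a little every generation, because each small loss gets baked into the next training set. With the QC check, the degraded batches never make it back in. Real pipelines are far messier, but the shape of the argument is the same.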