this post was submitted on 10 Jul 2023

[–] [email protected] 102 points 1 year ago* (last edited 1 year ago) (40 children)

In evidence for the suit against OpenAI, the plaintiffs claim ChatGPT violates copyright law by producing a “derivative” version of copyrighted work when prompted to summarize the source.

Both filings make a broader case against AI, claiming that by definition, the models are a risk to the Copyright Act because they are trained on huge datasets that contain potentially copyrighted information

They've got a point.

If you ask AI to summarize something, it needs to know what it's summarizing. Reading other summaries might be legal, but then why not just read those summaries first?

If the AI "reads" the work first, then it would have needed to pay for it. And how do you deal with that? Is a chatbot treated like one user? Or does it need to pay for a copy for each human that asks for a summary?

I think if they had paid for a single ebook library subscription, they'd be fine. However, the article says they used pirate libraries so it could read anything on the fly.

Pointing an AI at pirated media is going to be hard to defend in court. And a class action full of authors and celebrities isn't going to be a cakewalk. They've got a lot of money to fight with, and plenty of contacts in copyright law. I'm sure all the publishers are pissed too.

Everyone is going after AI money these days; this seems like the rare case where it's justified.

[–] [email protected] 16 points 1 year ago (4 children)

Can the sources ChatGPT got its information from be traced? What if it got the information from other summaries?

I think the hardest thing for these companies will be validating the information their AI is using. I can see an encyclopedia-like industry popping up over the next couple of years.

Btw I know very little about this topic but I find it fascinating

[–] [email protected] 5 points 1 year ago (3 children)

Yes! They publish the data sources and where they got everything from. Diffusers (Stable Diffusion, Midjourney, etc.) and GPT both use tons of data that was taken in ways that likely violate that data’s usage agreement.

Imo they deserve whatever lawsuits they have coming.

[–] [email protected] 1 points 1 year ago (1 children)

likely violate that data’s usage agreement.

It doesn't seem to be too common for books to include specific clauses or EULAs that prohibit their use as data in machine learning systems. I'm curious if there are really any aspects that cover this without it being explicitly mentioned. I guess we'll find out.

[–] [email protected] 0 points 1 year ago (1 children)

I think with a book your standard digital license / copyright would forbid it, would it not?

[–] [email protected] 1 points 1 year ago

Maybe. I'm interested in the specifics.
