this post was submitted on 10 Jul 2023
206 points (100.0% liked)
Technology
Dude, tell me, why do you think they've been doing this only with books and art, but not music?
That's because music really has people protecting their assets. You can have your own opinion about it, but that's the only reason they haven't ABUSED companies' and people's work in music.
It's not reading; it's the equivalent of me taking a movie, making a function, charging for it, and then being displeased when the creators demand an explanation.
What is the meaning of "making a function" in your sentence?
Like a showing in the theater.
Seems like my grammar's still shit, sorry.
There are a few reasons why music models haven't exploded the way that large language models and generative image models have. Maybe the strength of the copyright holders is part of it, but I think the technical issues are a bigger obstacle right now.
Generative models are extremely data-inefficient. The Internet is loaded with text and images, but there isn't as much music.
Language and vision are the two problems that machine learning researchers have been obsessed with for decades. They built up "good" datasets for these problems and "good" benchmarks for models. They also did a lot of work on figuring out how to encode these types of data to make them easier for machine learning models. (I'm particularly thinking of all of the research done on word embeddings, which are still pivotal to large language models.)
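To illustrate the word-embedding idea mentioned above: words are mapped to dense vectors so that related words end up geometrically close, which is what makes them useful inputs for language models. Here's a minimal sketch; the vectors below are made up for illustration, whereas real embeddings (e.g. word2vec or GloVe) are learned from large text corpora.

```python
import numpy as np

# Toy embedding table: each word maps to a dense vector.
# These particular vectors are invented for this example,
# not taken from any trained model.
embeddings = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "queen": np.array([0.7, 0.7, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    """Similarity of two vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Related words get similar vectors, so their cosine similarity is higher.
sim_related = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_unrelated = cosine_similarity(embeddings["king"], embeddings["apple"])
print(sim_related > sim_unrelated)  # True
```

The point is that the embedding turns discrete symbols into a continuous space where "similar meaning" becomes "small distance," and that representation work took years for text and images; nothing equally mature exists for raw audio.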
Even so, there are some fairly impressive generative music models.
Example of music generation: MusicLM. The abstract mentions having to create a new dataset to get these results.
Building that kind of dataset is probably much harder with music.