Ask Lemmy
A Fediverse community for open-ended, thought provoking questions
What I want from AI companies is really simple.
We have a thing called intellectual property in the United States of America. If I decided to make a Jellyfin instance that I charged access to, containing material I didn't own, and somehow advertised this service on the stock market as a publicly traded company, you can bet your ass that I'd have a one-way ticket to a defense seat in court.
AI companies, on the other hand, operate entirely on data they don't own and don't pay licensing for ANY of the materials that are used to train their neural networks. In their eyes, any image, video (TV show/movie) or book that happens to be posted on the Internet is fair game. This isn't how intellectual property works for individuals, so why exactly would a publicly traded company get an exception to this rule?
I work a lot in the world of FOSS and have a firm understanding that just because code is there doesn't make it yours. This is why we have the GPL for licensing. In fact, I'll take it a step further and say that the entirety of AI is one giant licensing nightmare, especially coding AI that doesn't actually attribute license details to the code it's sampling from. (Sampling code is notably different from, say, learning from it. Learning implies self-agency, not corporate ownership.)
It feels to me that the AI bubble has largely been about pushing AI so hard and fast that people ended up investing in something with a dubious legal status in the US. Nobody stopped to ask whether the data that Facebook had on their website (as one example, they aren't alone in this) was actually theirs to own, or what the repercussions of these types of decisions would be.
You'll also note that tech and social media companies are quick to take ownership of data when it benefits them (artists' works, intellectual property that isn't theirs, random user posts about topics) and quick to deny ownership when it becomes legally burdensome (CSAM, illicit drug deals, etc.), to a degree that no individual would be granted. Hell, I'm not even sure a "small" tech startup would be granted this level of double-speak and hypocrisy.
With this in mind, I am simply asking that AI companies pay for the data that they're using to train AI. Additionally, laws must be in place that allow for the auditing of all materials used to train an AI, with the legal intent of verifying that all parties are paid accordingly. This is how every other business works. If AI were somehow granted an exception, wouldn't it be braindead easy to run every "service" through an AI layer in order to bypass any and all copyright laws?
Otherwise, if Facebook and others want to claim that data hosted on their website is theirs to own and train off of, well, great, but there should be no exceptions to this, and they should not be allowed to host materials they then claim no ownership over. So pictures of IP they don't own, or materials they want to claim they have no ownership over, must be removed from the platform. I would much prefer the first of these two options, however.
edit: I should note that AI for educational purposes could be granted an exception to this under fair use (for universities), but it would still be required to cite all sources used to produce the works in question (which is normal for academics in the first place) and would also come with some strict stipulations on using this AI as a "product" (it would basically be moot, much like some research papers). This is basically the furthest I'm willing to go for these companies.