this post was submitted on 10 Jan 2026
208 points (94.8% liked)
Technology
No, you cannot reasonably assume that. It absolutely did not store the visual elements. What it did was store some floating-point values associated with keywords the source image had been pre-classified with. During training, it nudges those values up or down by a small amount each time it encounters further images tagged with those same keywords.
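To make that concrete, here's a deliberately oversimplified sketch in Python/numpy — nothing like Stable Diffusion's real architecture, and every name and value here is made up for illustration. A single weight matrix (the stored "floating point values") maps a keyword embedding to image features, and each training step nudges those values slightly toward the current image:

```python
import numpy as np

# Deliberately oversimplified sketch -- NOT Stable Diffusion's real design.
# A weight matrix W (the stored "floating point values") maps a keyword
# embedding to a vector of image features.
rng = np.random.default_rng(0)
dim_text, dim_img = 8, 16
W = rng.normal(scale=0.1, size=(dim_img, dim_text))

def training_step(W, keyword_vec, image_features, lr=0.1):
    """Nudge W a small amount toward this (keyword, image) pair."""
    pred = W @ keyword_vec                # model's current guess
    error = pred - image_features         # how far off it is
    grad = np.outer(error, keyword_vec)   # gradient of squared error w.r.t. W
    return W - lr * grad                  # tiny increase/decrease per value

# Pretend embedding for one keyword and pretend features of a captioned image.
keyword = rng.normal(size=dim_text)
keyword /= np.linalg.norm(keyword)
target = rng.normal(size=dim_img)

for _ in range(200):
    W = training_step(W, keyword, target)

print(np.allclose(W @ keyword, target, atol=1e-3))  # True: values converged
```

No individual step copies anything; the image only influences the model through many small adjustments to those floating-point values.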
What the examples demonstrate is a lack of diversity in the training set for those very specific keywords. There's a reason they chose Stable Diffusion 1.4 and not Stable Diffusion 2.0 (or later versions): the model was drastically improved after that. These sorts of problems (not-diverse-enough training data) are considered flaws by the very AI researchers creating the models. It's exactly the kind of thing they don't want to happen!
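A toy way to see why diversity matters — again just an illustrative numpy sketch with invented data, not how diffusion models actually work: if a keyword appears alongside only one training image, the best fit for that keyword is that image reproduced verbatim, while with hundreds of diverse images the individual images wash out into a blend.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_keyword(images):
    # Toy "model" for one keyword: the least-squares fit to its
    # training images is simply their mean.
    return np.mean(images, axis=0)

one_image = rng.normal(size=(1, 16))      # keyword paired with a single image
many_images = rng.normal(size=(500, 16))  # same keyword, 500 diverse images

memorized = fit_keyword(one_image)
blended = fit_keyword(many_images)

print(np.allclose(memorized, one_image[0]))                 # True: exact copy
print(np.linalg.norm(blended) < np.linalg.norm(memorized))  # True: washed out
```

That's the failure mode the paper's prompts were probing: keywords whose training images were near-duplicates, so the "blend" is basically the original.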
The article seems to imply that this is a common problem that happens constantly and that the companies creating these AI models just don't give a fuck. This is false. Flaws like this leave your model open to attack (and let competitors figure out your weights; not that it matters with Stable Diffusion, since that version is open source), not just copyright lawsuits!
Here's the part I don't get: clearly nobody is distributing copyrighted images by asking AI to do its best to recreate them. When you do this, you end up with severely shitty hack images that nobody wants to look at. Basically, if no one is actually using these images except to say, "Aha! My academic research uncovered this tiny flaw in your model that represents an obscure area of AI research!", why TF should anyone care?
They shouldn't! The only reason why articles like this get any attention at all is because it's rage bait for AI haters. People who severely hate generative AI will grasp at anything to justify their position. Why? I don't get it. If you don't like it, just say you don't like it! Why do you need to point to absolutely, ridiculously obscure shit like finding a flaw in Stable Diffusion 1.4 (from years ago, before 99% of the world had even heard of generative image AI)?
Generative AI is just the latest way of giving instructions to computers. That's it! That's all it is.
Nobody gave a shit about this kind of thing when Star Trek was pretending to do generative AI in the Holodeck. Now that we've got the pre-alpha version of that very thing, a lot of extremely vocal haters are freaking TF out.
Do you want the cool shit from Star Trek's imaginary future or not? This is literally what computer scientists have been dreaming of for decades. It's here! Have some fun with it!
Generative AI uses up less power/water than streaming YouTube or Netflix (yes, it's true). So if you're about to say it's bad for the environment, I expect you're just as vocal about streaming video, yeah?
You lost me there. Conflating a fictional future utopia with the product you're trying to sell is a cheap trick.
Anyone who uses this bad-faith tactic loses all credibility. Post read and disregarded.
Sell? Only "big AI" is selling it. Generative AI has infinite uses beyond ChatGPT, Claude, Gemini, etc.
Most generative AI research and improvement is academic in nature, developed by a bunch of broke college students trying to earn graduate degrees. The discoveries of those people are then used by big AI to improve their services.
You seem to be arguing from the standpoint that "AI" == "big AI", but this is not the case. Research and improvements will continue regardless of whether ChatGPT, Claude, etc. continue to exist. That's especially true for image AI, where free, open source models are superior to the commercial products.