overview for VileOnyx

Please God Change the Image Gen Back in c/perchance@lemmy.world

VileOnyx@lemmy.world 2 points 1 month ago (1 children)

I wasn't aware of them. Could you link a few? I'd love to try them out.

VileOnyx@lemmy.world 0 points 1 month ago* (last edited 1 month ago) (1 children)

You are genuinely one of the most pathetic, childish, and insufferable people I've ever encountered. Confidently wrong, confidently unapologetic, and convinced of your own superiority despite a high-school-level edgelord demeanor. I'd be inclined to expect you're very similar to a person I know. Her name is Carly. She is a heroin addict living in a trailer with her parents, raising 4 kids from 4 separate fathers and constantly making posts about how there's no good men left in the world. Your saccharine fake optimism and superiority complex ring the same as Honey Boo Boo and her ilk. Your line about contending with intellect falls flat when you're making a fool of yourself the way you are.

But my incredible disdain for your very existence aside, allow me to actually reply to the things you said so you don't just spout a bunch of useless trash in response as you've done twice now.

The general sentiment seems to agree with me. We have no idea what your use cases are, or your results. We don't care how "good" your supposed results are, nor your prompting strategies because they have absolutely nothing to do with our use cases. We are making images in a very particular style that the new model simply cannot generate correctly. As stated, I've changed my prompting strategies multiple times in hopes of combating the ineptitude of the new model, to no avail. I'll post some prompts and results based on my many strategies, both so you can see how idiotic your "skill issue" narrative is and so others can strike these methods off their lists. I've tried more but for brevity I'll include 3.

Tag-based strategy: "digital illustration, painterly, tumblr webcomic aesthetic, anime-influenced, western indie-cartoon, cel shading, soft airbrush, high definition, pale man, black hair, vampire fangs, 2000s fashion, leather jacket, white tank top, silver facial piercings, black skinny jeans, glaive, silver smoky aura, concert stage, crowd of thousands, closeup, serious expression, 2010s style, sharp shadow shapes, feathered edges." Results: The model could barely hold together even a human-looking character.

Hyper-specificity strategy: "The scene is a professional digital illustration executed with a deliberate painterly hand, reminiscent of the highly stylized webcomics found on social media platforms during the late 2010s. The central figure is a man with alabaster, porcelain skin that appears almost bluish under the harsh glare of artificial lighting. His hair is a deep, obsidian black, styled in a slightly tousled manner that suggests recent movement or wind. Protruding slightly from his upper lip are two distinct, sharp vampire fangs that catch the light with a subtle glint. His attire is a meticulous recreation of late 2000s alternative fashion, featuring a heavy, lapeled leather jacket worn open to reveal a simple, thin white cotton tank top underneath. The jacket shows subtle signs of wear, with micro-textures suggesting a grainy surface and slight scuffing at the seams and elbows. Multiple silver piercings adorn his face, including a small hoop in his nostril, a labret stud, and rings along his eyebrow, each reflecting the flickering silver glow from his mystical weapon. He wears tight, pitch-black skinny jeans made of denim that stack slightly at the ankles. In his grip is a long-handled glaive, its heavy metallic blade pulsing with a supernatural, ethereal silver aura that behaves like thick, heavy smoke coiling in the air. He stands center stage at a massive concert venue, with the dark silhouettes of an immense crowd of thousands stretching into the distance, their forms blurred by a shallow depth of field to keep the focus on the subject. The lighting is complex, utilizing a hybrid technique where the primary shadows are blocked out in solid, clean cel-shaded shapes, which are then meticulously softened at the edges using digital airbrushes to create a feathered brush-shaded look. 
The composition is a tight closeup focusing on his serious, mature facial features, avoiding any exaggerated cartoonish proportions in favor of a grounded, dramatic shot that remains within the bounds of a stylized indie-cartoon aesthetic. Every detail is rendered in crisp high definition. The background lighting consists of deep purples and blues to contrast with the sharp silver light of the glaive." Results: This strategy produced slightly higher-quality images, but at the cost of the entire art style. It is as if the sheer volume of other detail makes the model forget the style completely and fall back on the "safe" bet: an image anchored entirely in realism.

Optimization strategy: "(masterpiece:1.2), (best quality), digital painting, tumblr webcomic style, anime indie-cartoon hybrid, a pale man with sharp fangs and messy black hair, (wearing an unzipped leather jacket over a white tank top:1.1), silver labret and eyebrow piercings, black skinny jeans, (holding a glaive with silver smoke aura:1.2), standing on a backlit concert stage, thousands of blurred audience members in the background, closeup portrait, serious facial structure, high contrast, cinematic lighting, (cel shading:0.8), soft airbrushing, painterly strokes, 8k, detailed." Results: This one was honestly a long shot, especially given the demise of parentheses-based controls, but I figured I might as well include it in the mix. This one was notable, however, because... well... just look.

This one was incredibly inconsistent, likely because the new model doesn't understand operands, parameters, etc. But if you look at the third on top, it actually managed to get the art style (mostly) correct, and then proceeded to never do that again. A shame this image is also completely unusable because, like many, many of the others, it just decided to ignore half the prompt and do its own thing. Where's the glaive? Why is the smoke more like ash? Why is he clearly some kind of zombie rather than a vampire? Who knows? Certainly not me, given the word zombie is not at all in the damn prompt.

As you can see from my, let's see here... ~~18~~ Edit: 25 articles of proof, the new model is incapable of following instructions properly for some tasks regardless of how you structure them. Indeed, it seems to work well with realism now, but at the cost of being utter codswallop at anything else. You keep telling me to deliver "constructive criticism", but I've already delivered it. Tooling an AI isn't something you can really guide well. You seem to think the devs have the ability to open up the hood and screw around a bit to issue some hotfixes. That's not how AI works. It's a black-box process through and through, and the only genuine recommendation that can reasonably be made to them is precisely what I and many, many others have asked for and offered our financial support for: a rollback. That is the only way we are going to get the same quality we had before. It is a simple request, and it comes with a very real (monetary) incentive. They're free not to take it, but their platform will suffer as ad revenue sources like myself and my peers are forced to use other tools. I love this site and its mission, and I want to support it. Even just a donation link like a Patreon or Ko-fi would likely be enough to cover the reinstatement of the un-downgraded model, as I'm sure many would flock to support it.

You seem to enjoy the new model for its capabilities with realism. That's fine. But don't pretend like other people's problems don't exist because your use case still works fine. So please, please, spare me your incessant droning about it being a "skill issue", as you are simply entirely wrong, and I just don't have the time or effort to explain to you precisely why for a third time. Good day.

(guidanceScale:::1)

Edit: Having seen this response after writing allat, hang on one second. I'll follow your EXACT advice. I'll use the hyper-specific strategy since that actually managed to render faces without buckeye or dopey smiles. I'll put it right at the very tippy top so it's the first thing in the prompt.

(guidanceScale:::1)

Dear god

Art style closeish, no crowd, glaive is fishhook, hands are genuinely slop

Schmingis, the last shmedi

(guidanceScale:::2)

Shoulder stabbed, face is glue.

Should be grey, why's he blue?

Now he's pink, and sweaty too.

(guidanceScale:::4)

FABIO IS IN TOWN

Big surprise, it didn't work. It's almost like you have no idea what you're talking about.

VileOnyx@lemmy.world 1 points 1 month ago* (last edited 1 month ago) (13 children)

I'm using the exact same generator that I've always used. I work with AI every day, and I've changed my prompting strategies multiple times throughout this change to try to give the engine the benefit of the doubt.

"Digital illustration with a painterly, tumblr webcomic look. It blends anime-influenced character design with Western indie-cartoon expressiveness. Shading is a hybrid of cel and soft airbrush. Planes are defined by clear shadow shapes, then feathered with soft brushes. Unique photo-quality visuals in high definition.

A man with pale skin, black hair, and fangs dressed in a stylish late 2000s outfit consisting of an open leather jacket over a white tank top, silver facial piercings, and black skinny jeans. He is wielding a glaive that emits a smoky, silver aura from its blade. He is standing on a stage, surrounded by a crowd of thousands.

Closeup of the character with visible facial features, composition matches that of late 2010s era tumblr art posts. Serious, non-cartoonish features."

That is my exact prompt. It's on par with many other prompts I had been using before, now adapted to the new scheme, which relies less on negative prompting and () emphasis indication. I pretty distinctly ask for something specific, and define the parameters for that exactly. I state directly that the image should not have exaggerated cartoonish features, and yet I get this fucking slop.

I can't use any of these images. No combination of prompting principles seems to fix this issue. If you genuinely think that is better than this series I was able to make with the previous generator:

You are genuinely mentally deficient. Your "help" would change nothing.

Edit: One of the site admins also directly posted that the image generator is empirically downgraded for the time being because the text model is taking up financial resources while they debug it. My offer, like that of many others, is genuine. If the site needs more money, I'd be happy to provide some for the same quality I'd been getting before, as, quite frankly, there was nothing else good out there for non-scam prices. This site has a genuine interest in generative media and I'd love to support it, even if it's just through a donation link. So, kindly shut your fucking mouth.

Double Edit: Inb4 you say some shit about "OH WELL YOU HAVE THE WORD CARTOON IN YOUR PROMPT": western indie-cartoon is an art style, one which doesn't necessarily have the proportions of cartoon characters, but is rather defined by solid colors, hard lines, and complex character design with a few defining traits for each character. Think of Avatar: The Last Airbender. That would be an example of the style. But just to make sure you understand how little that matters, I did another generation omitting that term entirely in favor of defining the facets of the style one by one. Here were the results:

Still cartoony, warped, barely recognizable slop.

-2

Please God Change the Image Gen Back (lemmy.world)

submitted 1 month ago by VileOnyx@lemmy.world to c/perchance@lemmy.world

35 comments

The new version of the image generator is so incredibly garbage it's genuinely unusable. Instances of AI artifacting have multiplied exponentially, the system barely follows prompts, and characters are jumbled messes of blurry features. It's awful. If it's an issue of expense, I would be willing to donate along with countless others if you took a collection. Please, as a community, we beg of you to change it back. I would be happy to pay for access to the better model, or a subscription for improved functionality, just anything but this.