this post was submitted on 21 Nov 2025

"Real Photos" are actually cartoons (lemmy.world)

submitted 6 months ago by Phenomenologist@lemmy.world to c/perchance@lemmy.world

I have been experimenting with the "Casual Photo" generator, with mixed results. I find that if I am very careful, I can avoid extra limbs and weird fingers etc., but once I get too specific with my descriptions, all I get back is cartoons, whereas I really want realistic photographs. For example:

"An editorial photograph of 58 year old tall slim English woman standing in the reception of a high end hotel."

This returns lots of low quality results, but a couple that are actually photographic (in the sense that but for minor details in rendering that don't stand out, it would look like a photograph - the texture of the skin, lighting, colour, etc.). This for example:

Once I add a bit more detail, for example, all I get is low quality, cartoon-like results with android-like, plastic-looking, unnatural skin and hair, a cross between an android and a 20 year old woman's Instagram posts (despite specifically asking for a late middle-aged woman):

"Beautiful 58 year old white English woman with dark silver hair, large soulful brown eyes who is fully dressed but has slightly saggy boobs and broad hips, other than that, she is of an average build, and has a pleasantly plump face. She is dressed in a quirky way that shows she is artistic and intellectual. She looks kind but also strong. She is standing, smiling slightly, outside a stucco-fronted terraced house in Chelsea, London."

This is exactly the TYPE of woman I want to convey, but this does NOT look anything like a casual photo; it is very easy to tell straight away from the texture of her skin and the overly perfect colouring that she isn't a real person. I would say this is more like a cartoon, even if it isn't full on stomach-turning anime. Note that I don't particularly care about the weird hands; what bothers me is the lack of realism, which has been dramatically worse since the terrible "upgrade" a few months ago. That was the best of a bad bunch. Most came out looking like this:

Which is about as realistic as a blow-up doll, as well as the fact that they have made her look about 20, and an Instagram attention-whore at that (goes to the bathroom to vomit).

This also seems to happen as soon as I add other people to the picture, or have more than one instance of Perchance open at once, perhaps when I am using too much server power? Could someone please explain? I am happy to generate things more slowly if that means higher photographic quality.

So the tl;dr version is this:

1. When is Perchance going to return to the (relatively) high quality images of several months ago? When the update was done to the silly story-creator thing (for the terminally unimaginative who are incapable of coming up with their own stories), it was said that an upgrade to the photo generator would be done immediately after; instead we have a permanent downgrade.
2. Within the limits of what we have now, how can I get higher quality pictures and photos?
3. Is there a casual photo option that actually generates a casual photo, as in one that could be taken on a phone, rather than a cartoon image or a heavily filtered teenage Instagram image?

[–] Phenomenologist@lemmy.world 2 points 6 months ago (1 children)

Hi Arch, thanks for your response.

I actually find that negative prompting makes it worse, in keeping with the above principle that the more detail specified the less accurate the output is in terms of common sense, anatomy and so on. If I specify nothing negative, the results are mostly at least anatomically accurate, but if I specify that I don't want multiple heads or tattoos, it is likely I will get these. For example the following prompt gets me some pretty good results, like the below (which is a painting still, but somewhat photographic, and I actually managed to get some real photographs before):

"A candid photograph of a pretty, middle class, 56 year old French woman who is slim with broad hips and graying hair that still has some of the original brown. She has prominent crow's feet around her eyes and her skin accurately reflects her age all over. She is dressed with casual, artsy elegance, in a blouse, silk scarf, leather jacket, baggy cords, and with brown boots. She is standing at the bar of a cafe in Paris. Her hair is like a kaleidoscope of different colours, but has grayed considerably."

Once I include the following negative prompt: "tattoos. writing. multiple pictures. bad anatomy. cartoons. anime. unrealistic skin. instagram-style filters. drawn images. a painting. purely gray hair. low quality hands. nudity." there is no real improvement, and if anything the quality is slightly lower (the one below is representative), though at least they were all vaguely anatomically correct:

Once I add the instruction to include another person, all accuracy goes out the window:

"A candid photograph of a pretty, middle class, 56 year old French woman who is slim with broad hips and graying hair that still has some of the original brown. She has prominent crow's feet around her eyes and her skin accurately reflects her age all over. She is dressed with casual, artsy elegance, in a blouse, silk scarf, leather jacket, baggy cords, and with brown boots. Her hair is like a kaleidoscope of different colors, but has grayed considerably. She is sitting discussing philosophy in a cafe in Paris with her 30 year old male lover. The lover is blond, clean-shaven and is wearing a tweed jacket, blue jeans and subtle sneakers." (Same negative prompt as above)

Produces at best this:

But mostly garbage like this:

Is there a guide that discusses how best to use negative prompts? Or how to prompt at all? I am frustrated because I spent ages yesterday evening getting amazing, photographic results, and tonight it has all deteriorated. This leads me to suspect the servers are deliberately giving me bad results due to high traffic or some other reason that would depend on the time of day.

[–] justpassing@lemmy.world 3 points 6 months ago (1 children)

The main issue is that you are dealing with an LLM at the end of the day, so what works, for example, in Craiyon would not work here 1:1. Keep in mind that what happens under the hood is that the model takes your input and tries to relate it to whatever is tagged with those terms in its training data. The prevalence of "Instagram plastic dolls" and similar is probably due to the input having some detailed anatomical descriptors.

That being said, the best way to debug this is just to check what works for others in other generators. For example, here is a quick run in AI Photo Generator with an apparently very minimal prompt:

Probably this is far from the quality that you want, but it gives you a hint on "how" those are being made if you click on the top left corner of any. There you may see something like this:

Just to copy the prompt:

Old lady drinking coffee in a Parisian bistro, cinematic shot, dynamic lighting, 75mm, Technicolor, Panavision, cinemascope, sharp focus, fine details, 8k, HDR, realism, realistic, key visual, film still, cinematic color grading, depth of field.

Overall, it's an absolute world-class cinematic masterpiece. It's an aesthetically pleasing cinematic shot with impeccable attention to detail and impressive composition.

You see that there is a lot more than what is actually in the original prompt? If you use one of those generators, the inclusion of photographic terms such as "cinemascope" or "HDR" may help or hurt your results. Ideally, you want to just take a look at the full prompts, and then test on a bare bones image generator so you have more control over the output.

Now, text to image is different from text to text or text to code: you want to be as terse as possible, almost as if you were making a shopping list. For example, the following prompt:

- Realism
- Realistic
- Photographic shot
- Middle class
- 56 year old French woman
- Slim with broad hips
- Graying hair
- Prominent crow’s feet around her eyes
- Dressed with casual
- Silk scarf
- Leather jacket
- Baggy cords
- Standing at the bar of a cafe in Paris

Yields the following for seed: 354188953 and guidanceScale: 1

And I get it, it may not be up to your expectations, but you see how it makes it infinitely easier to debug which term leads the model where you want it to go.
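To make the "debug one term at a time" idea concrete, here is a minimal sketch in Python. It only builds prompt strings; it does not call Perchance or any image API, and the names `TERMS`, `build_prompt` and `ablations` are made up for illustration. The idea is to generate one variant per dropped term, run each with the same seed and settings, and compare the images.

```python
# A few terms from the shopping list above. Helper names are hypothetical,
# nothing here is a Perchance API.
TERMS = [
    "realism",
    "photographic shot",
    "56 year old French woman",
    "graying hair",
    "silk scarf",
    "leather jacket",
    "standing at the bar of a cafe in Paris",
]


def build_prompt(terms):
    """Join the terms into one comma-separated prompt string."""
    return ", ".join(terms)


def ablations(terms):
    """Yield (dropped_term, prompt_without_that_term) for each term."""
    for i, dropped in enumerate(terms):
        yield dropped, build_prompt(terms[:i] + terms[i + 1:])


if __name__ == "__main__":
    print("full prompt:", build_prompt(TERMS))
    for dropped, prompt in ablations(TERMS):
        print(f"without {dropped!r}: {prompt}")
```

Paste each variant into the generator by hand; if the "plastic doll" look disappears when one particular term is dropped, that term is the likely culprit.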

The best advice I can give you is to look at the many different generators out there and check what prompt is linked to a "style", because surely someone has already figured out exactly what you are looking for and pasted it into some "Photograph realistic style", or at least it can serve as a reference point.

[–] Phenomenologist@lemmy.world 1 points 6 months ago (1 children)

That is a very helpful post. One thing I don't understand is the "seed" idea. I assumed this was so that you can get the same person over and over again, but I tried including the seed in a new iteration, just by including "(seed:::64482721)", and I got a completely new set of people. I did find more terse descriptions helped a lot, which is strange because some of the generators say use as much detail as you like, and the AI brain text generator thing also works on this principle. It is frustrating to me that I find a great granny (well, more middle-aged lol) and can't generate her again in other settings, so info about seeds would help feed my weakness for elegant, cultured grannies!

[–] justpassing@lemmy.world 1 points 6 months ago

Well... that whole thing is an entire rabbit hole. You see (and I'm trying to be as compact as possible, but there are a million videos and docs on the matter), an LLM and similar models try to take the inputs and the order of the inputs to "correlate" them with something in a data bank. This whole process is called "tokenization", and basically it turns "The orange cat is sleeping" into "A + B + C + D + E", where each variable is a "token", often a single word, since in the backend the model breaks the tokens by whitespace. With some training, though, "The cat" can be a single token, leading to a whole other universe of possible replies branching "cat" from "The cat". This is why (naively) some people recommend "add as much detail" in the sense of something like "An old lady in Paris, discussing an intellectually difficult topic such as philosophy with a young blonde man", instead of "old lady with blonde young man, discussing, focused, Paris". Both yield different results, but one is driven a lot by the context of articles, prepositions, and whatnot, making it a nightmare to debug. Again, be very descriptive, but separating things allows for easier "debugging", if you will. Also, I should mention that repeating a word does have an effect, as you'll see that the results from "old lady, scarf, drinking wine" are not the same as from "old lady, scarf, scarf, scarf, drinking wine". That's why I emphasize that the "grocery list" approach is better, as you can treat generating an image as "building a Lego" and see what piece does what.
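As a rough illustration of why the repeated word matters, here is a toy Python sketch. The `naive_tokens` function is a stand-in I made up that just splits on commas and whitespace; real models use learned subword tokenizers, so this only shows that the two prompts arrive as different token sequences, not what the model does with them.

```python
from collections import Counter


def naive_tokens(prompt):
    """Lowercase, then split on commas and whitespace (toy stand-in only)."""
    return prompt.lower().replace(",", " ").split()


once = naive_tokens("old lady, scarf, drinking wine")
thrice = naive_tokens("old lady, scarf, scarf, scarf, drinking wine")

print(once)                      # five tokens
print(thrice)                    # seven tokens
print(Counter(thrice)["scarf"])  # how often "scarf" appears in the second prompt
```

The second prompt carries "scarf" three times, so the model sees a different input and is free to weight that concept differently.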

Now, regarding the seed... that's another whole problem. There is a better explanation in a video by Wolfram, but I don't remember which one it was. Pretty much, the seed locks you into a "potential state", and not a single output, if that makes sense. So, if you reroll a seeded image, you'll get potentially 5 diametrically different outputs with some accessory chances, plus some eldritch abomination of the model mixing them, but no more. So with a seed, you can find the exact granny you found once, but you may still require the luck of the draw. The reason for this is actually a bit complex and I'll admit I don't get it fully, but I recall it also being an issue in other machine learning models such as Random Forest and similar, where seeds would not always yield a 1:1 result.
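For reference, this is what a seed does in a plain pseudo-random generator, sketched with Python's standard `random` module. It is a toy, not Perchance's pipeline: `noise` is a hypothetical helper, and the seed value is just the one quoted earlier in the thread. The same seed always gives the same starting numbers; whether an image service then returns the same picture also depends on the prompt, settings and model being identical, which this sketch does not cover.

```python
import random


def noise(seed, n=5):
    """Return n pseudo-random floats drawn from a generator with this seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


same_a = noise(64482721)
same_b = noise(64482721)
other = noise(64482722)

print(same_a == same_b)  # identical seed, identical sequence
print(same_a == other)   # neighbouring seed, unrelated sequence
```

That is also why typing a seed into the prompt text may do nothing special: unless the generator parses it as a setting, it is just more words in the prompt.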

Then again, nothing beats downloading the image! A fun feature that Perchance has is that all images are encoded in base64, so you can right click a generated image, do "Copy Link", take the gargantuan link, put it in a .txt file, and then pass that gargantuan string of text to a converter to save it on your drive, or even use it directly in an app or HTML!
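If you would rather not rely on an online converter, a few lines of Python can do the same thing. This is a sketch that assumes the copied link is a standard base64 data URL of the form `data:image/...;base64,...`; the function names and the `link.txt` file name are made up for the example.

```python
import base64
from pathlib import Path


def decode_data_url(data_url):
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, sep, payload = data_url.strip().partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime = header[len("data:"):].split(";")[0]  # e.g. "image/jpeg"
    return mime, base64.b64decode(payload)


def save_data_url(data_url, stem):
    """Write the decoded image to '<stem>.<ext>' and return the path."""
    mime, raw = decode_data_url(data_url)
    ext = mime.split("/")[-1] if "/" in mime else "bin"
    path = Path(f"{stem}.{ext}")
    path.write_bytes(raw)
    return path
```

Usage would look like `save_data_url(Path("link.txt").read_text(), "granny")`, which writes a file such as `granny.jpeg` next to the script.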