The thing is that an LLM is a professional bullshitter. It is actually trained to produce text that can fool an ordinary person into thinking it was produced by a human. The facts come second.
Yeah, I know. I use it for work in tech. If I run into a novel (to me) problem and don't even know where to start attacking it, the LLM can sometimes save me hours of googling: I just describe the problem and what I want to do in a chat, then ask whether there's a commonly accepted approach or library for handling it. Sure, it sometimes hallucinates a library, but that's why I go verify and read the docs myself instead of just blindly copying and pasting.
That last step of verifying is often skipped, and it's getting HARDER to do.
The hallucinations spread like wildfire on the internet. It doesn't matter what's true, only what gets clicks, which in turn encourages more apparent "citations". An even worse fertilizer of false citations is power-hungry bastards pushing false narratives.
AI rabbit holes are getting too deep to verify. It really is important to keep digital hallucinations out of the academic loop, especially for things with life-and-death consequences like medical school
This is why I just use Google to find the NIH article I want, or go straight to DynaMed or UpToDate. (The NIH does have a search function, but it's so bad that it's easier to use Google to find the link to the article I actually want.)
I don’t trust LLMs for anything based on facts or complex reasoning. I’m a lawyer and any time I try asking an LLM a legal question, I get an answer ranging from “technically wrong/incomplete, but I can see how you got there” to “absolute fabrication.”
I actually think the best current use for LLMs is for itinerary planning and organizing thoughts. They’re pretty good at creating coherent, logical schedules based on sets of simple criteria as well as making communications more succinct (although still not perfect).
The only substantial uses I have for it are occasional blurbs of R code for charts, rewording a sentence, or finding a precise word when I can't think of it.
It's decent at summarizing large blocks of text and pretty good at rewording things in a diplomatic/safe way. I used it the other day for work when I had to write a "staff appreciation" blurb and couldn't come up with a reasonable way to take my four sentences of aggressively pro-union rhetoric and turn them into one sentence that comes off pro-union but not anti-capitalist. (Edit: it still needed an editing pass to put it in my own voice and add some details, but it definitely got me close to what I needed.)
I'd say it's good at things you don't need to be good at.
For assignments I'm consciously half-assing, or readings I don't have time to thoroughly examine, sure, it's perfect.
Exactly. For writing emails that will likely never get more than a cursory scan, for example. When I'm composing text, I can't turn off my fixation on finding the perfect wording, even when I know intellectually that "good enough is good enough." And "it's not great, but it gets the message across" is about the only strength of ChatGPT at this point.
To be fair, facts come second to many humans as well, so I don't know if you have much of a point there...
That's true, but they're also pretty good at verification as an independent task.
You can give them a "fact" and ask "is this true, misleading, or false?" and it'll do a good job. ChatGPT 4.0 in particular is excellent at this.
Basically, whenever I use it to generate anything factual, I then put the output back into a separate chat instance and ask it to verify each sentence (I have it put tags around each sentence so the misleading and false ones are coloured orange and red).
It's a two-pass solution, but it makes it a lot more reliable.
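If you wanted to automate that second pass instead of pasting into a separate chat window, a minimal sketch of the idea using the OpenAI Python client might look like the following. The model name, prompt wording, and tag names here are placeholder assumptions, not the exact setup described above:

```python
# Rough sketch of the two-pass idea: generate in one chat, verify in a fresh one.
# Assumes the OpenAI Python client (pip install openai) and an OPENAI_API_KEY env var.
# Model name, prompts, and tag names are illustrative placeholders.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # placeholder model name


def generate(question: str) -> str:
    """First pass: ask for a factual answer in its own chat."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content


def verify(answer: str) -> str:
    """Second pass: a separate request checks each sentence and tags it."""
    prompt = (
        "Check the following text sentence by sentence. Wrap each sentence in a tag: "
        "<true>...</true>, <misleading>...</misleading>, or <false>...</false>.\n\n"
        + answer
    )
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    draft = generate("Summarise how mRNA vaccines work.")
    # Second opinion from a fresh context, not a guarantee of truth.
    print(verify(draft))
```

The point of making the second call a fresh request is that the verifier only sees the text being checked, not the conversation that produced it.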
So your technique to "make it a lot more reliable" is to ask an LLM a question, then run the LLM's answer through an equally unreliable LLM to "verify" the answer?
We're so doomed.
Give it a try.
The key is in the different prompts. I don't think I should really have to explain this, but different prompts produce different results.
Ask it to create something, it creates something.
Ask it to check something, it checks something.
Is it flawless? No. But it's pretty reliable.
It's literally free to try it now, using ChatGPT.
Hey, maybe you do.
But I'm not arguing anything contentious here. Everything I've said is easily testable and verifiable.