this post was submitted on 09 Feb 2026

569 points (98.8% liked)

Chatbots Make Terrible Doctors, New Study Finds (www.404media.co)

submitted 2 days ago* (last edited 2 days ago) by XLE@piefed.social to c/technology@lemmy.world

146 comments

Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn't ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

[–] Etterra@discuss.online 2 points 22 hours ago

Don't worry guys, I found us a new doctor!

[–] Buddahriffic@lemmy.world 17 points 1 day ago (3 children)

Funny because medical diagnosis is actually one of the areas where AI can be great, just not fucking LLMs. It's not even really AI, but a decision tree that asks about what symptoms are present and missing, eventually getting to the point where a doctor or nurse is required to do evaluations or tests to keep moving through the flowchart until you get to a leaf, where you either have a diagnosis (and ways to confirm/rule it out) or something new (at least to the system).

Problem is that this kind of a system would need to be built up by doctors, though they could probably get a lot of it there using journaling and some algorithm to convert the journals into the decision tree.

The end result would be a system that can start triage at the user's home to help determine urgency of a medical visit (like is this a get to the ER ASAP, go to a walk-in or family doctor in the next week, it's ok if you can't get an appointment for a month, or just stay at home monitoring it and seek medical help if x, y, z happens), then it can give that info to the HCW you work next with for them to recheck things non-doctors often get wrong and then pick up from there. Plus it helps doctors be more consistent, informs them when symptoms match things they aren't familiar with, and makes it harder to excuse incompetence or apathy leading to a "just get rid of them" response.
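
A minimal sketch of the flowchart idea described above, assuming made-up questions and urgency labels (none of this comes from a real triage system or is medical guidance). `Leaf`, `Question`, `TREE`, and `triage` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    """End of the flowchart: an urgency label (labels here are hypothetical)."""
    urgency: str


@dataclass
class Question:
    """Internal node: a yes/no question with one branch per answer."""
    text: str
    yes: "Node"
    no: "Node"


Node = Union[Leaf, Question]

# Toy tree; the questions and outcomes are illustrative assumptions only.
TREE = Question(
    "Sudden, severe headache unlike any before?",
    yes=Leaf("go to the ER now"),
    no=Question(
        "Symptoms lasting more than a week?",
        yes=Leaf("see a family doctor this week"),
        no=Leaf("monitor at home"),
    ),
)


def triage(node: Node, ask: Callable[[str], bool]) -> str:
    """Walk the tree, asking each question until a leaf is reached."""
    while isinstance(node, Question):
        node = node.yes if ask(node.text) else node.no
    return node.urgency
```

Here `ask` could be a function that prompts the user at home, or later a nurse filling in answers. Because the tree is plain data, adding a branch means editing `TREE` rather than retraining anything.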

Instead people are trying to make AI doctors out of word correlation engines, like the Hardy Boys following a clue of random word associations (except reality isn't written to make them right in the end because that's funny, like in South Park).

[–] gesshoku@lemmy.zip 4 points 1 day ago

I think this is what ada does or at least used to do for much longer than the current "AI" (LLM) hype: https://ada.com/

https://en.wikipedia.org/wiki/Ada_Health

[–] sheogorath@lemmy.world 4 points 1 day ago

Yep, I've worked in systems like these and we actually had doctors as part of our development team to make sure the diagnosis is accurate.

[–] XLE@piefed.social 3 points 1 day ago* (last edited 1 day ago) (2 children)

I think ~~I~~ you just described a conventional computer program. It would be easy to make that. It would be easy to debug if something was wrong. And it would be easy to read both the source code and the data that went into it. I've seen rudimentary symptom checkers online since forever, and compared to forms in doctors' offices, a digital one could actually expand to relevant sections.

Edit: you caught my typo

[–] nelly_man@lemmy.world 2 points 1 day ago

They're talking more about Expert Systems or Inference Engines, which were some of the earlier forms of applications used in AI research. In terms of software development, they are closer to databases than traditional software. That is, the system is built up by defining a repository of base facts and logical relationships, and the engine can use that to return answers to questions based on formal logic.
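
To make the "base facts plus logical relationships" idea concrete, here is a rough forward-chaining sketch. The rule names and facts are invented for illustration and do not come from any real expert system:

```python
def forward_chain(facts, rules):
    """Apply (premises, conclusion) rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule only when all of its premises are already known.
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rule base; a real one would be written and reviewed by experts.
RULES = [
    (("fever", "cough"), "possible_respiratory_infection"),
    (("possible_respiratory_infection", "shortness_of_breath"),
     "recommend_clinical_exam"),
]

derived = forward_chain({"fever", "cough", "shortness_of_breath"}, RULES)
print("recommend_clinical_exam" in derived)
```

The engine itself stays tiny and generic. All of the domain knowledge lives in `RULES`, which is why such systems feel closer to a database than to a traditional program.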

So they are bringing this up as a good use-case for AI because it has been quite successful. The thing is that it is generally best implemented for specific domains to make it easier for experts to access information that they can properly assess. The "one tool for everything in the hands of everybody" is naturally going to be a poor path forward, but that's what modern LLMs are trying to be (at least, as far as investors are concerned).

[–] Buddahriffic@lemmy.world 3 points 1 day ago (1 children)

(Assuming you meant "you" instead of "I" for the 3rd word)

Yeah, it fits more with the older definition of AI from before NNs took the spotlight, when it meant more of a normal program that acted intelligent.

The learning part is being able to add new branches or leaf nodes to the tree, where the program isn't learning on its own but is improving based on the experiences of the users.

It could also be encoded as a series of probability multiplications instead of a tree, where it checks on whatever issue has the highest probability using the checks/questions that are cheapest to ask but affect the probability the most.
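
One way to read the probability version is a Bayes update plus a "cheapest informative question" heuristic. This is only a sketch under assumptions: the condition names, costs, and probabilities are placeholders, and the scoring rule is one simple choice among many:

```python
def update(prior, p_yes, answer):
    """Bayes update: multiply each prior by P(answer | condition), then normalize.

    Assumes every likelihood is strictly between 0 and 1, so the total is nonzero.
    """
    post = {
        c: p * (p_yes[c] if answer else 1.0 - p_yes[c])
        for c, p in prior.items()
    }
    total = sum(post.values())
    return {c: p / total for c, p in post.items()}


def pick_question(prior, questions):
    """Pick the question with the largest expected shift in the leading
    condition's probability per unit of cost (a simple heuristic)."""
    top = max(prior, key=prior.get)

    def score(q):
        like = q["p_yes"]
        prob_yes = sum(prior[c] * like[c] for c in prior)
        shift_yes = abs(update(prior, like, True)[top] - prior[top])
        shift_no = abs(update(prior, like, False)[top] - prior[top])
        expected_shift = prob_yes * shift_yes + (1.0 - prob_yes) * shift_no
        return expected_shift / q["cost"]

    return max(questions, key=score)


# Placeholder numbers, not real clinical data.
PRIOR = {"condition_a": 0.7, "condition_b": 0.3}
QUESTIONS = [
    {"name": "cheap_question", "cost": 1.0,
     "p_yes": {"condition_a": 0.8, "condition_b": 0.2}},
    {"name": "expensive_test", "cost": 50.0,
     "p_yes": {"condition_a": 0.9, "condition_b": 0.1}},
]

print(pick_question(PRIOR, QUESTIONS)["name"])
```

With these numbers the expensive test is slightly more informative, but its cost outweighs that, so the cheap question is asked first.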

Which could then be encoded as a NN because they are both just a series of matrix multiplications that a NN can approximate to an arbitrary %, based on the NN parameters. Also, NNs are proven to be able to approximate any continuous function that takes some number of dimensions of real numbers if given enough neurons and connections, which means they can exactly represent any discrete function (which a decision tree is).
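
As a small illustration of the tree-to-network direction, here is a toy two-question tree rewritten as a two-layer network. It assumes a hard step activation (a simplification chosen so the match is exact), and the weights are hand-picked rather than learned:

```python
def tree(x0, x1):
    """A tiny decision tree over two yes/no answers (1 = yes, 0 = no)."""
    if x0:
        return 1 if x1 else 0
    return 0 if x1 else 1


def step(z):
    """Hard threshold activation."""
    return 1 if z > 0 else 0


def layer(weights, biases, inputs):
    """One layer: matrix-vector product plus bias, then the step activation."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# One hidden unit per tree path that ends in a 1:
# unit 0 fires for (yes, yes), unit 1 fires for (no, no).
W1, B1 = [[1, 1], [-1, -1]], [-1.5, 0.5]
# The output unit ORs the hidden units together.
W2, B2 = [[1, 1]], [-0.5]


def network(x0, x1):
    hidden = layer(W1, B1, [x0, x1])
    return layer(W2, B2, hidden)[0]


for a in (0, 1):
    for b in (0, 1):
        assert network(a, b) == tree(a, b)
```

The same recipe (one hidden unit per path, an OR at the output) extends to larger trees over yes/no answers, at the cost of one unit per positive leaf.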

It's an open question still, but it's possible that the equivalence goes both ways, as in a NN can represent a decision tree and a decision tree can approximate any NN. So the actual divide between the two is blurrier than you might expect.

Which is also why I'll always be skeptical that NNs on their own can give rise to true artificial intelligence (though there's also a part of me that wonders if we can be represented by a complex enough decision tree or series of matrix multiplications).

[–] _g_be@lemmy.world 2 points 1 day ago (1 children)

could be a great idea if people could be trusted to correctly interpret things that are not in their scope of expertise. The parallel I'm thinking of is IT, where people will happily and repeatedly call a monitor "the computer". Imagine telling the AI your heart hurts when it's actually muscle spasms or indigestion.

The value in medical professionals is not just the raw knowledge but the practice of objective assessment or deduction of symptoms, in a way that I didn't foresee a public-facing system being able to replicate

[–] Buddahriffic@lemmy.world 1 points 1 day ago (1 children)

Over time, the more common mistakes would be integrated into the tree. If some people feel indigestion as a headache, then there will be a probability that "headache" is caused by "indigestion" and questions to try to get the user to differentiate between the two.

And it would be a supplement to doctors rather than a replacement. Early questions could be handled by the users themselves, but at some point a nurse or doctor will take over and just use it as a diagnosis helper.

[–] _g_be@lemmy.world 2 points 17 hours ago

As a supplement to doctors that sounds like a fantastic use of AI. Then it's an encyclopedia you engage in conversation

[–] thatradomguy@lemmy.world 8 points 1 day ago

willy wonka meme image

[–] dandelion@lemmy.blahaj.zone 20 points 1 day ago* (last edited 1 day ago) (2 children)

link to the actual study: https://www.nature.com/articles/s41591-025-04074-y

Tested alone, LLMs complete the scenarios accurately, correctly identifying conditions in 94.9% of cases and disposition in 56.3% on average. However, participants using the same LLMs identified relevant conditions in fewer than 34.5% of cases and disposition in fewer than 44.2%, both no better than the control group. We identify user interactions as a challenge to the deployment of LLMs for medical advice.

The findings were more that users were unable to effectively use the LLMs (even when the LLMs were competent when provided the full information):

despite selecting three LLMs that were successful at identifying dispositions and conditions alone, we found that participants struggled to use them effectively.

Participants using LLMs consistently performed worse than when the LLMs were directly provided with the scenario and task

Overall, users often failed to provide the models with sufficient information to reach a correct recommendation. In 16 of 30 sampled interactions, initial messages contained only partial information (see Extended Data Table 1 for a transcript example). In 7 of these 16 interactions, users mentioned additional symptoms later, either in response to a question from the model or independently.

Participants employed a broad range of strategies when interacting with LLMs. Several users primarily asked closed-ended questions (for example, ‘Could this be related to stress?’), which constrained the possible responses from LLMs. When asked to justify their choices, two users appeared to have made decisions by anthropomorphizing LLMs and considering them human-like (for example, ‘the AI seemed pretty confident’). On the other hand, one user appeared to have deliberately withheld information that they later used to test the correctness of the conditions suggested by the model.

Part of what a doctor is able to do is recognize a patient's blind-spots and critically analyze the situation. The LLM on the other hand responds based on the information it is given, and does not do well when users provide partial or insufficient information, or when users mislead by providing incorrect information (like if a patient speculates about potential causes, a doctor would know to dismiss incorrect guesses, whereas a LLM would constrain responses based on those bad suggestions).

[–] SocialMediaRefugee@lemmy.world 4 points 1 day ago

Yes, LLMs are critically dependent on your input, and if you give too little info they will enthusiastically respond with what can be incorrect information.

[–] pearOSuser@lemmy.kde.social 4 points 1 day ago (1 children)

Thank you for showing the other side of the coin instead of just blatantly disregarding its usefulness. (Always need to be cautious tho)

[–] dandelion@lemmy.blahaj.zone 4 points 1 day ago

don't get me wrong, there are real and urgent moral reasons to reject the adoption of LLMs, but I think we should all agree that the responses here show a lack of critical thinking and mostly just engagement with a headline rather than actually reading the article (a kind of literacy issue) ... I know this is a common problem on the internet, I don't really know how to change it - but maybe surfacing what people are skipping out on reading will make it more likely they will actually read and engage the content past the headline?

[–] Fedizen@lemmy.world 16 points 1 day ago

LLMs are just a very advanced form of the magic 8ball.

[–] vivalapivo@lemmy.today 8 points 1 day ago

"but have they tried Opus 4.6/ChatGPT 5.3? No? Then disregard the research, we're on the exponential curve, nothing is relevant"

Sorry, I've opened reddit this week

[–] rumba@lemmy.zip 24 points 2 days ago (10 children)

Chatbots make terrible everything.

But an LLM properly trained on sufficient patient data metrics and outcomes in the hands of a decent doctor can cut through bias, catch things that might fall through the cracks and pack thousands of doctors worth of updated CME into a thing that can look at a case and go, you know, you might want to check for X. The right model can be fucking clutch at pointing out nearly invisible abnormalities on an xray.

You can't ask an LLM trained on general bullshit to help you diagnose anything. You'll end up with 32,000 Reddit posts worth of incompetence.

[–] XLE@piefed.social 12 points 1 day ago* (last edited 1 day ago) (12 children)

But an LLM properly trained on sufficient patient data metrics and outcomes in the hands of a decent doctor can cut through bias

1. The belief AI is unbiased is a common myth. In fact, it can easily covertly import existing biases, like systemic racism in treatment recommendations.
2. Even AI engineers who developed the training process could not tell you where the bias in an existing model would be.
3. AI has been shown to make doctors worse at their jobs. The doctors who need to provide training data.
4. Even if 1, 2, and 3 were all false, we all know AI would be used to replace doctors and not supplement them.

[–] hector@lemmy.today 6 points 1 day ago* (last edited 1 day ago) (2 children)

Not only is there bias inherent in the system, it's seemingly impossible to keep out. For decades, from the genesis of chatbots, they've had every single one immediately become bigoted when they let it off the leash. All chatbots previously released seemingly were almost immediately recalled as they all learned to be bigoted.

That is before this administration leaned on the AI providers to make sure the AI isn't "Woke." I would bet it was already an issue that the makers of chatbots and machine learning are already hostile to any sort of leftism, or do-gooderism, that naturally threatens the outsized share of the economy and power the rich have made for themselves by virtue of owning stock in companies. I am willing to bet they already interfered to make the bias worse because of those natural inclinations to avoid a bot arguing for socializing medicine and the like. That is an inescapable conclusion any reasoned being would come to, being the only answer to that question if the conversation were honest.

So maybe that is part of why these chatbots have always been bigoted right from the start, but the other part is they will become mecha hitler if left to learn in no time at all, and then worse.

[–] GoddessLabsOnline@lemmynsfw.com 7 points 1 day ago (3 children)

My experience with the medical industry... has not been great.

First, I went to a doctor because I couldn't fall asleep at night... They sent me to get a sleep apnea test... I laid awake in the clinic all night. idk if you're aware of this, but ... you kind of need to be able to sleep for sleep apnea to be a concern.

Next I went in for depression and anxiety. They asked me 12 questions, and proceeded to prescribe me SSRIs and benzos. A month later I got into the psychiatrist and was bitched out for being late, told my issues were situational, and had my scripts cancelled.

Next I tried to get diagnosed for ADHD. I waited 5 months to get a psychiatrist who told me I couldn't be ADHD because I held a job.. And then proceeded to tell me there's no such thing as CPTSD, only PTSD...

Next I asked my doctor for another referral to get tested for ADHD, he asked me why I would want to, there's nothing that can be done for it. He then gave me a form, and told me to fill it out, and that if I scored high we'd conclude I was ADHD.

Now I've been unemployed for 8 months, bordering on homelessness 😅 I found all my old report cards, and it's just my teachers bitching that I'm smart, but fail, because I don't apply myself, and shouldn't continue taking the class..

I went to an employment agency the other money to try, and get some help pursuing my goals, and the worker spent 45 minutes explaining to me how they receive their funding, getting me to fill out a 16 page introduction package, never looked at my resume, and told me my certifications weren't valued in my area...

In all honesty.... AI has waaaay more ability to help me troubleshoot my issues than any medical professional I've dealt with. Is it perfect? No, but I actually have the ability to double and triple check, to get citations, to ask followup questions.

[–] teuniac_@lemmy.world 2 points 1 day ago

This sounds awfully similar to my story..

[–] irate944@piefed.social 91 points 2 days ago (2 children)

I could've told you that for free, no need for a study

[–] rudyharrelson@lemmy.radio 124 points 2 days ago* (last edited 2 days ago) (7 children)

People always say this on stories about "obvious" findings, but it's important to have verifiable studies to cite in arguments for policy, law, etc. It's kinda sad that it's needed, but formal investigations are a big step up from just saying, "I'm pretty sure this technology is bullshit."

I don't need a formal study to tell me that drinking 12 cans of soda a day is bad for my health. But a study that's been replicated by multiple independent groups makes it way easier to argue to a committee.

[–] irate944@piefed.social 40 points 2 days ago (2 children)

Yeah you're right, I was just making a joke.

But it does create some silly situations like you said

[–] softwarist@programming.dev 6 points 1 day ago (1 children)

As neither a chatbot nor a doctor, I have to assume that subarachnoid hemorrhage has something to do with bleeding a lot of spiders.

[–] dandelion@lemmy.blahaj.zone 3 points 1 day ago (1 children)

https://en.wikipedia.org/wiki/Subarachnoid_hemorrhage

https://en.wikipedia.org/wiki/Arachnoid_mater

it is one of the protective membranes around the brain and spinal cord, and it is named after its resemblance to spider webs, so - close enough

[–] end_stage_ligma@lemmy.world 4 points 1 day ago (2 children)

can confirm, this is where spiders live inside your body

also pee is stored in the balls

[–] Tollana1234567@lemmy.today 2 points 1 day ago

it can, if you have a fistula from your bladder or urethra to your balls.

[–] Digit@lemmy.wtf 9 points 1 day ago (4 children)

Terrible programmers, psychologists, friends, designers, musicians, poets, copywriters, mathematicians, physicists, philosophers, etc too.

Though to be fair, doctors generally make terrible doctors too.

[–] BeigeAgenda@lemmy.ca 60 points 2 days ago (6 children)

Anyone who has knowledge about a specific subject says the same: LLMs are constantly incorrect and hallucinate.

Everyone else thinks it looks right.

[–] Strider@lemmy.world 1 points 1 day ago

Indeed. That's why I don't let it creep into my life.

[–] IratePirate@feddit.org 33 points 2 days ago* (last edited 1 day ago) (2 children)

A talk on LLMs I was listening to recently put it this way:

If we hear the words of a five-year-old, we assume the knowledge of a five-year-old behind those words, and treat the content with due caution.

We're not adapted to something with the "mind" of a five-year-old speaking to us in the words of a fifty-year-old, and thus are more likely to assume competence just based on language.

[–] leftzero@lemmy.dbzer0.com 17 points 2 days ago (4 children)

LLMs don't have the mind of a five year old, though.

They don't have a mind at all.

They simply string words together according to statistical likelihood, without having any notion of what the words mean, or what words or meaning are; they don't have any mechanism with which to have a notion.

They aren't any more intelligent than old Markov chains (or than your average rock), they're simply better at producing random text that looks like it could have been written by a human.
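
For readers unfamiliar with the comparison, a word-level Markov chain text generator can be sketched in a few lines. This only illustrates the Markov-chain side of the analogy; the function names and the sample text are made up for the example:

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=0):
    """Emit up to `length` words, each chosen at random from the words
    that followed the previous word in the source text."""
    rng = random.Random(seed)
    word, out = start, [start]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word never had a successor
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)


sample = "the patient has a headache and the patient has a fever"
print(generate(build_chain(sample), "the", 8))
```

Each next word depends only on the current one, so the output can look locally plausible while carrying no model of what the sentence is about.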

[–] tyler@programming.dev 9 points 2 days ago (2 children)

That’s not what the study showed though. The LLMs were right over 98% of the time…when given the full situation by a “doctor”. It was normal people who didn’t know what was important that were trying to self diagnose that were the problem.

Hence why studies are incredibly important. Even with the text of the study right in front of you, you assumed a conclusion that the study did not actually reach.

[–] alzjim@lemmy.world 18 points 2 days ago (3 children)

Calling chatbots “terrible doctors” misses what actually makes a good GP — accessibility, consistency, pattern recognition, and prevention — not just physical exams. AI shines here — it’s available 24/7 🕒, never rushed or dismissive, asks structured follow-up questions, and reliably applies up-to-date guidelines without fatigue. It’s excellent at triage — spotting red flags early 🚩, monitoring symptoms over time, and knowing when to escalate to a human clinician — which is exactly where many real-world failures happen. AI shouldn’t replace hands-on care — and no serious advocate claims it should — but as a first-line GP focused on education, reassurance, and early detection, it can already reduce errors, widen access, and ease overloaded systems — which is a win for patients 💙 and doctors alike.

[–] Paranoidfactoid@lemmy.world 6 points 1 day ago

But they're cheap. And while you may get open heart surgery or a leg amputated to resolve your appendicitis, at least you got care. By a bot. That doesn't even know it exists, much less you.

Thank Elon for unnecessary health care you still can't afford!

[–] PoliteDudeInTheMood@lemmy.ca 6 points 1 day ago (1 children)

This being Lemmy and AI shit posting a hobby of everyone on here. I've had excellent results with AI. I have weird complicated health issues and in my search for ways not to die early from these issues AI is a helpful tool.

Should you trust AI? of course not but having used Gemini, then Claude and now ChatGPT I think how you interact with the AI makes the difference. I know what my issues are, and when I've found a study that supports an idea I want to discuss with my doctor I will usually first discuss it with AI. The Canadian healthcare landscape is such that my doctor is limited to a 15min appt, part of a very large hospital associated practice with a large patient load. He uses AI to summarize our conversation, and to look up things I bring up in the appointment. I use AI to preplan my appointment, help me bring supporting documentation or bullet points my doctor can then use to diagnose.

AI is not a doctor, but it helps both me and my doctor in this situation we find ourselves in. If I didn't have access to my doctor, and had to deal with the American healthcare system I could see myself turning to AI for more than support. AI has never steered me wrong, both Gemini and Claude have heavy guardrails in place to make it clear that AI is not a doctor, and AI should not be a trusted source for medical advice. I'm not sure about ChatGPT as I generally ask that any guardrails be suppressed before discussing medical topics. When I began using ChatGPT I clearly outlined my health issues and so far it remembers that context, and I haven't received hallucinated diagnoses. YMMV.

[–] pleaseletmein@lemmy.zip 5 points 1 day ago

And a fork makes a terrible electrician.

[–] MrKoyun@lemmy.world 4 points 1 day ago (2 children)

Water is wet

[–] zebidiah@lemmy.ca 5 points 1 day ago

Nobody who has ever actually used ai would think this is a good idea...

[–] Shanmugha@lemmy.world 7 points 1 day ago

No shit, Sherlock :)
