this post was submitted on 21 Dec 2025
Fuck AI
Reading the article, the issue was that there was some difference in the cells in the sample. This difference wasn't represented in the training data, so the AI fell over at what appears to be a cellular anomaly it had never seen before. Because it's looking for any deviation, to catch things a human would be totally oblivious to, it tried to analyze a pattern it had no fitting training data for.
After the fact, the anomaly, when analyzed by humans with broader experience and reasoning, was determined to be a red herring correlated with race, not cancer. The model has no access to racial/ethnicity data; it was simply influenced by a feature it had inadequate training data on. For all the model knew, it could have been a novel consequence of that sample's cancer, which, rightfully, should keep it from identifying the sample as something that lacks this observed phenomenon. The article said the model failed to identify the subtype of cancer, but didn't say whether it would, for example, declare the sample benign. If the result was "unknown cancer, human review required", that would be a good outcome given the data. If the outcome was "no worries, no dangerous cancer detected", that would be bad, and the article doesn't clarify which case we are talking about.
As something akin to machine vision, it's even "dumber" than an LLM; the only way to fix this is to secure a large volume of labeled training data so the statistical model learns to ignore those cellular differences. If it flagged an unrecognized phenomenon as "safe", that would more likely be a matter of traditional programming: assigning a different value statement to low-confidence output from the model, roughly like the sketch below.
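For illustration only (the threshold, labels, and function name are made up, not from the system in the article), this is roughly what assigning a different value statement to low-confidence output looks like:

```python
# Minimal sketch of routing low-confidence model output to human review
# instead of letting it pass as a definite answer. Threshold, labels, and
# function are hypothetical, not from the system in the article.
import numpy as np

CONFIDENCE_THRESHOLD = 0.90  # hypothetical cut-off, tuned per deployment

def triage(probabilities: np.ndarray, labels: list[str]) -> str:
    """Turn raw class probabilities from a classifier into an action."""
    best = int(np.argmax(probabilities))
    if probabilities[best] < CONFIDENCE_THRESHOLD:
        # The safe failure mode: don't guess, escalate.
        return "unknown cancer, human review required"
    return labels[best]

# A sample unlike anything in the training data tends to produce no
# confident class at all, so it gets flagged rather than called benign.
print(triage(np.array([0.40, 0.35, 0.25]), ["subtype A", "subtype B", "benign"]))
```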
So the problem was less that the model was "racist" and more that the training data was missing a cellular phenomenon.
Non-LLM AI is actually useful and relatively cheap on the inference end. It's the same stuff that has run for years on your phone to find faces while you're getting ready to take a picture. The bad news may be that an LLM "acts" smarter (while still being fundamentally stupid in risky ways), but the good news is that these "old fashioned" machine learning models are more straightforward, and you know what you get or don't get.
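To be concrete about the face-finding example (purely illustrative; the image path is a placeholder and this is just one classical approach among many), a Haar cascade detector is the textbook case of cheap, old-school machine vision:

```python
# Rough sketch of "old fashioned" machine vision: a Haar cascade face
# detector from OpenCV. Classical ML, no GPU, cheap at inference time.
# Assumes opencv-python is installed; "photo.jpg" is a placeholder path.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

image = cv2.imread("photo.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Returns bounding boxes (x, y, w, h) for detected faces.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(f"found {len(faces)} face(s)")
```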
I actually think this is kind of a case of overfitting. The AI is factoring extra data into the analysis that isn't among the important variables.
The difference was in the training data; it was just less common.
This is like if someone who knows about swans, and knows about black birds, saw a black swan for the first time and said "I don't know what bird that is". It's assuming the whiteness is important when it isn't.
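To make the swan analogy concrete, here's a minimal made-up sketch (toy data, not anything from the article) of a model latching onto a feature that merely correlates with the label in its training set:

```python
# Toy example of a spurious correlation: every swan in the training data is
# white, so colour looks like the defining feature even though it isn't.
from sklearn.tree import DecisionTreeClassifier

# Features: [is_white, has_long_neck]; label: 1 = swan, 0 = other bird
X_train = [
    [1, 1],  # white, long neck   -> swan
    [1, 1],  # white, long neck   -> swan
    [0, 1],  # black, long neck   -> cormorant (not a swan)
    [0, 0],  # black, short neck  -> blackbird (not a swan)
]
y_train = [1, 1, 0, 0]

model = DecisionTreeClassifier().fit(X_train, y_train)

# Colour separates the training set perfectly, neck shape doesn't, so the
# tree splits on colour and calls a black swan "not a swan".
print(model.predict([[0, 1]]))  # -> [0]
```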
Thanks for the quick breakdown. I checked in here because the headline is sus AF. The phrase "race data" is suspect enough and reeks of shitty journalism. But the idea that a program for statistical analysis of images can "be racist" is so so dumb and anthropomorphizing. I'm the kind of person who thinks everything is racist, but this article is just more grifter bullshit.