this post was submitted on 07 Nov 2025
Fuck AI
You might want to read the actual report then.
You'll find that the second study was conducted in May/June 2025, and you'll find the model versions, which were the free options available at the time (page 20).
Also, the sourcing errors found were not about which source was selected (i.e. a bias in sourcing, as you seem to imply); the report explicitly states this:
GPT 4o and Gemini Flash were not "heavily outdated" when the study was conducted, because these were the models provided in the free versions, which is what the study used (page 20 and page 62).
The goal of the study is not to find the best-performing model or to compare the performance of different models, but to use the publicly available AI offerings the way a normal consumer would. You might get better results with a paid pro model or a specialized model of some kind, but that's not the point here.