this post was submitted on 07 May 2026
239 points (85.7% liked)

Technology

[–] FauxLiving@lemmy.world 3 points 1 day ago* (last edited 1 day ago) (1 children)

Here is the paper: https://ai-project-website.github.io/AI-assistance-reduces-persistence/

No, the test is not training; that's a weird thing to claim.

The control group solved 12 questions manually and then the 3 test questions manually. The AI group solved 0 questions manually and then the 3 test questions manually. One group had 12 more manual math tasks to prepare for the manual math test; the other had 0 and also had to context switch.

The AI-assisted group was dealt a context switch, which results in a pretty severe performance loss: around 40%, according to this paper, which was peer-reviewed, published by the APA, and is the most cited paper on the topic: https://www.apa.org/pubs/journals/releases/xhp274763.pdf

The AI-assisted group also did not have 12 questions to adjust to the new context, like the control group did. If the researchers wanted to rule out the context-switching performance loss, they should have kept asking questions to see whether, after 12 questions, the AI-assisted group reached similar performance.

> The switch is what is tested, and you disregard that 2 other tests have shown similar results.

No, they did not switch what was tested. Here is an image from the actual paper.

They were given 12 tasks, with one group using AI and the other doing mental math, and then 3 tasks doing mental math. One group had 12 more tasks' worth of preparation than the other.

Nothing, not even the article in the OP, says that they did math and swapped to reading to test.

They did 3 different experiments; in each, they gave 12 tasks, then disabled the AI for one group and gave 3 more tasks as a test. At no point did they ask 12 math questions and then finish with 3 reading questions, or vice versa. They did 2 experiments using math tasks and 1 using reading comprehension tasks.

So one group had 15 math tasks, while the other had 12 'how to ask an AI' tasks followed by 3 math questions.

They also did not control for context-switching losses, a well-documented effect (see the APA paper). The proper control would be to continue asking questions, so that the AI group also had 12 manual math tasks before the test.
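As a rough illustration of this confound, here is a toy simulation. All numbers are assumptions for the sketch, not figures from either paper: a warmed-up success rate of 0.70, an initial 40% context-switch penalty, and a penalty that halves with each task as the subject re-adjusts. Under those assumptions, a 3-task test shows a large gap between the groups, while averaging over 12 post-switch tasks shrinks it considerably.

```python
import random

random.seed(0)

BASE_P = 0.70          # assumed per-task success probability once warmed up
SWITCH_PENALTY = 0.40  # assumed initial loss from a context switch
DECAY = 0.5            # assumed fraction of the penalty remaining after each task

def post_switch_accuracy(n_tasks, penalty, trials=20000):
    """Mean accuracy over the first n_tasks after a (possible) context
    switch, with a penalty that decays as the subject re-adjusts."""
    successes = 0
    for _ in range(trials):
        p_loss = penalty
        for _ in range(n_tasks):
            if random.random() < BASE_P * (1 - p_loss):
                successes += 1
            p_loss *= DECAY  # subject re-adjusts to the new context
    return successes / (trials * n_tasks)

control  = post_switch_accuracy(3, 0.0)             # no switch: stayed on manual math
ai_short = post_switch_accuracy(3, SWITCH_PENALTY)  # switched, tested on only 3 tasks
ai_long  = post_switch_accuracy(12, SWITCH_PENALTY) # switched, given 12 tasks to re-adjust

print(f"control  (3 tasks):  {control:.2f}")
print(f"AI group (3 tasks):  {ai_short:.2f}")
print(f"AI group (12 tasks): {ai_long:.2f}")
```

If this model were right, extending the test to 12 post-switch tasks would distinguish "AI use eroded the skill" from "the AI group just hadn't re-adjusted yet" — which is exactly the control the critique says is missing.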

There's a reason that this is published on arXiv and not in a peer-reviewed journal. Designing a poor quality experiment doesn't tell you anything useful even if you do multiple different versions of the same experiment.

This paper demonstrates a lack of a proper control group, specifically a failure to control for context switching performance loss.

[–] Buffalox@lemmy.world 0 points 1 day ago (2 children)

The picture you posted contradicts your claims. The two groups are getting the same questions, but one has AI assistance and the other does not.
Again, you fail to show anything to support your claims.

[–] sockenklaus@sh.itjust.works 2 points 1 day ago (1 children)

No, what they meant is: The control group had 12 questions to get into the flow of solving math problems and then solved three more math problems for good measure.

The AI group, on the other hand, got into the flow of formulating math problems to ChatGPT and then had to actually solve three math problems themselves.

Their critique is that solving math problems yourself and prompting ChatGPT to solve math problems are not necessarily comparable tasks and require different skill sets, so disabling AI after 12 tasks meant the AI group had to switch context and therefore performed worse.

If you want to analyze the AI group's general problem-solving ability, you should give them another twelve tasks after disabling AI, so they get used to this new type of task (solving math problems yourself vs. prompting them to the AI) before measuring their performance.

[–] Buffalox@lemmy.world 1 points 1 day ago

> The AI group on the other hand got into the flow of formulating math problems to ChatGPT and then had to actually solve three math problems themselves

That's what the friggin test is about! So of course they did.

[–] FauxLiving@lemmy.world 4 points 1 day ago (1 children)

I also wrote text.

If you're just going to cherry pick a single point and dismiss everything else then we're done here.

[–] frongt@lemmy.zip 3 points 1 day ago (1 children)

Maybe they're unable to switch contexts.

[–] FauxLiving@lemmy.world 2 points 1 day ago

I hear that can cause a loss of performance.