One more comment, idk if y'all remember that forecast that came out in April (iirc?) where the thesis was that the "time an AI can operate autonomously is doubling every 4-7 months." The AI-2027 authors were like "this is the smoking gun, it shows why our model is correct!!"
They used a really sketchy metric: they had SWEs do a set of tasks and timed them, then had the models attempt the same tasks, and scored each model by the human task length at which it succeeds 50% of the time (wtf?). Then they drew an exponential curve through the results. My gut feeling is that they chose 50% because other values totally ruin the exponential curve, but I digress.
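For anyone who wants to see what a "50% time horizon" boils down to, here's a rough sketch. Big caveats: the numbers are made up by me (NOT their data), and the logistic fit of success against log human time is my assumption about the mechanics, not something I'm claiming is their exact method. `RESULTS`, `fit_logistic` and `time_horizon` are just names I picked.

```python
import math

# Hypothetical (human_minutes, model_succeeded) pairs. Made-up numbers,
# for illustration only; this is NOT real benchmark data.
RESULTS = [(1, 1), (2, 1), (4, 1), (8, 1), (15, 0),
           (30, 1), (60, 0), (120, 1), (240, 0), (480, 0)]


def fit_logistic(results, steps=5000, lr=0.1):
    """Fit P(success) = sigmoid(a + b * (log2(minutes) - mean)) by gradient ascent."""
    xs = [math.log2(minutes) for minutes, _ in results]
    ys = [succeeded for _, succeeded in results]
    mean_x = sum(xs) / len(xs)  # centre x so the plain gradient steps behave
    a = b = 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * (x - mean_x))))
            grad_a += y - p
            grad_b += (y - p) * (x - mean_x)
        a += lr * grad_a / len(xs)
        b += lr * grad_b / len(xs)
    return a, b, mean_x


def time_horizon(results, success_rate=0.5):
    """Human task length (minutes) at which the fitted success rate equals success_rate."""
    a, b, mean_x = fit_logistic(results)
    logit = math.log(success_rate / (1.0 - success_rate))
    return 2 ** (mean_x + (logit - a) / b)


if __name__ == "__main__":
    for rate in (0.5, 0.8):
        print(f"{rate:.0%} horizon: {time_horizon(RESULTS, rate):.1f} human-minutes")
```

The point of the toy: the "horizon" is just wherever the fitted curve crosses whatever success rate you pick, so a stricter threshold like 80% gives a shorter horizon than 50% from the exact same results.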
Anyways, they just ran the metric on Claude 4, the first FrOnTiEr model to come out since they made their chart, and... drum roll... no improvement. In fact it performed worse than o3, which was first announced last December. (Note: instead of using the date o3 was announced in 2024, they used the date it was released months later, so on their chart it makes 'line go up'. A valid choice I guess, but a choice nonetheless.)
This world is a circus tent, and there still ain't enough room for all these fucking clowns.





Yeah, METR was the group that made the infamous AI IS DOUBLING EVERY 4-7 MONTHS GRAPH, where the measurement was 50% success at SWE tasks, ranked by the time it took a human to complete them. Extremely arbitrary success rate, very suspicious imo. They are fanatics trying to pinpoint when the robo god recursive self-improvement loop starts.