I just want to point out that every single heavily downvoted, idiotic pro-AI reply on this post is from a .ml user (with one programming.dev thrown in).
I wonder which way the causation flows.
From the blog post referenced:
We do not provide evidence that:
AI systems do not currently speed up many or most software developers
Seems the article should be titled “16 AI coders think they’re 20% faster — but they’re actually 19% slower” - though I guess making us think it was intended to be a statistically relevant finding was the point.
That all said, this was genuinely interesting and is in line with my understanding of the human psychology that’s at play. It would be nice to see this at a wider scale, broken down across different methodologies / toolsets and models.
I have an LLM usage mandate in my performance review now. I can’t trust it to do anything important, so I’ll get it to do incredibly noddy things like deleting a clause (that I literally always have highlighted) or generate documentation that’s more long-winded than just reading the code and then go to the bathroom while it happens.
Are you fucking serious?
this sort of bloody stupid metric is widespread, i've heard about it widely
For every bit of time saved, you hit that one kink that slows you down by a fuck ton: something that AI just can't get right, something that takes the AI 5 hours to fix but would've taken you 10-20 to write from scratch
Anyone who has had to unfuck someone else’s work knows it would have been faster to do the work correctly from scratch the first time.
@dgerard I normally consider myself a 10x developer. With the 10x speedup of AI I now consider myself a 100x developer. I can replace an entire small business worth of developers with just myself and my LLM bot assistance. Just pay me $100 million up front no strings and I'll prove it to you! /s
Something something grindset mindset
Mark Zuckerberg would like to know your location
I have the deal of a lifetime for you.
I represent a group of investors in possession of a truly unique NFT that has been recently valued at over $100M. We will invest this NFT in your 100x business - in return you transfer us the difference between the $100M investment and the excess value of the NFT. Standard rich people stuff, don’t worry about it.
Let me know when you’re ready to unlock your 100x potential and I’ll make our investment available via a suitable escrow service.
@dgerard@awful.systems who is your illustrator? These are consistently great.
these are stock images! Which are surprisingly cheap. By Valeriy Kachaev, who puts stuff up as Studiostoks on a pile of stock image sites. His pics are bizarre and keep being the perfect thing.
ahahaha holy shit. I knew METR smelled a bit like AI doomsday cultists and took money from OpenPhil, but those "open source" projects and engineers? One of them was LessWrong.
Here's a LW site dev whining about the study, he was in it and i think he thinks it was unfair to AI
I think if people are citing in another 3 months time, they'll be making a mistake
dude $NEXT_VERSION will be so cool
so anyway, this study has gone mainstream! It was on CNBC! I urge you not to watch that unless you have a yearning need to know what the normies are hearing about this shit. In summary, they are hearing that AI coding isn't all that actually and may not do what the captains of industry want.
around 2:30 the two talking heads ran out of information and just started incorrecting each other on the fabulous AI future, like the worst work lunchroom debate ever but it's about AI becoming superhuman
the key takeaway for the non techie businessmen and investors who take CNBC seriously ever: the bubble starts not going so great
Yeah, METR was the group that made the infamous AI IS DOUBLING EVERY 4-7 MONTHS GRAPH where the measurement was 50% success at SWE tasks based on the time it took a human to complete it. Extremely arbitrary success rate, very suspicious imo. They are fanatics trying to pinpoint when the robo god recursive self improvement loop starts.
Devs are famously bad at estimating how long a software project will take.
No, highly complex creative work is inherently extremely difficult to estimate.
Anyway, not shocked at all by the results. This is a great start that begs for larger and more rigorous studies.
You're absolutely correct that the angle that statement takes is bullshit. There's also the fact that they want to think making software is not highly complex creative work but is somehow just working an assembly line, and that software devs are gatekeepers who don't deserve respect.
As someone who has had to double check people's code before, especially those who don't comment appropriately, I'd rather just write it all again myself than try and decipher what the fuck they were even doing.
Megacorp LLM death spiral:
I've been through the hellscape where managers used missed metrics as evidence for why we didn't need increased headcount on an internal IT helpdesk.
That sort of fuckery is common when management gets the idea in their head that they can save money on people somehow without sacrificing output/quality.
I'm pretty certain they were trying to find an excuse to outsource us, as this was long before the LLM bubble we're in now.
oh, absolutely. I mean you could sub out "LLM" with any bullshit that management can easily spring on their understaff. Agile, standups, return to office, the list goes on. Management can get fucked
I wish I could make more people both know about, and understand, Goodhart’s law
5% "coding"
95% cleanup
The N=16 keeps getting buried. Deliberate?
this user has been removed for commenting without reading the article
being from programming dot dev is just the turd on top
programming.dev: statistical sampling excellency (worst edition)
programmers learned what N means in statistics and immediately realized that “this N is too small” is a cool shortcut to sounding smart without reading the study, its goals, or its conclusions. and you can use it every time N is smaller than the human population on earth!
This N is too small: ~N~
The colon-space-subscript bothers me Immensely
Skill issue - this N is even smaller:
spoiler
You're acting like this is a gotcha when it's actually probably the most rigorous study of AI tool productivity change to date.
Paragraph 2:
METR funded 16 experienced open-source developers with “moderate AI experience” to do what they do.
... and just a few paragraphs further down:
The number of people tested in the study was n=16. That’s a small number. But it’s a lot better than the usual AI coding promotion, where n=1 ’cos it’s just one guy saying “I’m so much faster now, trust me bro. No, I didn’t measure it.”
I wouldn't call that "burying information".
Debate me bro? (jk)