this post was submitted on 19 Mar 2026

338 points (97.2% liked)

AI still doesn't work very well, businesses are faking it, and a reckoning is coming (www.theregister.com)

submitted 3 days ago by FoxtrotDeltaTango@sh.itjust.works to c/technology@lemmy.world

cross-posted from: https://lemmy.ca/post/61948688

Excerpt:

"Even within the coding, it's not working well," said Smiley. "I'll give you an example. Code can look right and pass the unit tests and still be wrong. The way you measure that is typically in benchmark tests. So a lot of these companies haven't engaged in a proper feedback loop to see what the impact of AI coding is on the outcomes they care about. Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence."

Measures of engineering excellence, said Smiley, include metrics like deployment frequency, lead time to production, change failure rate, mean time to restore, and incident severity. And we need a new set of metrics, he insists, to measure how AI affects engineering performance.

"We don't know what those are yet," he said.

One metric that might be helpful, he said, is measuring tokens burned to get to an approved pull request – a formally accepted change in software. That's the kind of thing that needs to be assessed to determine whether AI helps an organization's engineering practice.
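As a rough illustration of the "tokens burned per approved pull request" idea, here is a minimal sketch. The `PullRequest` record, its fields, and the sample numbers are all hypothetical and not taken from the article; the assumption is that an organization can attribute token usage to individual PRs and that tokens spent on rejected PRs count against the total.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PullRequest:
    """Hypothetical record of one AI-assisted pull request."""
    pr_id: int
    tokens_used: int  # tokens spent on prompts and completions for this PR
    approved: bool    # True if the change was formally accepted


def tokens_per_approved_pr(prs: List[PullRequest]) -> Optional[float]:
    """Total tokens burned (rejected PRs included) divided by approved PRs.

    Returns None when nothing was approved, because the ratio is undefined.
    """
    approved_count = sum(1 for pr in prs if pr.approved)
    if approved_count == 0:
        return None
    total_tokens = sum(pr.tokens_used for pr in prs)
    return total_tokens / approved_count


# Made-up sample data, for illustration only.
sample = [
    PullRequest(1, 120_000, True),
    PullRequest(2, 300_000, False),
    PullRequest(3, 80_000, True),
]
print(tokens_per_approved_pr(sample))  # 250000.0
```

Counting the tokens from rejected PRs in the numerator is a design choice: it makes wasted attempts show up in the metric instead of disappearing.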

To underscore the consequences of not having that kind of data, Smiley pointed to a recent attempt to rewrite SQLite in Rust using AI.

"It passed all the unit tests, the shape of the code looks right," he said. "It's 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It's a dumpster fire. Throw it away. All that money you spent on it is worthless."

All the optimism about using AI for coding, Smiley argues, comes from measuring the wrong things.

"Coding works if you measure lines of code and pull requests," he said. "Coding does not work if you measure quality and team performance. There's no evidence to suggest that that's moving in a positive direction."

[–] justsomeguy@lemmy.world 82 points 3 days ago (3 children)

Being in an economic bubble during the age of (over)information is really weird. We're getting two articles per day confirming that we're in a big ass bubble but it just keeps on going. I preferred not really knowing how bad things are.

[–] grue@lemmy.world 36 points 3 days ago (1 children)

Everybody knew dot-coms in 2000 and houses in 2007 were bubbles, too. But they kept investing anyway, because they didn't know when it would pop and FOMO is a helluva drug.

Also, something to keep in mind: https://awealthofcommonsense.com/2014/02/worlds-worst-market-timer/

[–] thisbenzingring@lemmy.today 14 points 3 days ago (2 children)

prepare for the burst so you can jump in and get the deals of a lifetime

[–] greyscale@lemmy.sdf.org 33 points 3 days ago (2 children)

With what fucking capital, bro. We've been squeezed already.

[–] thisbenzingring@lemmy.today 13 points 3 days ago* (last edited 3 days ago)

my wife and I didn't have but a couple grand saved up, in 2011 we bought our first house at the beginning of the end of the 2008 bubble. We got a house that was extremely cheap compared to the value

when the opportunity comes, make sure to take it

[–] anyhow2503@lemmy.world 9 points 3 days ago (1 children)

Let's pool together and buy shares for a hundred bucks.

[–] leoj@piefed.zip 4 points 3 days ago

I've got a fiver to throw in!!

[–] greyscale@lemmy.sdf.org 3 points 3 days ago

It's this kind of thinking that keeps the market irrational for longer than it should be.

[–] WolfmanEightySix@piefed.social 6 points 3 days ago

Yep. We’ve been waiting on a big crash for three years now.

[–] Tamps@feddit.uk 55 points 3 days ago (1 children)

Or to put it another way, AI is making it faster and easier to do the wrong thing in the wrong way at scale.

I also wonder what the plan is when the token cost starts going upward. The bill for all this venture capital will come due eventually and someone has to pay for it.

[–] RememberTheApollo_@lemmy.world 8 points 3 days ago (1 children)

Or to put it another way, AI is making it faster and easier to do the wrong thing in the wrong way at scale.

…and absolve the operator of any responsibility, apparently.

Oh, that was the AI doing something wrong. shrugs and just keeps doing what they’re doing.

[–] reddig33@lemmy.world 1 points 3 days ago* (last edited 3 days ago)

“Oh that was the hammer that hit you in the noggin, not me. Why yes, I did pay for that hammer, and I did hold it, and I did swing it in your direction, but it’s totes not my fault.”

[–] rizzothesmall@sh.itjust.works 44 points 3 days ago (2 children)

AI works great. I work in the sphere of production defect detection in manufacturing and it's been working pretty well for a decade or more to predict machine failures and spot defective materials or products.

LLMs as a digital yes-man for business are what doesn't work.

[–] chuckleslord@lemmy.world 15 points 3 days ago

Yeah, unfortunately the marketing people have made LLM synonymous with AI. It's a damn shame.

[–] Naia@lemmy.blahaj.zone 6 points 3 days ago

LLMs have a use case, it's just really limited and the vast majority of what companies, and people broadly, use it for is either not the best case for it or not even something it can/should do.

If you know how to use them they can save time. You still need to validate everything it gives you, but as a developer I can use one to generate small code snippets or give it documentation and ask questions as a quick reference.

But these are not automation tools. They are not worker replacements. And they aren't replacements for research, even if they can get you started on research.

LLMs, and neural nets in general, can never be AGI no matter how much companies wish they could be.

[–] thebestaquaman@lemmy.world 50 points 3 days ago (2 children)

It’s 3.7x more lines of code that performs 2,000 times worse than the actual SQLite.

Pretty much my experience with LLM coding agents. They'll write a bunch of stuff, and come with all kinds of arguments about why what they're doing is in fact optimal and perfect. If you know what you're doing, you'll quickly find a bunch of over-complicating things and just plain pitfalls. I've never been able to understand the people that claim LLMs can build entire projects (the people that say stuff like "I never write my own code anymore"), since I've always found it to be pretty trash at anything beyond trivial tasks.

Of course, it makes sense that it'll elaborate endlessly about how perfect its solution is, because it's a glorified auto-complete, and there's plenty of training data with people explaining why "solution X is better".

[–] Dojan@pawb.social 25 points 3 days ago (2 children)

I saw a vibe coded PR the other day. So much redundant code, lots of comments making assumptions and questions. It’s a mess.

Glad it didn’t land in my lap but the person who is now responsible for steering that up is already quite busy and wasting their time with this feels shit.

[–] thebestaquaman@lemmy.world 13 points 3 days ago

One of the worst things about this is that the person vibe coding just ends up shitting on the reviewer's time. Like... you couldn't even bother to write a real PR, and now you want me to spend time filtering your shit? Fuck off.

[+] org@lemmy.org -13 points 3 days ago (1 children)

Too many people don’t know how to prompt AI and review the output.

[–] Dojan@pawb.social 16 points 3 days ago (1 children)

Too many people are willingly paying anti-democratic billionaires to outsource their thinking and agency.

[+] org@lemmy.org -10 points 3 days ago (1 children)

Too many people know their job is only going to last 6 months before the next round of layoffs, and that talent and hard work have never been the way to keep a job in the tech industry… so why try?

[–] Dojan@pawb.social 7 points 3 days ago (1 children)

Not really a valid excuse in this case as we aren’t really experiencing layoffs here. Au contraire, our company is hiring. I’m not in the U.S.

Still think that letting language models controlled by billionaire paedophiles and wannabe dictators do our thinking is a poor idea, regardless of how fed up one is with one’s job.

[+] org@lemmy.org -7 points 3 days ago (2 children)

Where is “here?”

And, if you want to bring pedophiles into it, most of what you touch on a daily basis involved a billionaire pedophile at some point. You just sound lazy at this point.

[–] EncryptKeeper@lemmy.world 8 points 3 days ago* (last edited 3 days ago) (1 children)

Did you just call them lazy after making the argument that building talent and hard work is not worth doing in the modern tech landscape?

[–] org@lemmy.org -4 points 3 days ago

It's a lazy argument to blanket everything in “pedophile” instead of actually talking about the issue.

[–] Dojan@pawb.social 1 points 3 days ago (1 children)

Oh absolutely, and we can do our best to swear off of that but thanks to them worming their way in like a cancer in every part of society, shaping it to benefit them, that's just the nature of taking part in society. The ones in power have always, and will always continue to exploit us for as long as we let them.

All the more reason to not outsource our thinking to their machines. Governments are already doing it, getting caught red-handed acting on reports that never existed. Why rely on that when the option not to is so readily available?

[–] org@lemmy.org -4 points 3 days ago (1 children)

Ehhh… this sounds more like blanket ai-hate and less about you actually caring. You’re already in their cloud. I doubt you run bare metal. You probably use GitHub. Etc. caring on one hand and not on the other means nothing.

I’ll continue farming out bullshit tasks to AI while I play with my cat and prepare for the next round of layoffs, rather than giving my soul to a company who doesn’t actually care about me.

[–] Dojan@pawb.social 1 points 3 days ago

I actually really love machine learning. I trained my own language models back in the late 2010s, and at my previous company I worked on image classification for a smaller photography platform they were developing. I'm not an ML expert, but it's easy to see the bullshit that the "AI" companies are selling as bullshit when you know the foundations of the tech. A little like how you don't need to be a surgeon to call bullshit on someone saying that they performed open heart surgery and brain surgery on themselves simultaneously.

You’re already in their cloud.

Yes, unfortunately. There was a time when I was a lot more naïve and way less critical. People change. This is what "radicalised" me. I know that a story like that is just daily life in the U.S., but despite my cynicism I thought things were better here.

You probably use GitHub.

I don't. I self-host everything that I can, and make careful choices with what I choose not to. I've left what I had on GitHub on there, and I'll probably use it as a mirror whenever I release a FOSS project, because I'm OK with Microsoft paying the hosting costs for me, if they're going to try and scrape my shit anyway. You know, just like GNOME is doing.

caring on one hand and not on the other means nothing.

I don't really agree. This is the same idiotic take as "if you hate capitalism so much, why are you partaking in it?" You're also speaking directly through your sphincter as you've no idea what my life is like, what choices I've made, and so forth; you don't know me.

Would I prefer to never have to engage with a payment processor or a bank again? Absolutely. That, however, is sadly impossible in my society. LLMs aren't integral to society yet, and I'd like to see them stay that way.

I’ll continue farming out bullshit tasks to AI while I play with my cat and prepare for the next round of layoffs, rather than giving my soul to a company who doesn’t actually care about me.

Not a fan of the LLM part, but I love the overall sentiment. Fuck the corporations, your time and energy is better spent on the people you love. I hope things work out for you.

Animals are people too. Lots of love to your cat.

I love cats.

[–] thisbenzingring@lemmy.today 10 points 3 days ago (2 children)

I tried using an LLM to make a 3D object in OpenSCAD, an open source CAD app for making 3D printable objects.

It's basic and uses an open source language. The LLM should have infinite examples and access.

But after 4 tries I gave up and just did it myself. Sure, the crap the LLM gave me helped form a general setup, but I had to spend 2x as much time fixing the code as it would have taken to write it from scratch.

I haven't tried using LLM for anything else, that failure told me everything I needed to know about its ability to do basic shit

[–] JeeBaiChow@lemmy.world 2 points 3 days ago

This. If users are spending so much time explaining in detail to an LLM what they want the output to do, they're better off doing it themselves. Code snippets were already solved with search.

[–] grue@lemmy.world 2 points 3 days ago

I have never heard of any generative AI system capable of doing anything useful with 3D models. If you ever find one, PM me to let me know!

[–] floofloof@lemmy.ca 21 points 3 days ago (2 children)

This article says that the AI-coded Rust rewrite of SQLite ran 2,000 times slower, but the linked source article says it ran more than 20,000 times slower. Muddling up 2,000 and 20,000 seems a bit sloppy for journalism about code performance.

[–] Codpiece@feddit.uk 20 points 3 days ago (1 children)

Maybe they were using AI to write it.

[–] leoj@piefed.zip 4 points 3 days ago

it was actually 200,000 times slower!

[–] ratsnake@lemmy.blahaj.zone 12 points 3 days ago* (last edited 3 days ago) (1 children)

The article cited by The Register cites this more detailed analysis in turn: https://blog.katanaquant.com/p/your-llm-doesnt-write-correct-code

Performance of the AI-generated version was 20,000 times slower on one specific benchmark, but "only" about 2,000 times slower when averaging over multiple different benchmarks (which is, imo, a better measurement of the code's quality).

So I suppose The Register pulled from multiple sources (as you should) and just linked to the most top-level of all of them.
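To show how a single worst-case benchmark and an average over several benchmarks can give very different headline numbers, here is a small sketch. The benchmark names and slowdown factors are invented for illustration and are not the figures from the linked analysis; the averaging method used there is also not stated here, so a plain arithmetic mean is an assumption.

```python
from statistics import mean

# Invented slowdown factors (rewrite time / original time), illustration only.
slowdowns = {
    "bench_a": 200.0,
    "bench_b": 12.0,
    "bench_c": 5.0,
    "bench_d": 3.0,
}

# The single worst benchmark versus the mean over all of them.
worst_name = max(slowdowns, key=slowdowns.get)
worst = slowdowns[worst_name]
average = mean(slowdowns.values())

print(f"worst case: {worst_name} at {worst:,.0f}x slower")
print(f"mean across benchmarks: {average:,.0f}x slower")
```

With these made-up numbers the worst case is 200x while the mean is 55x, so both can be reported truthfully about the same code.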

[–] floofloof@lemmy.ca 4 points 3 days ago

Thanks for that link. It has a lot more detail.

[–] ElectricAirship@lemmy.dbzer0.com 13 points 3 days ago

Companies and governments told us that an energy transition is "too costly" or "too disruptive to society" but when it comes to AI disrupting and even ending people's lives...

They just say, "deal with it."

[–] RedGreenBlue@lemmy.zip 8 points 3 days ago* (last edited 3 days ago) (1 children)

AI is currently a glorified search engine. But expensive.

[–] scarabic@lemmy.world 4 points 3 days ago* (last edited 3 days ago)

Yeah. I use it at work as a glorified search engine of all company wikis and docs and tickets. It’ll work basically just as a search or it can also summarize - not terribly.

Our AI meeting notes are also pretty good. I’m always impressed at how it leaves out personal chit-chat and anything negative we say about someone who isn’t present.

[–] Deestan@lemmy.world 6 points 3 days ago

No bro the new model from 3 months ago is infinite gooder than what they tested. In 12-18 months we get agi or some shit i dunno just 10 more billions$ bro.