overview for BigMuffN69

Stubsack: weekly thread for sneers not worth an entire post, week ending 5th October 2025 - awful.systems in c/techtakes@awful.systems

BigMuffN69@awful.systems 9 points 1 month ago (3 children)

You’d think he would maybe, idk, search around to see if this was a known formula before making such a bombastic statement…

Stubsack: weekly thread for sneers not worth an entire post, week ending 5th October 2025 - awful.systems in c/techtakes@awful.systems

BigMuffN69@awful.systems 13 points 1 month ago

Oh god, he unironically recommends reading the sequences wtf 🤢🤮

Stubsack: weekly thread for sneers not worth an entire post, week ending 5th October 2025 - awful.systems in c/techtakes@awful.systems

BigMuffN69@awful.systems 10 points 1 month ago* (last edited 1 month ago)

Great response^

I think Julian is going to be mildly surprised that METR’s chart keeps going up and yet will have a relatively small effect on the majority of SWE roles.

At the same time, he did create AlphaZero, so he has a big old noggin! I wonder, after his success at Go, was he swept up in the mania that we would quickly translate that success into super duper AI?

Stubsack: weekly thread for sneers not worth an entire post, week ending 5th October 2025 - awful.systems in c/techtakes@awful.systems

BigMuffN69@awful.systems 10 points 1 month ago* (last edited 1 month ago) (1 children)

Links to the METR tasks w/ massive error bars at 50% level lmaou.

Someone in the comments rightly points out the comparison with covid isn’t apt. With covid, an underlying mechanism caused the exponential growth in its spread.

With LLMs, the exponential trend is being caused by exponentially increasing spending and a healthy dose of targeting benchmarks, which is why people are calling the top. The money literally doesn’t exist for this shit to go on so you can create your 50% accurate Mechanical Turk.

Edit: idk the more I think about this the more it irks me. Like if I was allowed to pick and choose benchmarks that agree with my biases I would post something like this…

… and claim model performance is actually getting worse over time.

https://xcancel.com/sayashk/status/1966144670561612202#m

Stubsack: weekly thread for sneers not worth an entire post, week ending 5th October 2025 - awful.systems in c/techtakes@awful.systems

BigMuffN69@awful.systems 11 points 1 month ago (5 children)

https://scottaaronson.blog/?p=9183

Quantum scoot is quantum spooked 😱 after GPT-5 manages to solve a subproblem for him (after multiple attempts), thanks the powers that be for his tenure!

… even though GPT-5 probably generates the answer via web search

lol, the Atlantic gave the Yudkowsky/Soares book review to Adam Becker in c/sneerclub@awful.systems

BigMuffN69@awful.systems 7 points 2 months ago

Bruv no way lmaou ty for this

Stubsack: weekly thread for sneers not worth an entire post, week ending 21st September 2025 in c/techtakes@awful.systems

BigMuffN69@awful.systems 4 points 2 months ago

Nice result, not too shocking after IMO performance. A friend of mine told me that this particular competition is highly time constrained for human competitors, i.e., questions aren’t impossibly difficult per se, but some are time sinks that you simply avoid to get points elsewhere. (5 hours on 12 Qs is tight…)

So when you are competing against a data center using a nuclear reactor vs 3 humans running on broccoli, the claims of superhuman performance definitely require an * attached to them.

Big Tech Power Rankings in c/techtakes@awful.systems

BigMuffN69@awful.systems 5 points 2 months ago

X and coinbase on this list lmfaou what a joke

Stubsack: weekly thread for sneers not worth an entire post, week ending 14th September 2025 in c/techtakes@awful.systems

BigMuffN69@awful.systems 9 points 2 months ago

Hecate left no crumbs

Stubsack: weekly thread for sneers not worth an entire post, week ending 14th September 2025 in c/techtakes@awful.systems

BigMuffN69@awful.systems 8 points 2 months ago* (last edited 2 months ago)

Until proven otherwise, I assume everyone I encounter is a fellow sneerer (derogatory)

Stubsack: weekly thread for sneers not worth an entire post, week ending 7th September 2025 in c/techtakes@awful.systems

BigMuffN69@awful.systems 12 points 2 months ago* (last edited 2 months ago) (6 children)

Great piece on previous hype waves by P. Ball

https://aeon.co/essays/no-suffering-no-death-no-limits-the-nanobots-pipe-dream

It’s sad, my “thoroughly researched” “paper” greygoo-2027 just doesn’t seem to have that viral x-factor that lands me exclusive interviews w/ the Times 🫠

Stubsack: weekly thread for sneers not worth an entire post, week ending 31st August 2025 - awful.systems in c/techtakes@awful.systems

BigMuffN69@awful.systems 8 points 2 months ago (7 children)

https://www.argmin.net/p/the-banal-evil-of-ai-safety

Once again shilling another great Ben Recht post. This time calling out the fucking insane irresponsibility of "responsible" AI providers who won’t do the bare minimum to prevent people from having psychological breaks from reality.

"I’ve been stuck on this tragic story in the New York Times about Adam Raine, a 16-year-old who took his life after months of getting advice on suicide from ChatGPT. Our relationship with technological tools is complex. That people draw emotional connections to chatbots isn’t new (I see you, Joseph Weizenbaum). Why young people commit suicide is multifactorial. We’ll see whether a court will find OpenAI liable for wrongful death.

But I’m not a court of law. And OpenAI is not only responsible, but everyone who works there should be ashamed of themselves."