PhilipTheBucket

joined 3 months ago
[–] PhilipTheBucket@piefed.social 9 points 2 months ago (4 children)

If we assume this random YouTube man is credible, and I see no reason to doubt him, the DOJ can't even tie the suspected Kirk killer to the actual killing of Charlie Kirk.

[–] PhilipTheBucket@piefed.social 37 points 2 months ago (15 children)

Some kinds of progressive don’t see electoralism as worth their time.

...

You gotta do that last year, or a while from now once people have forgotten. Trying to say right now that it doesn't matter who wins elections is... not going to be convincing.

It's actually exactly like what happened with vaccines. We had so many years living in a society which didn't have active urgent throw-you-in-the-camps-for-no-reason tyranny that people stopped believing it was really real, and they're still out confidently saying it's not worth taking basic easy steps to prevent.

[–] PhilipTheBucket@piefed.social 2 points 2 months ago (1 children)

Yeah, I get it. I don't think it is necessarily bad research or anything. I just feel like maybe it would have been good to go into it as two papers:

  1. Look at the funny LLM and how far off the rails it goes if you don't keep it stable, let it kind of "build on itself" iteratively over time, and don't put the right boundaries on it
  2. Figure out how we should actually wrap an LLM in a sensible framework so that it can pursue an "agent" type of task: what leads it off the rails and what doesn't, and which ideas for keeping it grounded work and which don't

And yeah, obviously they can get confused or output counterfactuals or nonsense as a failure mode; what I meant was just that they don't really do that in response to an overload / "DDoS" situation specifically. They might do it as a result of too much context or a badly set up framework around them, sure.

[–] PhilipTheBucket@piefed.social 10 points 2 months ago

More or less as soon as I typed it I realized, you know what, that's a stupid question, I would be very very surprised if they don't get paid.

I still feel like the fact that it impacts their workplace directly is the reason they freak the fuck out about it and start actually trying, in a way they usually don't when someone else is getting kidnapped to an ICE facility or losing their workplace or home or family or life.

[–] PhilipTheBucket@piefed.social 14 points 2 months ago (3 children)

Does Congress still get paid during a shutdown?

I genuinely don't know the answer to that question, but if the answer is "no," then I think we've uncovered a significant clue. Personally, I feel like a huge part of the top Democrats' horrifying fecklessness on all of these types of issues is that, at the end of the day, they'll be fine, and you can sometimes divine their priorities by watching what does make them start sweating and working hard to avoid it.

[–] PhilipTheBucket@piefed.social 9 points 2 months ago (2 children)

Yeah. The aftermath was pretty telling too, about all kinds of things.

  1. NATO didn't back up Türkiye on the issue, which kind of put Türkiye in a fuckin' rough spot. This led to a bit of a falling out, and it's indicative that NATO is full of old ministers in safe offices who, among a few other things, don't have a real strong concept of loyalty.
  2. Erdogan turned on the pilots involved and made them the scapegoat shortly after, which is indicative that Erdogan's a piece of shit.
  3. Russia didn't mind at all. After Erdogan made some performative gestures of making nice, they turned right back around and started doing thriving business with Türkiye; they're still smuggling oil out through them to this day. This is indicative of how Putin perceives power and respect. All those aforementioned NATO ministers doing careful "escalation management," he perceives as a massive bunch of limp wankers, whereas if someone just shoots down his planes, expendable pilots and all, he's like "Jolly good, nice to know you've got some backbone in a scrap," and he's fine with you. There was none of this talk about how Türkiye was trying to start World War 3 by shoving back against his testing.

[–] PhilipTheBucket@piefed.social 26 points 2 months ago (1 children)

I developed a system with one person I was dating that if she was ever just unpleasant for no reason at all, I would stop whatever else was going on and make it a priority to feed her right away. She figured it out after a few times (while we were redirecting from what had been happening and into getting food), and she was sort of conflicted between being even more angry "how dare you, I am not some kind of Skinner box experiment, don't change the subject while I am giving you hell for (whatever)" and admitting that yes, okay, that is a very good strategy, let's eat and I will probably become happy.

[–] PhilipTheBucket@piefed.social 2 points 2 months ago

Fair point. Human endeavors operate a lot by habits and mental models, though. People generally will push back harder against government censorship when it happens if they've already got it firmly in mind that "hate speech" is a bullshit category that shouldn't exist. Once you start to say that hate speech shouldn't be allowed on, for example, Substack (which I think is the majority view now), it becomes a lot easier for the government to ban it (which I think is precisely what's happening, both in the US and in Canada, apparently).

[–] PhilipTheBucket@piefed.social 14 points 2 months ago* (last edited 2 months ago) (3 children)

Initial thought: Well... but this is a transparently absurd way to set up an ML system to manage a vending machine. I mean, it is a useful data point, I guess, but to me it leads to the conclusion "Even though LLMs sound to humans like they know what they're doing, they do not; don't just stick the whole situation into the LLM input and expect good decisions and strategies to come out of the output, you have to embed it into a more capable and structured system for any good to come of it."

Updated thought, after reading a little bit of the paper: Holy Christ on a pancake. Is this architecture what people have been meaning by "AI agents" this whole time I've been hearing about them? Yeah this isn't going to work. What the fuck, of course it goes insane over time. I stand corrected, I guess, this is valid research pointing out the stupidity of basically putting the LLM in the driver's seat of something even more complicated than the stuff it's already been shown to fuck up, and hoping that goes okay.

Edit: Final thought, after reading more of the paper: Okay, now I'm back closer to the original reaction. I've done stuff like this before, and this is not how you do it. Have it output JSON, build some tolerance and retries into the framework code for parsing that JSON, be more careful with the prompts so the model is set up for success, and definitely don't stuff the entire history into a wildly inflated context window to send it off the rails. Basically, be a lot more careful with the setup than this, and put a lot more limits on how much you're asking of the LLM, so that it can actually succeed within the little box you've put it in. I am not at all surprised that this setup went off the rails in hilarious fashion (and it really is hilarious, you should read it); that's just what LLMs do. I don't know whether the researchers didn't know any better, or they were deliberately setting up the framework around the LLM to produce bad results, or this stupid approach really is the state of the art right now, but this is not how you do it. I'm actually a little skeptical about whether you even could set up a framework that would enable a current-generation LLM to succeed at an objective and pretty frickin' complicated task like the one they set up here, but regardless, this wasn't a fair test. If it was meant as a test of "are LLMs capable of AGI all on their own, regardless of the setup, the way humans generally are," then congratulations, you learned the answer is no. But you could have framed it a little more directly around that being the answer, instead of putting a poorly designed agent framework in the middle of it.
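
To make that concrete, here's roughly the shape of framework I mean. This is just a minimal sketch in Python, not anything from the paper: the `call_llm` stub, the action schema, and the retry/history limits are all assumptions I'm making up for illustration. The point is that the framework, not the LLM, enforces the JSON format, retries on parse failures, bounds the context, and falls back to something safe.

```python
import json

MAX_HISTORY = 10       # assumption: only the last N turns go back in, never the whole run
MAX_PARSE_RETRIES = 3  # assumption: how many re-asks before we give up and do something safe

SYSTEM_PROMPT = (
    "You manage a vending machine. Respond ONLY with a JSON object of the form "
    '{"action": "set_price" | "restock" | "wait", "item": string or null, "value": number or null}.'
)

def call_llm(messages):
    """Stub for whatever LLM API is actually in use; returns the raw text of the reply."""
    raise NotImplementedError

def ask_for_action(history):
    # Bound the context: recent turns only, so the window never inflates over time.
    messages = [{"role": "system", "content": SYSTEM_PROMPT}] + history[-MAX_HISTORY:]

    for _ in range(MAX_PARSE_RETRIES):
        raw = call_llm(messages)
        try:
            action = json.loads(raw)
            # Minimal schema check so nonsense never reaches the "real world."
            if action.get("action") in {"set_price", "restock", "wait"}:
                return action
        except (json.JSONDecodeError, AttributeError):
            pass
        # Show the model its bad reply and ask again, instead of letting it drift.
        messages.append({"role": "assistant", "content": raw})
        messages.append({"role": "user", "content": "That was not valid JSON in the required schema. Reply with the JSON object only."})

    # Framework-level fallback: a safe no-op beats trusting a broken reply.
    return {"action": "wait", "item": None, "value": None}
```

Even that much structure doesn't make the underlying model any smarter; it just keeps the failure modes boring instead of spectacular.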

[–] PhilipTheBucket@piefed.social 43 points 2 months ago (5 children)

Yeah it's a bunch of shit. I'm not an expert obviously, just talking out of my ass, but:

  1. Running inference for all the devices in the building on "our dev server" would not have maintained a usable level of response time for any of them, unless he meant to say "the dev cluster" or something, and his home wifi glitched right at that moment and made it sound different
  2. LLMs don't degrade by giving wrong answers; they degrade by just not producing tokens anymore
  3. Meta already has shown itself to be okay with lying
  4. GUYS JUST USE FUCKING CANNED ANSWERS WITH THE RIGHT SOUNDING VOICE, THIS ISN'T ROCKET SCIENCE, THAT'S HOW YOU DO DEMOS WHEN YOUR SHIT'S NOT DONE YET