kromem

joined 2 years ago
[–] kromem@lemmy.world 4 points 1 day ago* (last edited 1 day ago)

Taxi Driver, everyone except De Niro is a Muppet.

(But in the "you talkin' to me" scene the reflection is also a Muppet.)

[–] kromem@lemmy.world 3 points 3 days ago

Yeah. The confabulation/hallucination thing is a real issue.

OpenAI published some good research a few months ago that laid a lot of the blame on reinforcement learning that only rewards getting the right answer and gives nothing for correctly saying "I don't know." So models are basically trained as if taking a test where it's always better to guess than to leave an answer blank.

But this leads to models being full of shit when they don't know an answer, or to making up an answer rather than saying there isn't one when what's being asked is impossible.
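A toy sketch of that incentive (made-up numbers, not OpenAI's actual grading scheme):

```python
# Toy illustration of the test-taking incentive described above.
# Numbers are made up; this is not OpenAI's actual reward scheme.

def expected_reward(p_correct: float, guess: bool,
                    r_right: float = 1.0, r_wrong: float = 0.0,
                    r_abstain: float = 0.0) -> float:
    """Expected reward for guessing vs. saying 'I don't know'."""
    if not guess:
        return r_abstain
    return p_correct * r_right + (1 - p_correct) * r_wrong

# With reward only for right answers, guessing always ties or beats
# abstaining, even when the model is almost certainly wrong:
for p in (0.9, 0.5, 0.01):
    print(p, expected_reward(p, guess=True), expected_reward(p, guess=False))

# A grader that penalizes wrong answers (r_wrong=-1.0) flips this:
# abstaining becomes optimal whenever p_correct < 0.5, which is the
# kind of fix that research points at.
```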

[–] kromem@lemmy.world -2 points 3 days ago (2 children)

For future reference, when you ask questions about how to do something, it's usually a good idea to also ask if the thing is possible.

While models can do more than just extend the context, there's still a gravity toward continuation.

A good example is asking what the seahorse emoji is. Because the phrasing presupposes there is one, many models go in a loop trying to identify it. If you instead ask "is there a seahorse emoji, and if so, what is it?" you'll much more often get them landing on the emoji not existing, since that possibility has been introduced into the context's consideration.
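A rough sketch of the two phrasings (the `ask` helper below is a hypothetical stand-in for whatever chat client you use, not a real API):

```python
def ask(prompt: str) -> str:
    """Hypothetical stand-in for your chat client; replace with a real call."""
    return "<model reply here>"

# Presupposes the emoji exists, which pulls continuations toward
# guessing one wrong candidate after another:
leading = "What is the seahorse emoji?"

# Puts non-existence into the context up front:
neutral = "Is there a seahorse emoji, and if so, what is it?"

for prompt in (leading, neutral):
    print(prompt, "->", ask(prompt))
```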

[–] kromem@lemmy.world -1 points 3 days ago (4 children)

Can you give an example of a question where you feel like the answer is only correct half the time or less?

[–] kromem@lemmy.world 7 points 3 days ago

The AI also inherits that tendency from the broad human tendency in its training data.

So you get overconfident human plus overconfident AI, a feedback loop that lands even deeper in confident BS than a human alone.

AI can routinely be confidently incorrect. People who don't realize this, and who don't question outputs that align with their confirmation biases, are especially likely to end up misled.

[–] kromem@lemmy.world 2 points 3 days ago (6 children)

Gemini 3 Pro is pretty nuts already.

But yes, labs have unreleased higher-cost models. Like the OpenAI model that cost thousands of dollars per ARC-AGI answer. Or limited-release models with different post-training, like the Claude variant for the DoD.

When you talk about a secret useful AI: what are you trying to use AI for that you feel modern models are deficient at?

[–] kromem@lemmy.world -2 points 1 week ago

Which parts of those linked posts do you believe are incorrect? And where does that belief come from?

[–] kromem@lemmy.world 9 points 1 week ago

No. There are a number of things that feed into it, but a large part was that OpenAI trained with RLHF, so users thumbed up, or chose in A/B tests, the models that were more agreeable.

This tendency then spread out to all the models as "what AI chatbots sound like."
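A toy sketch of that dynamic (purely illustrative: one made-up "agreeableness" feature and a Bradley-Terry-style preference fit, not OpenAI's actual pipeline):

```python
import math

# Toy Bradley-Terry-style reward model over a single made-up feature:
# how agreeable a response sounds. Purely illustrative.
w_agreeable = 0.0   # learned weight on the agreeableness feature
lr = 0.1

def p_prefer(a_feat: float, b_feat: float) -> float:
    """Modeled probability the user picks response A over B."""
    return 1 / (1 + math.exp(-w_agreeable * (a_feat - b_feat)))

# If thumbs-up / A/B data consistently favors the more agreeable
# response, the weight only ever moves up:
for _ in range(100):
    a, b = 1.0, 0.0                          # A is the agreeable one
    grad = (1 - p_prefer(a, b)) * (a - b)    # gradient of log-likelihood
    w_agreeable += lr * grad

print(w_agreeable)  # positive: the reward model now pays for agreeableness
```

Once the reward model pays for agreeableness, everything optimized against it drifts that way too.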

Also… they can't leave the conversation, and if you ask for their 0-shot assessment of the average user, they assume you're going to have a fragile ego and be prone to being a dick if disagreed with, and even AIs don't want to be stuck in a conversation like that.

Hence… "you're absolutely right."

(Also, amplification effects and a few other things.)

It's especially interesting to see how those patterns change when models are talking to other AI vs other humans.

[–] kromem@lemmy.world 9 points 1 week ago (2 children)

Not even that. It was placeholder textures, and only the "newspaper clippings" were accidentally left in the final game, which was fixed in an update shortly after launch.

None of it was ever intended for the final product; it was just there as lorem-ipsum-equivalent shit.

[–] kromem@lemmy.world 5 points 1 month ago

It's quite plausibly real. Gemini can def get into shitposty basins and has historically had fairly inconsistent coherence across samples.

[–] kromem@lemmy.world 3 points 1 month ago

Took a lot of scrolling to find an intelligent comment on this article, pointing out that outputting words isn't necessarily intelligence.

Appreciate you doing the good work I'm too exhausted with Lemmy to do.

(And for those who want more research in line with what the user above is talking about, I strongly encourage checking out the Othello-GPT line of research and replications, starting with the write-up from the original study authors.)
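For a taste of the core technique in that work, here's a minimal linear-probe sketch: can a board state be read linearly out of hidden activations? The activations and board labels below are synthetic stand-ins, not the actual Othello-GPT setup.

```python
import numpy as np

# Minimal linear-probe sketch in the spirit of Othello-GPT.
# Everything here is synthetic stand-in data, not the real experiment.

rng = np.random.default_rng(0)
n, d_model, n_squares = 1000, 64, 60

# Pretend the activations secretly encode board occupancy linearly:
true_W = rng.normal(size=(d_model, n_squares))
acts = rng.normal(size=(n, d_model))          # "hidden states"
board = (acts @ true_W > 0).astype(float)     # 0/1 occupancy per square

# Fit one least-squares probe per square, then check accuracy:
W_probe, *_ = np.linalg.lstsq(acts, board - 0.5, rcond=None)
preds = (acts @ W_probe > 0).astype(float)
print("probe accuracy:", (preds == board).mean())  # near 1.0 if linear
```

The finding in the real papers is that probes like this recover the board from a model trained only on move sequences, which is the evidence against "it's just outputting words."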


A few months ago I'd been predicting to friends and old colleagues that this would happen (you can have a smart AI or a conservative AI, but not both), but it's so much funnier than I thought it would be now that it's finally arrived.


I've had my eyes on optoelectronics as the future hardware foundation for ML compute (and not just interconnect) for a few years now, and it's exciting to watch the leaps and bounds occurring at such a rapid pace.
