this post was submitted on 21 Nov 2025
990 points (97.6% liked)

Technology

[–] DandomRude@lemmy.world 61 points 4 days ago (2 children)

Even though Grok's manipulation is so blatantly obvious, I don't believe that most people will come to realize that those who control LLMs will naturally use that power to pursue their own interests.

They will continue to use ChatGPT and the like uncritically, taking everything at face value because it's so nice and easy, overlooking or ignoring that their opinions, even their reality, are being manipulated by a few influential people.

Other companies are more subtle about it, but from OpenAI to MS, Google, and Anthropic, all cloud models are specifically designed to control people's opinions. They are not objective, yet the majority of users do not question them as they should, and that is what makes them so dangerous.

[–] porcoesphino@mander.xyz 6 points 4 days ago (2 children)

There's huge risk here, but I don't think most are designed to control people's opinions. I think most are chasing the cheapest option, and it's expensive to have people upset about racist content, so they try to train around that, sometimes too much, leading to Black Nazi images, etc.

But yeah, it is a power that will get abused by more than just Grok.

[–] DandomRude@lemmy.world 9 points 4 days ago (2 children)

I use various AI models and I repeatedly notice that certain information is withheld or misrepresented, even though it is freely available in abundance and is therefore part of the training data.

I don't think this is a coincidence, especially since the operators of all cloud LLMs are so business-minded.

[–] cornshark@lemmy.world 7 points 4 days ago (2 children)

What do you find is being suppressed?

[–] DandomRude@lemmy.world 14 points 4 days ago (1 children)

For example, objective information about Israel's actions in Gaza. The International Criminal Court issued arrest warrants against leading members of the government a long time ago, and the UN OHCHR classifies the actions of the State of Israel as genocide. However, these facts are by no means presented as clearly as would be appropriate given the importance of these institutions. Instead, when asked whether Israel is committing genocide, one receives vague, meaningless answers. Only when specifically asked whether numerous reputable institutions actually classify Israel's actions as genocide do most LLMs reveal that much, if not all, evidence points to this being the case. In my opinion, this is a deliberate method of obscuring reality, as the vast majority of users will not or cannot ask questions if they are unaware of the UN OHCHR's assessment or do not know that arrest warrants have been issued against leading members of the Israeli government on suspicion of war crimes (many other reputable institutions have come to the same conclusion as the UN OHCHR and the International Criminal Court).

Another example: if you ask whether it is legally permissible to describe Donald Trump as a rapist, you will be told that this is defamation. However, a judge in the Carroll case has explicitly stated that this description applies to Trump – so it is in fact legally permissible to describe him as such. Again, this information is only available upon explicit request, if at all. This also distorts reality for people who are not yet informed. However, since many people initially seek information from LLMs, this leads to them being misinformed because they lack the background knowledge to ask explicit follow-up questions when given misleading answers.

Given the influence of both Israel and the US president, I cannot help but suspect that there is an intention behind this.

[–] markko@lemmy.world 3 points 3 days ago

Given the influence of both Israel and the US president, I cannot help but suspect that there is an intention behind this.

Not to mention the large number of Israelis (often former Mossad/intelligence agents) directly involved in US tech companies.

[–] smh@slrpnk.net 3 points 3 days ago

The online card catalog my library uses recently added an "AI Research Assistant" feature. While debating whether to turn it on (we decided against), this blog post came up. tl;dr: the assistant won't return results for certain topics. This is Not Cool. That (and the fact the flipping assistant can't even limit itself to suggesting resources available in our library) is a deal breaker, without even getting into other considerations.

[–] porcoesphino@mander.xyz 2 points 3 days ago (1 children)

A bunch of this could just be expected failure modes for LLMs. Do you have a list of short examples to get an idea?

[–] DandomRude@lemmy.world 2 points 3 days ago* (last edited 3 days ago) (1 children)

Yes, it's clear that some of this may have to do with the fact that even if cloud LLMs have live browsing capabilities, they often still rely on outdated information from their training data. I am simply describing my impressions from somewhat extensive use of cloud LLMs.

I don't have a list of examples, but in my comment above I have mentioned two that I find suspicious.

I simply think that these products should be used with skepticism as a matter of principle. This is simply because none of the companies that offer them are known for ethical behavior - quite the opposite.

In the case of Google, for example, I don't think it will be too long before (public) advertising opportunities are implemented in Gemini, because Google's business model is essentially the advertising business. The other cloud LLMs are also products of purely profit-oriented companies, and manipulating public opinion is a multi-billion-dollar business that they will certainly not want to miss out on. Social media platforms have demonstrated this in the past, as have Google and others with their "classic" search engines, targeting, and data-selling schemes. Whether this raises ethical issues is likely to be of little concern to these companies, as their only concern is profit.

The simple fact is that it is completely unclear what logic the providers use to regulate the output. It is equally unclear what criteria are used to select training data (here, too, the output can already be influenced by deliberately omitting certain information).

What I am getting at is that it can be assumed that all providers are interested in maximizing profits—and it is therefore likely that they will allow themselves to be paid to specifically promote certain topics, products, or even worldviews, or to withhold information that is unwelcome to wealthy interest groups.

As a regular user of cloud LLMs, I have the impression that this is already happening. I cannot prove this, though, as it would require systematic, scientific studies to demonstrate whether and to what extent manipulation occurs. Unfortunately, I do not know whether such studies already exist.

However, it is a fact that in the past, all technologies that could have been used to serve humanity have been massively abused for profit. I don't understand why it should be any different with cloud LLMs, which are offered exclusively by some of the world's largest corporations.

[–] porcoesphino@mander.xyz 2 points 3 days ago (1 children)

Yeah, I'm not disagreeing with the probable outcome here. I just think that, at this point in time, it's more likely the LLM output is doing its stochastic thing and your human brain is seeing patterns in it. But I was also curious how wrong I was, and that's part of why I asked for some examples. Not that I could really validate them.

[–] DandomRude@lemmy.world 1 points 3 days ago

Yes, that could well be the case. Perhaps I am overly suspicious, but because the potential of LLMs to influence public opinion is so high due to their reach and the way they present information, I think it is highly likely that the companies offering them are already profiting from this, or at least will do so very soon.

Musk is already demonstrating in his clumsy way that it is easily possible to manipulate the output in a targeted manner if you have full control over the model – and this isn't the first time he has attracted attention for doing so. You almost have to be grateful to him for it, because it's so obvious. If you do it more subtly, it's even more dangerous.

In any case, the fact is that the more people use LLMs, the more "interpretive authority" will be centralized, because the development and operation of LLMs is so costly that only a few large corporations can afford it – and they want to make money and are unscrupulous in doing so.

In any case, we will not be able to rely on people's ability to recognize attempts at manipulation. I think this is already evident from the fact that obvious misinformation on mainstream social media platforms and elsewhere is believed unquestioningly by so many people. Unfortunately, the effects are disastrous: if people were more critical, Trump would never have become US president, for example – certainly not twice.

[–] GamingChairModel@lemmy.world 4 points 3 days ago (1 children)

but I don't think most are designed to control people's opinions

Yeah I'm on team chaos theory. People can plan and design shit all they want, but the complexity will lead to unexpected behavior, always. How harmful that unwanted behavior is, or how easy it is to control or contain, is often unknown in advance, but invented things tend to develop far, far outside the initial vision of the creators.

[–] porcoesphino@mander.xyz 2 points 3 days ago

Yeah. Strongly agreed for most of the behaviour. I think it's most amusing in Grok, where obvious efforts have been made to steer the output beyond just guardrails and accuracy checks.

But the guy here talking about how these will be used to control people's information diet is probably right about how this turns out, unless there are changes to legislation (and I'm expecting any changes to be in the wrong direction), even if he's possibly misinterpreting some LLM output now.

[–] khepri@lemmy.world 3 points 4 days ago (2 children)

It's why I trust my random unauditable Chinese matrix soup over my random unauditable American matrix soup, frankly.

[–] DandomRude@lemmy.world 6 points 4 days ago (2 children)

You mean Deepseek on a local device?

[–] brucethemoose@lemmy.world 2 points 4 days ago* (last edited 4 days ago) (1 children)

Most aren't really running DeepSeek locally. What ollama advertises (and basically lies about) are the now-obsolete Qwen 2.5 distillations.

...I mean, some are, but it's exclusively lunatics with EPYC homelab servers, heh. And they are not using ollama.

[–] DandomRude@lemmy.world 2 points 3 days ago (2 children)

Thx for clarifying.

I once tried a community version from huggingface (distilled), which worked quite well even on modest hardware. But that was a while ago. Unfortunately, I haven't had much time to look into this stuff lately, but I wanted to check that again at some point.

[–] brucethemoose@lemmy.world 2 points 3 days ago* (last edited 3 days ago) (1 children)

Also, I’m a quant cooker myself. Say the word, and I can upload an IK quant more specifically tailored for whatever your hardware/aim is.

[–] DandomRude@lemmy.world 1 points 3 days ago (1 children)

Thank you! I might get back to you on that sometime.

[–] brucethemoose@lemmy.world 2 points 3 days ago

Do it!

Feel free to spam me if I don’t answer at first. I’m not ignoring you; Lemmy fails to send me reply notifications, sometimes.

[–] brucethemoose@lemmy.world 2 points 3 days ago

You can run GLM Air on pretty much any gaming desktop with 48GB+ of RAM. Check out ubergarm's ik_llama.cpp quants on Huggingface; that’s state of the art right now.

[–] khepri@lemmy.world 1 points 4 days ago (1 children)

Naw, I mean more that the kind of people who would uncritically take everything a chatbot says at face value are probably better off being in ChatGPT's little curated garden anyway. Cause people like that are going to immediately get grifted into whatever comes along first no matter what, and a lot of those are a lot more dangerous to the rest of us than a bot that won't talk great replacement with you.

[–] DandomRude@lemmy.world 1 points 4 days ago (1 children)

Ahh, thank you—I had misunderstood that, since Deepseek is (more or less) an open-source LLM from China that can also be used and fine-tuned on your own device using your own hardware.

[–] ranzispa@mander.xyz 1 points 4 days ago (3 children)

Do you have a cluster with 10 A100s lying around? Because that's what it takes to run DeepSeek. It is open source, but it is far from accessible to run on your own hardware.

[–] khepri@lemmy.world 1 points 3 days ago* (last edited 3 days ago)

I run quantized versions of DeepSeek that are usable enough for chat, and it's on a home setup that is so old and slow by today's standards I won't even mention the specs lol. Let's just say the rig is from 2018 and it wasn't near the best even back then.

[–] brucethemoose@lemmy.world 1 points 4 days ago* (last edited 4 days ago)

That's not strictly true.

I have a Ryzen 7800 gaming desktop, an RTX 3090, and 128GB of DDR5. Nothing that unreasonable. And I can run the full GLM 4.6 with quite acceptable token divergence compared to the unquantized model, see: https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF

If I had an EPYC/Threadripper homelab, I could run DeepSeek the same way.

[–] DandomRude@lemmy.world 1 points 4 days ago (1 children)

Yes, that's true. It is resource-intensive, but unlike other capable LLMs, it is somewhat possible—not for most private individuals due to the requirements, but for companies with the necessary budget.

[–] FauxLiving@lemmy.world 5 points 4 days ago (1 children)

They're overestimating the costs. 4x H100 and 512GB of DDR4 will run the full DeepSeek-R1 model; that's about $100k of GPU and $7k of RAM. It's not something you're going to have in your homelab (for a few years at least), but it's well within the budget of a hobbyist group or moderately sized local business.

Since it's an open-weights model, people have created quantized and distilled versions of it. Those can have far fewer parameters and/or fewer bits per weight, which makes their RAM requirements a lot lower.

You can run quantized versions of DeepSeek-R1 locally. I'm running deepseek-r1-0528-qwen3-8b on a machine with an NVIDIA 3080 12GB and 64GB RAM. Unless you pay for an AI service and are using their flagship models, it's pretty indistinguishable from the full model.

If you're coding or doing other tasks that push AI it'll stumble more often, but for a 'ChatGPT' style interaction you couldn't tell the difference between it and ChatGPT.
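
For what it's worth, actually talking to a locally hosted model like that is usually just an OpenAI-compatible API call, since llama.cpp's llama-server, LM Studio, and ollama all expose one. A minimal sketch in Python; the port and model name are placeholders that depend entirely on your local setup:

```python
# Minimal sketch: chat with a locally hosted quantized model through an
# OpenAI-compatible endpoint. The URL, port, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # hypothetical local llama-server / LM Studio address
    api_key="not-needed-locally",         # local servers generally ignore the key
)

response = client.chat.completions.create(
    model="deepseek-r1-0528-qwen3-8b",    # whatever name your server registers the model under
    messages=[{"role": "user", "content": "Give me a one-paragraph summary of quantized local models."}],
    temperature=0.7,
)
print(response.choices[0].message.content)
```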

[–] brucethemoose@lemmy.world 1 points 4 days ago* (last edited 4 days ago) (1 children)

You should be running hybrid inference of GLM Air with a setup like that. Qwen 8B is kinda obsolete.

I dunno what kind of speeds you absolutely need, but I bet you could get at least 12 tokens/s.

[–] FauxLiving@lemmy.world 1 points 3 days ago (1 children)

Thanks for the recommendation, I'll look into GLM Air, I haven't looked into the current state of the art for self-hosting in a while.

I just use this model to translate natural language into JSON commands for my home automation system. I probably don't need a reasoning model, but it doesn't need to be super quick. A typical query uses very few tokens (like 3-4 keys in JSON).
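
Roughly, that translation step can be as simple as a constrained prompt plus a JSON parse. A rough sketch of the idea; the endpoint, model name, schema, and device names here are made up for illustration:

```python
# Rough sketch: natural language -> JSON command via a local model.
# Endpoint, model name, schema, and device names are all illustrative.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="local")

SYSTEM = (
    "Translate the user's request into a JSON object with the keys "
    "'device', 'action', and optionally 'value'. Reply with JSON only."
)

def to_command(text: str) -> dict:
    reply = client.chat.completions.create(
        model="deepseek-r1-0528-qwen3-8b",  # placeholder model name
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": text},
        ],
        temperature=0,
    )
    raw = reply.choices[0].message.content
    # Reasoning models often prepend a <think>...</think> block; drop it before parsing.
    if "</think>" in raw:
        raw = raw.split("</think>", 1)[1]
    return json.loads(raw.strip())

print(to_command("dim the living room lights to 30 percent"))
# e.g. {"device": "living_room_lights", "action": "set_brightness", "value": 30}
```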

The next project will be some kind of agent. A 'go and Google this and summarize the results' agent at first. I haven't messed around much with MCP Servers or Agents (other than for coding). The image models I'm using are probably pretty dated too, they're all variants of SDXL and I stopped messing with ComfyUI before video generation was possible locally, so I gotta grab another few hundred GB of models.

It's a lot to keep up with.😮‍💨

[–] brucethemoose@lemmy.world 2 points 3 days ago (1 children)

It’s a lot to keep up with

Massive understatement!

The next project will be some kind of agent. A ‘go and Google this and summarize the results’

Yeah, you do want more contextual intelligence than an 8B for this.

The image models I’m using are probably pretty dated too

Actually SDXL is still used a lot! Especially for the anime stuff. It just got so much finetuning and tooling piled on.

[–] FauxLiving@lemmy.world 1 points 3 days ago (1 children)

Yeah, you do want more contextual intelligence than an 8B for this.

Oh yeah, I'm sure. I may peek at it this weekend. I'm trying to decide if Santa is going to bring me a new graphics card, so I need to see what the price:performance curve looks like.

Massive understatement!

I think I stopped actively using image generation a little bit after LoRAs and IP Adapters were invented. I was trying to edit a video (a random meme gif) to change the people in the meme to have the faces of my family, but it was very hard to get consistency between frames. Since generated video exists now, it seems like someone has solved this problem.

[–] brucethemoose@lemmy.world 2 points 3 days ago* (last edited 3 days ago) (1 children)

Since generated video exists now, it seems like someone has solved this problem.

Oh yes, it has come a LOONG way. Some projects to look at are:

https://github.com/ModelTC/LightX2V

https://github.com/deepbeepmeep/Wan2GP

And for images: https://github.com/nunchaku-tech/nunchaku

Video generation/editing is very GPU heavy though.


I dunno what card you have now, but with text LLMs (or image+text input LLMs), hybrid CPU+GPU inference is the trend these days.

As an example, I can run GLM 4.6, a 350B LLM, with measurably low quantization distortion on a 3090 + 128GB CPU RAM, at like 7 tokens/s. If you would’ve told me that 2-4 years ago, my head would have exploded.

You can easily run GLM Air (or other good MoE models) on like a 3080 + system RAM, or even a lesser GPU. You just need the right software and quant.

[–] FauxLiving@lemmy.world 2 points 3 days ago (2 children)

Thanks a ton, saves me having to navigate the slopped-up search results ('AI' as a search term has been SEO'd to death and back a few times).

I dunno what card you have now, but hybrid CPU+GPU inference is the trend these days.

That system has the 3080 12GB and 64GB RAM but I have another 2 slots so I could go up to 128GB. I don't doubt that there's a GLM quant model that'll work.

Is ollama for hosting the models and LM Studio for chatbot work still the way to go? Doesn't seem like there's much to improve in that area once there's software that does the thing.

[–] brucethemoose@lemmy.world 2 points 3 days ago* (last edited 3 days ago)

And IMO… your 3080 is good for ML stuff. It’s very well supported. It’s kinda hard to upgrade, in fact, as realistically you're either looking at a 4090 or a used 3090 for an upgrade that’s actually worth it.

[–] brucethemoose@lemmy.world 2 points 3 days ago* (last edited 3 days ago)

Oh no, you got it backwards. The software is everything, and ollama is awful. It’s enshittifying: don’t touch it with a 10 foot pole.


Speeds are basically limited by CPU RAM bandwidth. Hence you want to be careful doubling up RAM: populating more slots can lower the max supported RAM speed (and hence cut your inference speed).

Anyway, start with this. Pick your size, based on how much free CPU RAM you want to spare:

https://huggingface.co/ubergarm/GLM-4.5-Air-GGUF

The “dense” parts will live on your 3080 while the “sparse” parts will run on your CPU. The backend you want is this, specifically the built-in llama-server:

https://github.com/ikawrakow/ik_llama.cpp/

Regular llama.cpp is fine too, but its quants just aren't quite as optimal or fast.

It has two really good built-in web UIs: the “new” llama.cpp chat UI, and mikupad, which is like a “raw” notebook mode more aimed at creative writing. But you can use LM Studio if you want, or anything else; there are like a bazillion frontends out there.
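
If it helps, pulling one of those quants down is just a Hugging Face download; a rough sketch (the file pattern is a guess, so check the repo page for the actual quant names and pick a size that fits your free RAM + VRAM):

```python
# Rough sketch: fetch one of ubergarm's GLM-4.5-Air GGUF quants.
# The allow_patterns value is a guess at the file layout; check the repo
# for the actual quant names/sizes before downloading.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="ubergarm/GLM-4.5-Air-GGUF",
    allow_patterns=["*IQ4*"],        # hypothetical pattern matching one quant size
    local_dir="models/GLM-4.5-Air",  # wherever you keep your GGUFs
)
# Then point ik_llama.cpp's built-in llama-server at the downloaded .gguf file(s).
```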

[–] RaoulDook@lemmy.world 2 points 3 days ago (1 children)

Trusting any of that shit is the problem.

[–] khepri@lemmy.world 1 points 3 days ago

There you go. Any of these things is just another datapoint. You need many datapoints to decide if the information you're getting is valuable and valid.