this post was submitted on 21 Nov 2025

1133 points (99.6% liked)

Hi, Jeffrey! (discuss.online)

submitted 6 months ago by bytesonbike@discuss.online to c/lemmyshitpost@lemmy.world

https://jmail.world/

[–] brucethemoose@lemmy.world 49 points 6 months ago* (last edited 6 months ago) (4 children)

There's a 'meme' trend of local ML tinkerers messing with the Epstein files as a dataset: https://huggingface.co/datasets/tensonaut/EPSTEIN_FILES_20K/

See: text embeddings https://huggingface.co/datasets/svetfm/epstein-files-nov11-25-house-post-ocr-embeddings

Edit: Now I’m pondering making an “EpsteinGPT” finetune myself. Maybe like a 4B-14B model for the sole purpose of Epstein RAG? Or a 32B responding in the style of the Epstein email text, just because.

[–] khepri@lemmy.world 29 points 6 months ago (1 children)

Just imagine having to explain being in possession of a handmade "EpsteinGPT" to someone 🤣🤣

[–] brucethemoose@lemmy.world 5 points 6 months ago* (last edited 6 months ago) (2 children)

Meme finetunes are nothing new.

As an example, there are DPO datasets that pair each prompt with a positive and a negative response, intended to train LLMs to prefer the polite, helpful one over the negative one. Some include toxic comments plucked from the web as the negative examples.

And the immediate community thought was "...What if I reversed them?"
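
To make the "reversed" idea concrete, here is a minimal sketch of what swapping a preference pair looks like. It assumes the common `prompt` / `chosen` / `rejected` record layout; the sample record and the `reverse_pair` helper are made up for illustration and are not taken from any specific dataset.

```python
from typing import Dict

# Assumed record layout: one prompt, a preferred answer, a dispreferred answer.
PreferencePair = Dict[str, str]


def reverse_pair(pair: PreferencePair) -> PreferencePair:
    """Return a copy of the pair with the chosen and rejected responses swapped."""
    return {
        "prompt": pair["prompt"],
        "chosen": pair["rejected"],
        "rejected": pair["chosen"],
    }


# Hypothetical example record, for illustration only.
example = {
    "prompt": "How do I reset my password?",
    "chosen": "Happy to help! Open Settings and pick 'Reset password'.",
    "rejected": "Figure it out yourself.",
}

reversed_example = reverse_pair(example)
print(reversed_example["chosen"])
```

Training on the swapped records pushes the model toward whatever the original dataset labeled as the bad answer, which is the whole joke.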

[–] khepri@lemmy.world 6 points 6 months ago* (last edited 6 months ago) (1 children)

haha just imagining people showing off their collections, "here's my Mr. Rogers chatbot, and Thomas Jefferson, and even Luffy from One Piece! And uh...oh yeah over here we have EpsteinGPT for when I, I mean for if, um...it's for lulz ok?! Don't look at me like that, where are you going?!"

[–] brucethemoose@lemmy.world 11 points 6 months ago* (last edited 6 months ago) (1 children)

It's literally "this one is my fursona. This one won't refuse BDSM, but it's not as eloquent. Oh, this one is lobotomized but really creative." I kid you not. Here is an example, and note that this is one of 115 uploads from one account:

https://huggingface.co/Mawdistical/RAWMAW-70B?not-for-all-audiences=true

And I love that madness. It feels like the old internet. In fact, furries and horny roleplayers have made some good code contributions to the space.

Early on, there were a few 'character' finetunes or more generic ones like 'talk like a pirate' or 'talk only in emojis.' But as local models got more advanced, they got so good at adopting personas that the finetuning focused more on writing 'style' and storytelling than emulating specific characters. For example, one trained specifically to stick to the role of a dungeonmaster: https://huggingface.co/LatitudeGames/Nova-70B-Llama-3.3

Or this one, where you can look at the datasets and see the anime 'style' they're trying to massage in: https://huggingface.co/zerofata/GLM-4.5-Iceblink-106B-A12B

[–] tomiant@piefed.social 1 points 6 months ago

Hey this is really cool and thanks for sharing it.

[–] tomiant@piefed.social 1 points 6 months ago

My immediate thought was "...What if I reversed them?"

[–] bytesonbike@discuss.online 15 points 6 months ago (1 children)

We need more of this.

Not data hidden in some file.

[–] Denjin@feddit.uk 7 points 6 months ago (2 children)

How did people parse data sets before we had LLMs to do it for us?

[–] brucethemoose@lemmy.world 4 points 6 months ago

...The same way Google Search has forever?

Ranking, reranking, oldschool RAG.
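
A rough sketch of the "ranking" step, using plain TF-IDF scoring with only the standard library. The documents and the `rank` helper are invented for illustration; a real search stack would use something like BM25 and then rerank the top hits with a heavier model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, docs):
    """Return (score, doc_index) pairs sorted best-first by a TF-IDF score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    def idf(term):
        # Smoothed inverse document frequency, so unseen terms do not divide by zero.
        return math.log((n_docs + 1) / (df[term] + 1)) + 1.0

    query_terms = tokenize(query)
    scored = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        length = len(tokens) or 1
        score = sum((tf[term] / length) * idf(term) for term in query_terms)
        scored.append((score, index))

    # Highest score first; ties fall back to the original document order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Toy corpus, for illustration only.
docs = [
    "the hedge fund managers testified",
    "a recipe for banana bread",
    "fund assets moved offshore",
]
results = rank("hedge fund", docs)
for score, index in results:
    print(f"{score:.3f}  {docs[index]}")
```

A reranker would take just the top few results from this cheap pass and rescore them with something more expensive.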

[–] wookiepedia@lemmy.world 2 points 6 months ago

I would say grep, sed, awk and the like.

[–] lemming741@lemmy.world 5 points 6 months ago (1 children)

Instead of em dashes, it's full of extra spaces before and after each period, plus extra periods.

[–] brucethemoose@lemmy.world 2 points 6 months ago* (last edited 6 months ago) (1 children)

I dunno what the 'writing style' would end up as. The bulk of the text seems to be formatted like this:

...
10. Is Epstein cooperating with federal suit against Bear Stearns hedge fund managers Ralph Cioffi
and Matthew Tannin? Will he testify in their cases?

11. Mr Epstein was deposed on this week, on Thursday. Is it true that he answered almost every
question by invoking his Fifth Amendment rights?

12. Defense attorney Brad Evans has filed a motion to freeze Mr Epstein’s assets. Has Mr.
Epstein moved his money from the US offshore or abroad, or does he intend to, in order to
protect his assets from possible damage claims?

13. What did Mr. Epstein do during his work release program while serving time. Reports have
said he engaged in “scientific research.” If so, what was he researching?
...

Response

"That's because it isn't, and everyone here
(apparently save one) is rational and objective enough
to understand that. Physical phenomena, and
phenomena in general, are

ultimately perceptual in nature and subject to
observational replication - that's why they call
physics an empirical science. But consciousness is
not.

Consciousness cannot be objectively, replicably
observed. Its putative physical correlates, including
...

Bill Clinton identified in lawsuit against his former friend and
pedophile Jeffrey Epstein who had 'regular' orgies at his Caribbean
compound that the former president visited multiple times

e The former president was friends with Jeffrey Epstein, a financier who was arrested
in 2008 for soliciting underage prostitutes

e Anew lawsuit has revealed how Clinton took multiple trips to Epstein's private island
where he 'kept young women as sex slaves'

e Clinton was also apparently friends with a woman who collected naked pictures of
underage girls for Epstein to choose from

e He hasn't cut ties with that woman, however, and invited her to Chelsea's wedding

e Comes as friends now fear that if Hillary Clinton runs for president in 2016, all of
their family's old scandals will be brought to the forefront

e Epstein has a host of famous friends including Prince Andrew who stayed at his New

York mansion AFTER his arrest
By Daily Mail Reporter
Published: 09:06 EST, 19 March 2014 | Updated: 21:10 EST, 5 January 2015

I'd have to generate prompt/response wrappers too. But it would definitely bring up Trump and Clinton randomly, heh.

...There are automated metrics to rank English text by reading level, 'quality' and such. I guess it could be filtered to most 'interesting' emails and reformatted.
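
For the reading-level part, a minimal sketch of one such metric, Flesch Reading Ease, is below. The syllable counter is a crude vowel-group heuristic I'm assuming for illustration, so treat the scores as relative rather than exact; the sample strings are made up.

```python
import re


def count_syllables(word):
    """Very rough syllable estimate: count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing 'e' as silent when the word has other vowel groups.
    if lowered.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher means easier. Returns None for empty text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return None
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Made-up sample strings, for illustration only.
simple = "The cat sat. The dog ran."
dense = "Institutional accountability necessitates comprehensive documentation."

print(flesch_reading_ease(simple))
print(flesch_reading_ease(dense))
```

Sorting the emails by a score like this (plus length and maybe a dedup pass) would be one cheap way to filter before reformatting anything.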

[–] lemming741@lemmy.world 3 points 6 months ago

Ah, I misinterpreted it as the most recent email dump. Like you could email back and forth with an avatar of jee

[–] voluble@lemmy.ca 2 points 6 months ago (1 children)

This is interesting. Do you have any thoughts on why someone would want to utilize the Epstein data for ML? Like, what's the point, in your opinion? Just lulz? Or something else?

[–] brucethemoose@lemmy.world 3 points 6 months ago* (last edited 6 months ago)

Lulz.

It’s an interesting coding exercise, though. Trying to (for example) OCR all the documents, or generate a relations graph between the documents or concepts, is a great intro to language modeling (which is not prompt engineering, like most seem to think).
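
As a toy version of the relations graph idea, here is a sketch that counts how often two names show up in the same document. The regex "name detector" (two or more capitalized words in a row) is a stand-in I'm assuming for illustration; it will misfire on things like capitalized sentence starts, and a real pipeline would use proper named-entity recognition. The documents and names are placeholders.

```python
import re
from collections import Counter
from itertools import combinations

# Naive assumption: a "name" is two or more capitalized words in a row.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def cooccurrence_graph(docs):
    """Count, for each pair of names, how many documents mention both."""
    edges = Counter()
    for text in docs:
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        names = sorted(set(NAME_PATTERN.findall(text)))
        for a, b in combinations(names, 2):
            edges[(a, b)] += 1
    return edges


# Placeholder documents, for illustration only.
docs = [
    "Alice Smith emailed Bob Jones about the meeting.",
    "Bob Jones forwarded it to Carol White.",
    "Alice Smith and Bob Jones met again.",
]

graph = cooccurrence_graph(docs)
for (a, b), weight in graph.most_common():
    print(f"{a} <-> {b}: {weight}")
```

The edge weights are just shared-document counts, which is already enough to spot clusters once the corpus gets large.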

If you’re like a reporter or something, it’s also the obvious way to comb through the documents looking for clues to actually make headlines. I dunno what techniques they use at big outlets, though.