this post was submitted on 19 Nov 2025

Data Hoarder


I've been running OCR on the recent House Epstein email dump. I'm making this available now that it's close to finishing (20k/23k emails processed).

Processing script available here: https://codeberg.org/sillyhonu/Image_OCR_Processing_Epstein

I also put an analysis script in there if you want to use Drive/Colab.

The files finished so far are available here:

https://files.catbox.moe/xrgts0.sqlite
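
If you want to see what's in the SQLite file before running any analysis, a quick first step is to list its tables and columns. This is only a sketch: it assumes nothing about the actual schema, and `describe_sqlite` and the local filename in the usage note are names made up for illustration.

```python
import sqlite3


def describe_sqlite(path):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite file.

    Hypothetical helper: it only reads schema metadata, so it works
    without knowing the table or column names in advance.
    """
    conn = sqlite3.connect(path)
    try:
        # sqlite_master lists every object in the database; keep only tables.
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # Quote the identifier so unusual table names don't break the PRAGMA.
            quoted = '"' + table.replace('"', '""') + '"'
            # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [(col[1], col[2]) for col in columns]
        return schema
    finally:
        conn.close()
```

After downloading the file, something like `describe_sqlite("xrgts0.sqlite")` (assuming you saved it under that name) gives you the table layout to write queries against.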

TropicalDingdong@lemmy.world 11 points 17 hours ago

I would caution against over-interpreting this. The only direct mention of bubba getting a BJ is that one email thread.

And a few more things. One, the email dump has a huge number of "digest"-type emails, like summaries of forum conversations. 10035 and 10053 are from a forum they seemed to be tracking, so Epstein didn't say those things; some commenter in the forum did.

Also, they were regularly being forwarded articles. Like, a lot of articles, from a lot of people, and the articles were often about them. So in some ways this contaminates the emails, because it creates a set of names, dates, and locations that came from someone just sending an article about Epstein to Epstein.
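
One rough way to separate this kind of secondhand content before counting names or dates is to flag messages that look forwarded or digest-like. This is a crude heuristic sketch, not how the linked scripts work: `looks_secondhand` is a hypothetical helper, the markers are just common mail-client conventions, and it assumes you can pull a subject and body string out of each record.

```python
import re

# Subject prefixes used for forwards, e.g. "Fwd:" or "FW:".
FORWARD_SUBJECT = re.compile(r"^\s*fwd?\s*:", re.IGNORECASE)
# Digest-style mailings usually say so in the subject line.
DIGEST_SUBJECT = re.compile(r"\bdigest\b", re.IGNORECASE)
# Separator lines that mail clients insert above forwarded or quoted content.
FORWARD_BODY = re.compile(r"forwarded message|original message", re.IGNORECASE)


def looks_secondhand(subject, body):
    """Return True if the message looks forwarded or digest-like.

    Heuristic only: it will miss pasted articles with no marker and
    will flag any original message that happens to use these phrases.
    """
    subject = subject or ""
    body = body or ""
    return bool(
        FORWARD_SUBJECT.search(subject)
        or DIGEST_SUBJECT.search(subject)
        or FORWARD_BODY.search(body)
    )
```

Anything flagged still needs a manual read; the point is only to keep forwarded articles and forum digests from being counted as things the sender wrote.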