this post was submitted on 19 Nov 2025
72 points (96.2% liked)

Data Hoarder


I've been running OCR on the recent House Epstein email dump. I'm making this available now that it's close to finishing (20k/23k emails processed).

Processing script available here: https://codeberg.org/sillyhonu/Image_OCR_Processing_Epstein

I also put an analysis script in there if you want to use Drive/Colab.

Currently finished files are available here:

https://files.catbox.moe/xrgts0.sqlite
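
If you just want to see what's in the file before writing any queries, here's a rough sketch that lists every table and its row count. It doesn't assume anything about the schema (it reads the table names from `sqlite_master`), and `summarize_sqlite` is just a name made up for this example, not something from the repo.

```python
import sqlite3
from pathlib import Path


def summarize_sqlite(path):
    """Return {table_name: row_count} for every user table in a SQLite file."""
    # Open read-only via a file: URI so inspecting the download cannot modify it.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Skip SQLite's internal bookkeeping tables.
            if name.startswith("sqlite_"):
                continue
            # Table names cannot be bound as parameters, so quote them by hand.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Something like `summarize_sqlite("xrgts0.sqlite")` on the downloaded file should give you the table names to start from; what those tables and columns actually are depends on the processing script.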

[–] TropicalDingdong@lemmy.world 1 points 14 hours ago

I doubt you actually do, and I doubt most people do. The vast vast majority of email is sent and never even read.

You are only thinking about email used as direct correspondence, but how many random mass-mailer emails have landed in your inboxes today? Tens? Hundreds?

I have another figure I can send you, but let me get some coffee in me. It's a frequency analysis in time.