this post was submitted on 04 Jul 2025
116 points (99.2% liked)

Programmer Humor

24736 readers
1523 users here now

Welcome to Programmer Humor!

This is a place where you can post jokes, memes, humor, etc. related to programming!

For sharing awful code theres also Programming Horror.

Rules

founded 2 years ago
MODERATORS
 
you are viewing a single comment's thread
view the rest of the comments
[โ€“] Azzu@lemmy.dbzer0.com 3 points 12 hours ago (1 children)

Because this is not log or debug data as OP said. In any case, what do you think would happen with this data? It will be analyzed by some sort of tool because no one could manually look at this much text data. In text, this can be like 1MB of data per second. So in a normal eye tracking session, probably hundreds of MB. The problem isn't the storage space, but the time it will take to read that in and analyze it each time, forcing you to wait for processing or use lots of memory while reading it. And anyway, in most languages, it's actually much easier to store the number values directly (in 8 bytes not the 30something this text representation uses) than to convert them to JSON, all languages have some built-in way to do that. And even if not, sqlite is piss-easy and does everything for you, being as simple as JSON.

There is just no reason to do it like that unless you just don't think about what you're doing or have no clue.

[โ€“] nous@programming.dev 1 points 5 hours ago

export tracking data to analyze later on

That is essentially log data or essentially equivalent. Log data does not have to be human readable, it is just a series of events that happen over time. Most log data, even what you would think of as traditional messages from a program, is not parsed by humans manually but analyzed by code later on. It is really not that hard to slow to process log data line by line. I have done this with TB of data before which does require a lot more effort to do. A simple file like this would take seconds to process at most, even if you were not very efficient about it. I also never said it needed to be stored as text, just a simple file is enough - no need for a full database. That file could be binary if you really need it to be but text serialization would also be good enough. Most of the web world is processed via text serialization.

The biggest problem with yaml like in OP is the need to decode the whole file at once since it is a single list. Line by line processing would be a lot easier to work with. But even then if it is only a few 100 MBs loading it all in memory once and analyzing it all in memory would not take long at all - it just does not scale very well.