this post was submitted on 04 Apr 2025

I'm doing some Galois Field / Cyclic Redundancy Check research for fun and I've come across an intriguing pattern that I need a data structure for.

Across the 64-bit (or even 128-bit or larger) spaces, I've discovered an interesting pattern relating to Hamming distances that I'd like a data structure to represent.

I'm going to need something on the order of billions of intervals, each containing anywhere from 1 to ~1 billion items. And I'd like to quickly (O(1) or O(lg(n))) determine whether other intervals intersect them.


For 32-bit space I can simply make a 512MB bitmask lol and then AND/OR two bitmasks together. Easy.
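To make the bitmask idea concrete, here is a minimal sketch that uses a Python integer as the bitmask over a small universe. The function names (`make_mask`, `intersects`, `union`) are made up for illustration, and closed intervals `[lo, hi]` are an assumption. At full 32-bit scale the same idea needs 2^32 bits, which is the 512MB mentioned above.

```python
def make_mask(intervals):
    """Return an int with bit i set for every i in the given closed intervals."""
    mask = 0
    for lo, hi in intervals:
        # (1 << length) - 1 is a run of `length` one-bits; shift it up to start at lo.
        mask |= ((1 << (hi - lo + 1)) - 1) << lo
    return mask


def intersects(mask_a, mask_b):
    """Two sets share at least one element iff their AND is nonzero."""
    return (mask_a & mask_b) != 0


def union(mask_a, mask_b):
    """Merging two sets is a single OR."""
    return mask_a | mask_b
```

The cost is proportional to the size of the universe rather than the number of intervals, which is why this stops being practical at 64 bits.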

But for 64-bit space I'm stuck and a bit ignorant of the various data structures. I'm wondering if someone out there has a good data structure for me to use?

I've read over Interval Trees on Wikipedia. I'm also considering a binary decision diagram over the 64 bits, actually. Finally, I'm thinking of some kind of 1-dimensional octree-like data structure (is that just a binary tree?? Lol). BVH trees in 3D space seem similar to my problem; I just need it optimized down to 1 dimension rather than 3. Anyone have any other ideas or cool data structures that might work?

[email protected] 5 points 1 week ago (2 children)

PH-trees can do range and closest queries across N dimensions very quickly. I have not used one for 1 dimension, but I'd imagine it would work fine.

https://github.com/tzaeschke/phtree

[email protected] 2 points 5 days ago* (last edited 5 days ago)

Great answer!!

After thinking about all this for a while, I've gone with the basic binary tree (leaning towards an AVL tree, as I expect my use case to be read heavy).

In my use case, multiple 'intervals' can merge together without major penalty (and should be merged together). It looks like a lot of these interval trees (including ph trees) are best when the intervals need to be kept separate.
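A rough sketch of the "merge on insert" idea, for anyone following along. To keep it short it uses two sorted Python lists with `bisect` as a stand-in for the balanced tree, so queries are O(lg(n)) but inserts are O(n) because of list shifting; an AVL tree keyed on interval start would avoid that. The class name `IntervalSet`, the closed integer intervals, and the choice to also merge adjacent intervals (like [10, 20] and [21, 29]) are all assumptions for illustration.

```python
import bisect


class IntervalSet:
    """Disjoint closed integer intervals [lo, hi], kept sorted and merged."""

    def __init__(self):
        self.los = []  # interval starts, sorted ascending
        self.his = []  # matching interval ends; also sorted because intervals are disjoint

    def add(self, lo, hi):
        # First stored interval that overlaps or touches [lo, hi]: its end is >= lo - 1.
        i = bisect.bisect_left(self.his, lo - 1)
        # One past the last stored interval that overlaps or touches: its start is <= hi + 1.
        j = bisect.bisect_right(self.los, hi + 1)
        if i < j:
            # Absorb every interval in [i, j) into the new one.
            lo = min(lo, self.los[i])
            hi = max(hi, self.his[j - 1])
        # Replace the absorbed run (possibly empty) with the single merged interval.
        self.los[i:j] = [lo]
        self.his[i:j] = [hi]

    def intersects(self, lo, hi):
        # First stored interval whose end is >= lo; it overlaps iff its start is <= hi.
        i = bisect.bisect_left(self.his, lo)
        return i < len(self.los) and self.los[i] <= hi
```

Because everything stays merged and disjoint, a plain ordered structure is enough: the intersection test only has to look at one candidate interval.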

There is a part of my algorithm where ph trees might be useful though. I'll have to give it some thought.

I'm kind of shocked that a basic binary tree ended up being so usable. It's a classic for a reason, lol. I guess I saw the intervals and got confused and overcomplicated things....
