this post was submitted on 22 Dec 2025
166 points (97.7% liked)

Technology

Anna’s Archive’s idealism doesn’t quite survive its own blog post

top 16 comments
[–] 6nk06@sh.itjust.works 48 points 2 months ago (1 children)

I love how titles change. Compare it with "Hacktivist named Zuck scrapes Anna's archive to steal more billions."

[–] purplemonkeymad@programming.dev 5 points 2 months ago

I know for sure all the ai companies are on that torrent. Expect new music models.

[–] octoshrimpy@sh.itjust.works 17 points 2 months ago* (last edited 2 months ago) (2 children)

~~186 million. Not 86 million. There's 100 million difference there.~~

https://annas-archive.li/blog/backing-up-spotify.html

This article ~~is trash and~~ did not bother reading the source blog about "why didn't they just grab all songs then?"

Edit: octo can't read on Mondays apparently.

[–] TheTechnician27@lemmy.world 12 points 2 months ago* (last edited 2 months ago)

did not bother reading the source blog

That's rich given the source blog says "86 million music files" the paragraph after saying 186 million unique ISRCs. You apparently read less of it than The Register's writer did.

[–] BrikoX@lemmy.zip 12 points 2 months ago

You are confusing "186 million unique ISRCs" with "86 million music files".

[–] ieatpwns@lemmy.world 11 points 2 months ago

I don’t think there’s any reason to trust Spotify saying they’ve figured out how ppl were scraping. As far as I’m aware I don’t think AA mentioned how they were scraping

[–] morto@piefed.social 9 points 2 months ago (1 children)

I'd love to make a similar thing with google maps and store their high resolution remote sensing images in a georeferenced format, but I don't have enough storage for even 1% of that.

[–] kugel7c@feddit.org 3 points 2 months ago (1 children)

I think Anna's archive actually has a bounty on that or at least a very related task.

So if you've found a way to economically scrape the street view 360° pictures definitely let them know. As far as I understand if you can prove you have a working method for scraping they will do what they can in terms of infrastructure and such to get it done.

The gitlab on which this is explained is down for me atm.

[–] morto@piefed.social 1 points 2 months ago

Thanks for letting me know. I will definitely check it out in my free time after the holidays!

[–] scintilla@crust.piefed.social 6 points 2 months ago (1 children)

Pirating music is genuinely just not worth the effort so I honestly believe them. Probably didn't scrape all of them simply because that's a petabyte of storage or more.

[–] chicken@lemmy.dbzer0.com 3 points 2 months ago (1 children)

with 86 million music files, representing around 99.6% of listens.

I'm not sure exactly how "listens" maps on to total songs, but it sounds like they got almost all of them

[–] scintilla@crust.piefed.social 2 points 2 months ago (1 children)

Over 250 million total so about 1/3.

[–] chicken@lemmy.dbzer0.com 4 points 2 months ago

Oh right. So I guess that means the majority of stuff uploaded to Spotify never really gets listened to.

[–] LiveLM@lemmy.zip 3 points 2 months ago* (last edited 2 months ago)

By not bothering with all the musical chaff in Spotify's catalog, the Anna's Archive team is apparently content to let those less popular songs languish despite their claim to want to avoid focusing on just the most popular artists.

Okay Mr The Register writer, tell me how you would handle creating and wrangling a petabyte-sized torrent?

[–] Shamber@lemmy.world 3 points 2 months ago

So, they are complaining about how Spotify is underpaying and robbing the artists, and they thought, well, the solution would be to dump all their music online for free?

[–] hal_5700X@sh.itjust.works 3 points 2 months ago

Yes, "preserve culture" 😉