this post was submitted on 20 Apr 2026
interestingasfuck
How do emoji use more data? They're one, maybe two unicode characters?
And thus more data
Than an entire word?
Take "cactus" for example. Each letter in the word "cactus" is one Unicode character, for a total of six. 🌵 is one Unicode character, U+1F335.
Unicode characters are 4 bytes long, so "cactus" takes 24 bytes to transmit, where "🌵" takes 4. Unless something something UTF-8?
You're close. Unicode characters don't imply a number of bytes; the encoding does (most commonly UTF-8). In UTF-8 a character takes as little as one byte or as many as four, depending on the specific character. I don't know about emoji, but I imagine they're in the four-byte range, which would make one emoji the same size as "asdf", which is also four bytes in UTF-8.
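To make the "one to four bytes" point concrete, here is a minimal Python sketch. The sample characters are my own picks for illustration (one from each width class), not anything from the thread apart from the cactus.

```python
# Sample characters chosen for illustration, one per UTF-8 width class.
# Escapes are used so the snippet does not depend on the file's own encoding.
samples = [
    "a",           # U+0061, basic Latin letter
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001F335",  # U+1F335, cactus emoji
]

# Map each character to the number of bytes its UTF-8 encoding uses.
utf8_widths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, width in utf8_widths.items():
    print(f"U+{ord(ch):04X}: {width} byte(s)")

# Plain ASCII text costs one byte per letter.
asdf_bytes = len("asdf".encode("utf-8"))
print(f"'asdf': {asdf_bytes} bytes")
```

The cactus lands in the four-byte class, the same size as "asdf".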
So I just looked it up: the UTF-8 encoding for the cactus emoji is 4 bytes long, 0xF0 0x9F 0x8C 0xB5,
whereas the basic Latin alphabet is in the 1-byte region.
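If you want to see where those four bytes come from, here is a rough sketch that packs U+1F335 into the standard 4-byte UTF-8 layout by hand and checks it against Python's built-in encoder. The helper name `utf8_encode_4byte` is made up for this example, and it only handles the four-byte range.

```python
def utf8_encode_4byte(code_point: int) -> bytes:
    """Encode a code point in U+10000..U+10FFFF using the 4-byte UTF-8 layout.

    Hypothetical helper for illustration; shorter encodings are not handled.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("this sketch only covers the 4-byte range")
    return bytes([
        0xF0 | (code_point >> 18),           # 11110xxx: top 3 bits
        0x80 | ((code_point >> 12) & 0x3F),  # 10xxxxxx: next 6 bits
        0x80 | ((code_point >> 6) & 0x3F),   # 10xxxxxx: next 6 bits
        0x80 | (code_point & 0x3F),          # 10xxxxxx: lowest 6 bits
    ])

cactus = 0x1F335
manual = utf8_encode_4byte(cactus)
builtin = chr(cactus).encode("utf-8")

print(manual.hex())       # hex digits of the hand-built encoding
print(manual == builtin)  # whether it matches Python's own encoder
```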
So it takes 6 bytes to transmit "cactus" in UTF-8, and only 4 to transmit “🌵”. So any emoji that replaces 5 or more letters is more efficient. 🍆 breaks even with "dick" or "cock", is more efficient than "penis", exactly twice as compact as "eggplant", and more than twice as compact as "aubergine".
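The arithmetic above is easy to check. This is a quick sketch assuming each emoji is a single code point; emoji built from several code points (skin-tone modifiers, joined sequences) take more than 4 bytes, so the break-even point moves. The helper `utf8_len` and the word list are just for illustration.

```python
def utf8_len(text: str) -> int:
    """Number of bytes needed to send `text` as UTF-8."""
    return len(text.encode("utf-8"))

CACTUS = "\U0001F335"    # U+1F335
EGGPLANT = "\U0001F346"  # U+1F346

# (word, emoji that replaces it)
pairs = [
    ("cactus", CACTUS),
    ("eggplant", EGGPLANT),
    ("aubergine", EGGPLANT),
]

# Bytes saved by sending the emoji instead of the word.
bytes_saved = {word: utf8_len(word) - utf8_len(emoji) for word, emoji in pairs}

for word, saved in bytes_saved.items():
    print(f"{word}: {utf8_len(word)} bytes as text, saves {saved} as emoji")
```

Any ASCII word of exactly four letters breaks even, since it is also 4 bytes.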
But that's not what people are doing
They always use a word and an emoji
Yes. To be clear, I meant that the example I gave, where the word was replaced with the emoji, was compression, not the case where they give the word and its emoji. That's as long-winded as possible.