this post was submitted on 03 Apr 2025
Programmer Humor
With ASCII æs the åriginal sin. Can't even spell my name with that joke of an encoding >:(
It's a "joke" because it comes from an era when memory was at a premium and, for better or worse, the English-speaking world was at the forefront of technology.
The fact that English has an alphabet of length just shy of a power of two probably helped spur on technological advancement that would have otherwise quickly been bogged down in trying to represent all the necessary glyphs and squeeze them into available RAM.
... Or ROM for that matter. In the ROM, you'd need bit patterns or vector lists that describe each and every character and that's necessarily an order of magnitude bigger than what's needed to store a value per glyph. ROM is an order of magnitude cheaper, but those two orders of magnitude basically cancel out and you have a ROM that costs as much to make as the RAM.
And when you look at ASCII's contemporary EBCDIC, you'll realise what a marvel ASCII is by comparison. Things could have been much, much worse.
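To put rough numbers on the RAM/ROM point, here is a back-of-the-envelope sketch. The 8x8 bitmap cell and the 40x25 text screen are my own assumptions for illustration, not figures from any particular machine.

```python
# Back-of-the-envelope sizes. The 8x8 glyph cell and the 40x25 screen
# are assumed values for illustration only.
GLYPH_W, GLYPH_H = 8, 8
SCREEN_COLS, SCREEN_ROWS = 40, 25


def bits_per_code(n_chars: int) -> int:
    """Smallest number of bits that can number n_chars distinct characters."""
    return max(1, (n_chars - 1).bit_length())


def font_rom_bytes(n_chars: int) -> int:
    """ROM needed to store one bitmap per character."""
    return n_chars * GLYPH_W * GLYPH_H // 8


def screen_ram_bytes() -> int:
    """RAM needed to hold one character code per screen cell, one byte each."""
    return SCREEN_COLS * SCREEN_ROWS


print(bits_per_code(26))    # bits needed to number a 26-letter alphabet
print(bits_per_code(128))   # bits needed for a 128-entry code like ASCII
print(font_rom_bytes(128))  # bytes of glyph bitmaps for 128 characters
print(screen_ram_bytes())   # bytes of screen memory for the assumed screen
```

Under those assumptions each glyph costs 8 bytes of ROM against 1 byte of RAM per character on screen, so every extra block of letters gets expensive quickly.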
It's a joke because it includes useless letters nobody needs, like that weird o with the leg, and a rich set of field and record separating characters that are almost completely forgotten, etc, but not normal letters used in everyday language >:(
Can you elaborate? Do you mean Q or p? Lol
Q. P is a common character across languages. But Q is mostly unused, at least outside the Romance languages, which appear to spell K that way. But that can be solved by letting the characters have the same code point, and rendering it as K in most regions, and Q in France. I can't imagine any problems arising from that. :)
While we're at it, I have some other suggestions...
Look into the Shavian alphabet
Haha, nicely done. I had to work harder and harder to read it.
Jess. Ai'm still lukking får the ekvivalent åv /r/JuropijenSpelling her ån lemmi. Fæntæstikk søbreddit vitsj æbsolutli nids lemmi representeysjen.
If that's a joke, it's a good one. Otherwise, well, there are a lot of "this letter isn't needed, let's throw it away" proposals, and in most cases they will not work as well as you think.
Yes, I am joking. We probably could do something like the old ISO 646 or whatever it was that swapped letters depending on locale (or equivalent), but it's not something we want to return to.
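For anyone who never saw it in the wild, a minimal sketch of what locale-swapped code points do to text. It assumes the Norwegian/Danish national variant of ISO 646 and only models the six letter positions, treating everything else as US-ASCII, which is a simplification. The `render` helper and the sample line are made up for illustration.

```python
from typing import Mapping, Optional

# Code points that ISO 646 left open for national use, as filled by the
# Norwegian/Danish variant. Only the six letter positions are modelled;
# every other byte is treated as plain US-ASCII (a simplification).
NO_DK_VARIANT = {
    0x5B: "Æ", 0x5C: "Ø", 0x5D: "Å",
    0x7B: "æ", 0x7C: "ø", 0x7D: "å",
}


def render(data: bytes, variant: Optional[Mapping[int, str]] = None) -> str:
    """Render 7-bit bytes as text, applying a national variant if given."""
    table = variant or {}
    return "".join(table.get(b, chr(b)) for b in data)


# Hypothetical line of curly-brace code, stored as 7-bit bytes.
source = b"if (a[i] || b) { return; }"

print(render(source))                 # if (a[i] || b) { return; }
print(render(source, NO_DK_VARIANT))  # if (aÆiÅ øø b) æ return; å
```

Same bytes, different glyphs, and which one you get depends entirely on how the terminal or printer was configured.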
It's also not something we're entirely free of: Even though it's mostly gone, apparently Bulgarian locales do something interesting with Cyrillic characters. cf https://tonsky.me/blog/unicode/
Damn, thanks for that link; earlier today I was telling a non-techy friend about Unicode quirks, and I could vaguely remember that post, but not well enough to remember how to find it. I didn't try very hard because it wasn't a big deal, so the serendipity of finding it via your comment was neat.
That is quite a unique quip. I love the idea of geo-based rendering, every application that renders text needs location access to be strictly correct :D.
I'd go further with the codepoint reduction, and delete w (can use uu instead), and delete k (hard c can take its place).
To unjerk, as it were, it was a thing. So on old systems they'd do stuff like represent æøå with the same code points as {|}. Curly brace languages must have looked pretty weird back then :)
It still is a thing in some fonts: https://blog.miguelgrinberg.com/post/font-ligatures-for-your-code-editor-and-terminal
Took me a while to work out what they were called. Font rendering is hard :(