this post was submitted on 12 Jun 2024
14 points (100.0% liked)

Programmer Humor


Post funny things about programming here! (Or just rant about your favourite programming language.)

founded 6 years ago
all 39 comments
[–] impure9435@kbin.run 4 points 2 years ago (2 children)

The thing I find funniest about this post is that you call this Italian

[–] lseif@sopuli.xyz 2 points 2 years ago (2 children)

how am i supposed to know how italians speak. i've never seen one

[–] thesporkeffect@lemmy.world 2 points 2 years ago (2 children)

They're not real, but they can hurt you.

[–] lseif@sopuli.xyz 1 points 2 years ago

like reverse vampires ?

[–] pewpew@feddit.it 0 points 2 years ago (1 children)
[–] lars@lemmy.sdf.org 1 points 2 years ago

That’s right! None of us knows how Italians can speak in the dark 🤌

[–] jballs@sh.itjust.works 1 points 2 years ago (1 children)

From my experience, they speak mostly with their hands

[–] theterrasque@infosec.pub 3 points 2 years ago (1 children)

🫰🤙🫵👌✊🫳🫸🤲🤌

[–] stingpie@lemmy.world 3 points 2 years ago (2 children)

This might be happening because of the 'elegant' (incredibly hacky) way openai encodes multiple languages into their models. Instead of using all character sets, they use a modulo operator on each character, to make all Unicode characters represented by a small range of values. On the back end, it somehow detects which language is being spoken, and uses that character set for the response. Seeing as the last line seems to be the same mathematical expression as what you asked, my guess is that your equation just happened to perfectly match some sentence that would make sense in the weird language.

[–] PlexSheep@infosec.pub 2 points 2 years ago (1 children)

Do you have a source for that? Seems like an internal detail a corpo wouldn't publish

[–] stingpie@lemmy.world 1 points 2 years ago

Can't find the exact source (I'm on mobile right now), but the code for the GPT-2 encoder uses a UTF-8-to-Unicode lookup table to shrink the vocab size. https://github.com/openai/gpt-2/blob/master/src/encoder.py

[–] NeatNit@discuss.tchncs.de 1 points 2 years ago (1 children)

I suppose it's conceivable that there's a bug in converting between different representations of Unicode, but I'm not buying any of this "detects which language is being spoken" nonsense or the use of character sets. It would just use Unicode.

The modulo idea makes absolutely no sense, as LLMs use tokens, not characters, and there's soooooo many tokens. It would make no sense to make those tokens ambiguous.

[–] stingpie@lemmy.world 1 points 2 years ago

I completely agree that it's a stupid way of doing things, but it is how OpenAI reduced the vocab size of GPT-2 & GPT-3. As far as I know (I have only read the comments in the source code), the conversion is done as a preprocessing step. Here's the code for GPT-2: https://github.com/openai/gpt-2/blob/master/src/encoder.py I did apparently make a mistake, as the vocab reduction is done through a LUT instead of a simple mod.
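
For anyone who doesn't want to open the link: here is a minimal sketch of that kind of byte-to-character lookup table, modeled on the `bytes_to_unicode` helper in the linked encoder.py. Treat the details as an assumption and check the repo for the real thing; `to_printable` and the sample string are made up for illustration.

```python
def bytes_to_unicode():
    """Map each of the 256 byte values to a distinct printable character."""
    # Bytes that are already printable characters keep their own code point.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = keep[:]
    code_points = keep[:]
    shift = 0
    for b in range(256):
        if b not in keep:
            # Whitespace and control bytes are moved to code points from 256 up.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(cp) for b, cp in zip(byte_values, code_points)}


def to_printable(text, table):
    """Hypothetical helper: UTF-8 encode the text, then map every byte."""
    return "".join(table[b] for b in text.encode("utf-8"))


table = bytes_to_unicode()
print(to_printable("2 + 2", table))  # spaces come out as a stand-in character
print(to_printable("Ⰰ", table))      # one Glagolitic letter becomes three characters
```

In this sketch the table is one-to-one over the 256 byte values, so no two bytes end up sharing a character.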

[–] Phoenix3875@lemmy.world 2 points 2 years ago (1 children)

"Let me simplify it:" *proceeds to print the same expression*

[–] ChanchoManco@lemm.ee 2 points 2 years ago* (last edited 2 years ago) (1 children)

Typical AI behavior

Edit: and then it will gaslight you if you say the answer is the same.

[–] driving_crooner@lemmy.eco.br 1 points 2 years ago

Fucking hate when it does that.

You are repeating the same mistake.

I'm sorry for repeating the same mistake, here's a new solution with corrections *proceeds to write the exact thing it was already told was wrong*

[–] XEAL@lemm.ee 1 points 2 years ago (1 children)

Ah, I see you're using FartGPT instead of ChatGPT

[–] Blyfh@lemmy.world 1 points 2 years ago

French pronunciation intensifies

[–] ICastFist@programming.dev 1 points 2 years ago

Title mentions speaking italian

Not a single hand gesture anywhere

I've been duped

[–] Annoyed_Crabby@monyet.cc 1 points 2 years ago

That's not Italian, that's obviously Unown

[–] abrahambelch@programming.dev 1 points 2 years ago (1 children)

Which language uses these signs? It truly looks like some kind of alien language

[–] chapapa@discuss.tchncs.de 2 points 2 years ago* (last edited 2 years ago) (1 children)

Glagolitic script. Oldest known Slavic alphabet according to Wikipedia.

[–] Vitaly@feddit.uk 1 points 2 years ago

It looks so badass. I could have been using that script now because I'm Ukrainian, but instead I have Cyrillic script, which is so boring

[–] unreachable@lemmy.world 1 points 2 years ago
[–] Redex68@lemmy.world 1 points 2 years ago

Damn, wild Glagolitic script found. I didn't even realise it was in the Unicode standard.

[–] iAvicenna@lemmy.world 1 points 2 years ago
[–] RacoonVegetable@reddthat.com 1 points 2 years ago

I felt that when he said *83h400+93)*38hpfhi0

[–] QuazarOmega@lemy.lol 0 points 2 years ago (1 children)

You may not understand, but we do.
Questo segreto rimarrà custodito gelosamente dalla stirpe italica. ◉‿◉

[–] Iheartcheese@lemmy.world 0 points 2 years ago (1 children)
[–] robigan@lemmy.world 0 points 2 years ago (1 children)

How about go die in a hole?

[–] Vitaly@feddit.uk 0 points 2 years ago (1 children)

Kind of looks like the Georgian writing system, but I'm not sure

[–] Allero@lemmy.today 0 points 2 years ago (1 children)

No, this is Glagolitic script, an alternative to Cyrillic. It was mostly used in old Slavic scriptures and was later replaced by Cyrillic and Latin.

Most Slavs themselves don't know how to read this

[–] TwilightKiddy@programming.dev 1 points 2 years ago

It's a dead script that was not that common in the first place. In Kievan Rus' it was so little known that it was even used as a form of encryption from the 11th to the 16th century. It is also very different from modern Cyrillic. So, saying "most Slavs don't know how to read it" is a bit of an understatement. No one knows how to read it, apart from some linguists and overzealous Witcher fans.