mgorny-nyan (he) :autism:🙀🚂🐧 /
npub1xcf…2zan
2024-06-25 15:47:27

Let's talk about #diacritics in #Polish.

There are 9 letters with diacritics. These are:

ą, ć, ę, ł, ń, ó, ś, ź, ż.

They are generally transliterated into ASCII by dropping the diacritics, which leaves the corresponding basic Latin letters:

a, c, e, l, n, o, s, z, z.
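
A minimal Python sketch of that transliteration. It uses an explicit translation table rather than Unicode decomposition, because "ł" has no canonical decomposition and would survive an NFD-and-strip approach unchanged. The names `STRIP_TABLE` and `strip_diacritics` are made up for this example.

```python
# Polish letters with diacritics (lower and upper case), paired position by
# position with the basic Latin letters that replace them.
POLISH = "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ"
ASCII = "acelnoszzACELNOSZZ"
STRIP_TABLE = str.maketrans(POLISH, ASCII)


def strip_diacritics(text: str) -> str:
    """Replace every Polish diacritical letter with its basic Latin letter."""
    return text.translate(STRIP_TABLE)


print(strip_diacritics("Zażółć gęślą jaźń"))  # Zazolc gesla jazn
```

Note that both "ź" and "ż" end up as "z", so the mapping cannot be reversed.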

In lexical ordering, each of them is placed right after its respective basic Latin letter, i.e. the Polish alphabet is:

a, ą, b, c, ć, …, z, ź, ż.
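
Plain code point sorting gets this order wrong, because all the diacritical letters have higher code points than "z". Here is a small sketch of a sort key that follows the alphabet instead. The full 32-letter string is filled in from general knowledge (the list above is abbreviated), and `ALPHABET`, `RANK`, `polish_key` and the sample words are made up for illustration. In real code, the standard `locale` module with a Polish locale installed, or an ICU binding, would probably be the more robust route.

```python
# The 32 letters of the Polish alphabet in dictionary order
# (assumption: q, v and x are treated as foreign and sorted last).
ALPHABET = "aąbcćdeęfghijklłmnńoóprsśtuwyzźż"
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def polish_key(word: str) -> list:
    """Sort key: rank each letter by its position in the Polish alphabet.

    Characters outside the alphabet sort after all Polish letters,
    ordered among themselves by code point.
    """
    return [RANK.get(ch, len(ALPHABET) + ord(ch)) for ch in word.lower()]


words = ["żaba", "zebra", "źrebak", "ćma", "cena", "dom"]
print(sorted(words))
# ['cena', 'dom', 'zebra', 'ćma', 'źrebak', 'żaba']
print(sorted(words, key=polish_key))
# ['cena', 'ćma', 'dom', 'zebra', 'źrebak', 'żaba']
```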

As you can see, there are actually two diacritical variants of "z". In the commonly used "programmer's" Polish keyboard layout, every character except "ź" is obtained by combining AltGr with its basic Latin letter; "ź" is moved to the next key, "x". That is:

AltGr + a → ą
AltGr + c → ć

AltGr + x → ź
AltGr + z → ż
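
The layout boils down to a small lookup table. The sketch below spells it out, with the entries not listed above filled in from the "AltGr plus the basic Latin letter" rule. `ALTGR` and `altgr` are made-up names, and passing an uppercase letter stands in for holding Shift as well.

```python
from typing import Optional

# AltGr combinations of the "programmer's" layout: every letter sits on its
# basic Latin key, except "ź", which lives on "x" because "z" is taken by "ż".
ALTGR = {
    "a": "ą", "c": "ć", "e": "ę", "l": "ł", "n": "ń",
    "o": "ó", "s": "ś", "x": "ź", "z": "ż",
}


def altgr(key: str) -> Optional[str]:
    """Return the Polish letter for AltGr + key, or None if the key is unmapped.

    An uppercase key models AltGr + Shift + key.
    """
    produced = ALTGR.get(key.lower())
    if produced is None:
        return None
    return produced.upper() if key.isupper() else produced


print(altgr("x"), altgr("z"), altgr("A"))  # ź ż Ą
```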

Now, many on-screen keyboards don't actually follow that design and place both "ź" and "ż" under "z". I personally find this annoying, and for my phone, I've specifically chosen a keyboard that places "ź" under "x".

Another curious problem is that losing the diacritics sometimes causes collision with another word. Probably the most famous example is the sentence:

"Łaski mi nie robisz" → "It's not like you are doing me any favors"
"Laski mi nie robisz" → "You're not giving me a blowjob"
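
Such collisions are easy to detect mechanically: strip the diacritics and see which distinct words land on the same ASCII form. A rough sketch follows. The `find_collisions` helper is made up for this example, and the extra pair "sąd" (court) / "sad" (orchard) is my own addition to the sample.

```python
from collections import defaultdict

# Same diacritic-dropping table as in plain ASCII transliteration.
STRIP_TABLE = str.maketrans("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ", "acelnoszzACELNOSZZ")


def find_collisions(words):
    """Group words by their ASCII form; keep groups with 2+ distinct words."""
    groups = defaultdict(set)
    for word in words:
        groups[word.translate(STRIP_TABLE)].add(word)
    return {
        ascii_form: sorted(group)
        for ascii_form, group in groups.items()
        if len(group) > 1
    }


sample = ["łaski", "laski", "sąd", "sad", "dom"]
print(find_collisions(sample))
# {'laski': ['laski', 'łaski'], 'sad': ['sad', 'sąd']}
```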

Some people have tried to do better than plain ASCII transliteration. I've mentioned before that some "Western European" layouts include "ó", so you may use that. Back in the day, a certain Polish version of Heroes of Might and Magic II (I think) used "t" in place of "ł". Then, you could try combining some characters on a typewriter (here I recommend playing with the Typewriter simulator by Marcin Wichary (npub1478…y4ag): https://shifthappens.site/typewriter/).

Oh, and there's also the possibility of writing "ƶ" instead of "ż" (especially in uppercase, where the dot might make the letter undesirably tall), though some people apparently use "ƶ" in place of plain "z" (to distinguish it from "2").

Phonetics next. I'm not going to go over the whole Polish phonetics, because I have neither the space nor the skills to do that. But let's go over the diacritical letters and consider their phonetic equivalents.

ą, ę → nasal o, e

They are phonetically similar to "om", "on" and "em", "en" respectively, which sometimes leads to spelling errors. For example, "romb" /rɔ̃mp/ (= rhombus) and "rąb" /rɔ̃mp/ (= chop!) sound the same.

ć, ń, ś, ź → apparently these are palatal versions of c, n, s, z, whatever that means (I don't think these sounds are possible in English)

They are phonetically similar to some pronunciations of "ci", "ni", "si", "zi" (but not all, enjoy the good spelling system). For example, the "ci" in "cień" /ʨ̑ɛ̇̃ɲ/ [= shadow] is pronounced the same as "ć" in "ćma" /ʨ̑ma/ [= moth], but "ci" in "cis" /ʨ̑is/ [= yew] is more like "ć" + explicit "i", and then you have "cirrus" with hard "c" and "i" /ˈt͡sir.rus/.

ł → well, it's the Polish /w/ ("w" is the Polish /v/)

Sometimes "u" is actually read as "ł", e.g. in the onomatopoeia "miau" (= meow). In some Internet slang, people actually replace "ł" with "u"… to look cuter, I guess?

ó → it's actually the same as "u" (it's called "closed u")

Apparently, historically it was a long "o", but eventually transformed into plain "u". It's another spelling nightmare for school pupils, seemingly randomly spread between words such as "bród" /brut/ (ford) and "brud" /brut/ (filth).

ż → something like the "si" in "vision"

For fun, in some words this sound is also written as the digraph "rz". Again, pupils can enjoy distinguishing between "żyć" /ʐɨt͡ɕ/ (to live) and "rzyć" /ʐɨt͡ɕ/ (arse).

Well, there's much more where that came from (say, "ch" vs. "h"), but why am I talking about this particular aspect of diacritics? Because a curious (but horrible, just don't do that) idea of transliterating diacritical letters is to replace them with their phonetic equivalents, or approximations. So, instead of:

Zażółć gęślą jaźń

one could write:

Zarzuuci gensilom jazini

Except that nobody could read this properly, and also it just occurred to me that I've mapped both "ł" and "ó" into "u". Well, let's pretend you didn't see that.
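
For the record, here is the mapping reconstructed from that example as a Python sketch (lowercase letters only; `PHONETIC` and `phonetic_ascii` are made-up names). It also makes the "ł" / "ó" clash visible, which is one more reason not to do this.

```python
# Phonetic approximations as used in the example sentence.
# Both "ł" and "ó" map to "u", so the result cannot be decoded reliably.
PHONETIC = str.maketrans({
    "ą": "om", "ę": "en",                         # nasal vowels
    "ć": "ci", "ń": "ni", "ś": "si", "ź": "zi",   # palatal consonants
    "ł": "u", "ó": "u",
    "ż": "rz",
})


def phonetic_ascii(text: str) -> str:
    """Replace lowercase Polish diacritical letters with phonetic spellings."""
    return text.translate(PHONETIC)


print(phonetic_ascii("Zażółć gęślą jaźń"))  # Zarzuuci gensilom jazini
print(phonetic_ascii("ł") == phonetic_ascii("ó"))  # True
```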

https://en.wikipedia.org/wiki/Polish_alphabet#Letters:_aspect,_name,_value
Author Public Key
npub1xcf8c45mvdddthcrfzdh066wlrp5t0hqy9kr27ey02h2n9vsk2rqnt2zan