The replies to this pinned tweet index my threads on language and linguistics—dealing mostly with sinograms, historical phonology, and etymology.
And occasionally Pokémon.
Languages (focus on East Asia): 🇨🇳🇹🇼🇭🇰🇰🇷🇰🇵🇯🇵🇻🇳 etc.
My book won't be out until January, but right now there's a sale going on at UW Press that applies to pre-orders! If you enter the code WARM24 into the Discount Code field at check-out, you get a whopping 40% off and free shipping within the United States.
Link in thread 1/3
1/ This is the river that divides the Korean peninsula from continental East Asia. It runs along the current border between North Korea and the People’s Republic of China.
What is its name? Depends on which side of the river you are on.
Let’s talk about the commonly occurring Chinese character 得. You probably haven’t thought too much about it because it’s so familiar. But there is something odd about it. And if you dig into the oddity a bit, you discover that it’s got an unusual and intriguing history.
Here’s something I’ve wanted to do for a while: a 🧵 on Pokémon names.
I’ll be looking at Japanese, Chinese, and Korean components in the English names. We’ll touch on a few other languages, too! 🇬🇧🇯🇵🇰🇷🇰🇵🇨🇳🇹🇼🇭🇰🇩🇪🇪🇸🇫🇷 etc.
🎉 Happy New Year! 🥳
Did you know? All three basic Mandarin pronouns have irregular historical developments.
More interesting: All three give the appearance of being frozen in time — preserving ancient pronunciations.
Let’s check it out!
I recently wrote a thread in which I emphasized that spoken Cantonese 🇭🇰 and spoken Mandarin are distinct, non-mutually intelligible spoken languages. (They are two of the several dozen distinctly spoken languages that make up the Chinese language family.)
🧵
🧵-ing
As an English speaker learning Mandarin Chinese, it took me a long time to understand the difference between the constructions
1) (正) 在 V (呢) (zhèng) zài V (ne)
and
2) V-着 V-zhe (also written 著)
1/
This tweet and post unfortunately reflect a fundamental misunderstanding of the purpose of Unicode.
What anyone subjectively believes is necessary or unnecessary is beside the point. What is attested should be encoded. The character already exists.
1/2
Fish-in-fish matryoshka sinoglyph: Egas Moniz-Bandeira on Twitter/X: It's cute, clever, fun, but do the Chinese need it as part of their bloated (!) writing system? Does Unicode need this inessential / nonessential / unessential sinoglyph as part of the…
They are obviously similar, true, but only about as similar as French is to Spanish. The term “dialect” is an unfortunate misnomer, and when understood with its meaning as a technical term in the field of linguistics it is completely inappropriate. Cantonese is a language.
12/
Jiang is an ordinary, common syllable in Mandarin Chinese. It’s part of so many words, and is the pronunciation of dozens of characters.
There’s Chángjiāng ‘Yangtze River’, jiǎnghuà ‘to talk’, and jiàngyóu ‘soy sauce’, for example.
But incredibly, there is no second-tone jiáng.
I know! How about a thread on Written Cantonese? Yes, it’s the written language that is widely used by millions of people but largely invisible outside its community of users!
2/
Have you ever noticed that there is something strange about the Mandarin word for gas, wǎsī? Looking at the written form 瓦斯, it just doesn’t seem like a normal Chinese compound word. Let’s see if we can figure out what’s going on. 加油! 1/
This is part of a sign in a station on Line #4 of the Seoul subway system.
It’s trilingual.
Um, it is trilingual, right? Or … is it?
Take a moment. Think about it.
It’s in three different scripts, that’s for sure. But is it in three different languages?
At long last, Part 2 of this thread. We’re thinking about how much we could reconstruct of late 20th-century spoken Cantonese from a vantage point 1,000 years in the future ... if this dictionary were our only available source of information.
After a while I learned what was really going on. It wasn’t just that spoken Cantonese was a distinct language from spoken Mandarin. They are not mutually intelligible— not just because of pronunciation, but because of differences in vocabulary, morphology, and syntax.
11/
👋 Hey-hey! After a long summer delay, it’s time for Part 2 of this 🧵 on how medieval Chinese linguistic structure interacts with Tang poetic form — and on how an understanding of that structure can deepen our appreciation of the poetry.
Let's talk about meter! 👏👏👏👏
A short thread on what I learned from this book, Yóuzhèngzhì Luómǎ Pīnyīn 郵政制羅馬拼音 (Postal Romanization), published in 1961 by the Directorate General of Posts in Taiwan.
(Check out the prices in NT$, US$, and HK$.)
1/ Does all this historical linguistics stuff I keep posting about have any actual application—aside from being inherently interesting? 🙊
A thread on how linguistics can—and should—inform our understanding and appreciation of ancient Chinese poetry.
Let's go! ⏩
I’ve worked up a minute-long video recitation of a brief passage from the 3rd-century BCE Shāng Jūn Shū 商君書 (Book of Lord Shang) to try to give a feel for what the language might have sounded like around the time these words were first written. 1/
1/ In this thread I’m going to talk about a highly unusual syllable gap in Standard Spoken Chinese, aka Modern Standard Mandarin, which is based on (but not identical to) the pronunciation of the Beijing variety of Mandarin.
A seemingly bizarre editorial error in a Taiwanese children’s book has a lot to tell us about the history of Chinese characters as they’ve traveled from China to Japan—and back again. Let’s dive in and take a look. 1/40
This kind of diglossia in itself isn’t so unusual. A large portion of the human population writes in a language significantly distinct from their spoken tongue. Swiss Germans write High German. Speakers of various regional Arabics write in Modern Standard Arabic.
21/
It’s because the written sentence isn’t in Cantonese. Sure, the reading pronunciations of the characters are in Cantonese, but this isn’t something Cantonese speakers would ever say in normal conversation. It's a Mandarin sentence disguised in a cloak of Cantonese phonology.
20/
I like thinking about familiar things that turn out to have surprising back stories. The modern Chinese second-person singular pronoun 你 (Mandarin nǐ, Cantonese nei5 or lei5) is one of those things. The history of the spoken word and the written character contains surprises.
An additional code point in Unicode doesn’t consume resources or force anyone to use it, and it creates affordances for scholars and others.
One person’s bloat is another’s example of the endless inspirational variety of the literary production of our species.
2/2
But it wasn’t just that I had encountered a different spoken language. It was something I hadn’t known before: spoken Cantonese could be written down. And Written Cantonese therefore differed from Standard Written Chinese in vocabulary, morphology, and syntax.
13/
Let’s talk about radicals in Chinese characters, like 虫 in 蚊, and why they don’t work the way that you think they work.
This discussion will take us outside of China and into parts of the historical “sinographosphere”: 🇰🇷🇰🇵🇯🇵, and especially 🇻🇳.
🧵
I’m using this pinned tweet to keep track of my threads on language and linguistics—mostly focusing on sinograms, historical phonology, and etymology.
Occasionally Pokémon.
Languages: 🇨🇳🇹🇼🇭🇰🇰🇷🇰🇵🇯🇵🇻🇳 etc.
I was familiar with the oft-repeated maxim that “The Chinese dialects are spoken differently but all are written the same.”
And I even knew that it was at best half-true, because Modern Standard Written Chinese is in fact a written form of spoken Mandarin.
4/
Let’s do a little thought experiment about Cantonese (Gwong²dung¹waa² 廣東話).
To be precise: A historical-linguistic thought experiment about modern-day Cantonese.
Ready to expand your mind?
🧠🧐
Ready to solve a puzzle?
This thread features a Chữ Nôm graph with an unusual structure.
On our journey we will bump into the Portuguese word for ‘moon’ and discover some 17th-century Middle Vietnamese sounds that are now lost. 🇻🇳🇵🇹
1/🧵
And it turns out that nobody invented it, nobody standardized it, and nobody taught it (at least, not until recently). It all developed more or less organically.
24/
My first visit to Hong Kong, as a young American with three years of college-level Chinese language study under my belt, was a bit of a shock. Nobody had prepared me for the reality of Written Cantonese. I didn’t know it existed.
(I love Hong Kong, by the way.)
3/
So how does Cantonese writing work? Let’s explore it starting with this panel from the delightful McMug (麥嘜 Mak6 Mak1) family of cartoons by Bliss. The main characters are a pair of kindergarten-age pigs growing up in Hong Kong.
31/
Little kids growing up in Hong Kong and Guangzhou couldn’t just write down the sentences in their heads. They had to express those ideas with different words and different syntactic structures.
15/
You should always beware of claims made about language or writing that rely on showing you things out of context.
Like: "Chinese writing is so hard! You have to be able to distinguish 日 and 曰!"
Here is a group of four words in four different languages.
Japanese netsu [netsɯ] ‘fever’
Cantonese jit6 [jiːt²] ‘hot, fever’
Korean yŏl [jʌl] ‘fever, heat’
Mandarin rè [ɹɤ⁵¹] ‘hot, heat’
I really like this group of words.
Question: Why do I like these words so much?
By virtue of being native speakers of Cantonese, and knowing the meanings and pronunciations of the characters as used in Standard Written Chinese, Cantonese speakers could puzzle out Written Cantonese rather effortlessly, even on first exposure.
25/
Humans are really really good at isolating either the conventional meaning or the conventional pronunciation of Chinese characters, and then repurposing those same characters to write words in another language.
29/
Let’s explore the answer to this sign puzzle and what it can teach us about how Chinese characters can be (and have been!) repurposed to write other languages. 🧵🇰🇷🇬🇧
Since my last thread was so long, let’s do something short and sweet today.
It’s a puzzle of sorts—a puzzle with a purpose, in service of exploring a larger topic.
What does this sign say?
1/4
Formal writing, taught in schools and used in newspapers, essays, government documents, and so on, was Modern Standard Written Chinese. It was in a way a second language to its users.
14/
So how do Cantonese speakers learn to read Written Cantonese without being formally taught?
I’ve written about this before, in the context of the adaptation of Chinese characters to writing non-Chinese languages:
28/
Standard Written Chinese is taught and learned so early in schools in Cantonese-speaking regions, and is used so commonly in all aspects of daily life, that it doesn’t feel foreign. It’s just a different register of expression.
22/
But none of that explained the questions I had about *Written Cantonese*. What was it doing all around me? When was it used? Who invented it? How did people learn it? And … why?
And ... could I learn it?
23/
Here’s a contemporary example of the kind of ad that I was seeing everywhere in Hong Kong: on the sides of buses, on the walls of underground metro stations, on posters in a shop window.
#香港真係好靚
What does that say?
7/
I delivered a public lecture on April 3 on the mechanisms of script borrowing underlying the adaptation of Chinese characters to write the vernacular languages of Korea, Japan, and Vietnam.
The Zoom feed was recorded and is available for viewing.
In this thread I’m going to talk about one of my favorite etymologies. The history of this word has got it all: it’s a fascinating tale of multi-lingual and multi-cultural interaction, full of surprises. I'm excited, let's go!
Here’s an earlier McMug cartoon. The drawing style is cruder and the characterization hasn’t quite settled. I love how McMug thinks in Cantonese but writes to his pen-pal in Standard Chinese. It perfectly captures the sociolinguistic functions of the two written languages.
52/
They learn its conventions from observing the community of users around them without needing any formal training. A written sentence like this—“佢而家喺邊喥呀?”—is easily understood to write that spoken Cantonese sentence meaning ‘Where is she now?’.
26/
I will tell you the story of how this quiz came to be.
In telling the story, the answer will be revealed.
Then I'll say something about what I've learned from this exercise (and from you all).
5) I’ll plug here my book Sinography, which is all about describing these adaptation processes and the reasons why they recur over and over again throughout history.
(And: It’s refreshingly inexpensive for a Brill publication.)
59/
2) Cantonese has a well-developed written language. It’s not formally taught, but because it relies on universal principles of script adaptation, native speakers pick it up easily. I’ve talked about these universals before (see #2, #5, and #24 here: )
55/
And that proved to be true in a lot of situations. But I remember a feeling of shock and vertigo my first few days in the city, seeing all around me unfamiliar characters and undecipherable signs—especially advertisements.
6/
To fully understand it, you have to do more even than recognize all the characters. You have to understand the words and the grammar—which means you have to know spoken Cantonese. Having proficiency in spoken Cantonese is a prerequisite to reading written Cantonese.
34/
35/ If I’d started this thread by telling you that the Manchu river name yalu ‘boundary’ was borrowed into Korean as amnok ‘duck-green’, you’d have thought I was nuts. 🥜
But I’m not.
I’m not, right?
/end
“佢而家喺邊喥呀?” is a direct written representation of “Keoi5 ji4gaa1 hai2 bin1dou6 aa3?”.
Unlike “他現在在哪裡?” which is a *translation* of the sentence into Mandarin.
27/
1/ In 1974, archeologists began to excavate sites at the 4th-century BCE capital city of the small, ancient state of Zhongshan 中山, located in modern-day Pingshan 平山 County, Hebei Province. Among the many objects excavated from grave sites were magnificent inscribed bronzes.
And speakers of that other language are really really good at puzzling out the intended meaning, even without training or explanation, because they share the same knowledge base as the writers.
30/
If you then delivered that written sentence to another Cantonese speaker and asked them to read it out loud, you would hear:
“Taa1 jin6zoi6 zoi6 naa5leoi5?”
18/
Last week I tweeted about the Mandarin Chinese names of the letters of the Latin alphabet. I would like to give equal time to an exploration of the 🇰🇷Korean🇰🇷 names of the letters of the Latin alphabet. There are some fascinating mysteries to explore here.
What’s the connection between a 2,000-year-old seal-script character, an early 20th-century collection of translated short stories by Lǔ Xùn and his brother, and the aborted 1977 second-round simplification of Chinese characters?
(Thx to @chowleen for inspiring this thread)
1/
The bottom-most horizontal stroke of 目 detached, floated down, and attached to the top of 寸. That left the top part looking like 日. In other words, 𥃷 became 㝵. The top part of 㝵 isn’t a sun! It’s a cowrie shell that lost its legs and then lost its lowest horizontal stroke.
Last month I presented seven sentences in seven different languages, all written in a form of the Chinese-character script. The challenge was to identify the languages and, if possible, provide a translation.
🅻🄰🅽🅶🆄🄰🅶🅴 🆀🆄🅸🆉
The following sentences are in seven different languages, all written in Chinese-character script (or a modification of it). Can you identify the languages?
Sentences are in thread.
(1/3)
12/ Here is the ordering of consonants found in Hunminjeongeum (I'll abbreviate the title as HMJE from now on).
ㄱ ㅋ ㆁ ㄷ ㅌ ㄴ ㅂ ㅍ ㅁ ㅈ ㅊ ㅅ ㆆ ㅎ ㅇ ㄹ ㅿ
(Start with the image on the right, look for the "ㄱ" at the top, and then proceed right to left.)
So here’s a seemingly simple question with a surprisingly complicated answer: How do you say “6” in Korean?
As native speakers are well aware and as beginning students of Korean quickly learn, there are at least two ways.
1/ In response to my previous thread on poetic meter, several people asked what I think is the best modern Chinese language for reciting Tang poetry.
It’s a great question! And my answer is …
1) Educated, literate speakers of any Chinese language have learned how to read and write Standard Written Chinese, which is based on spoken Mandarin.
But that’s not the same as a written form of their spoken language.
54/
I didn’t expect that I would understand Cantonese speakers, but I assumed that I’d be able to make my way effectively around the city because of my knowledge of Chinese characters and Chinese writing. (Because Standard Written Chinese is Standard Written Chinese everywhere.)
5/
So here are the basic rules for how the Chinese script has been adapted to write spoken Cantonese. There are only two.
Well, three, because Rule 2 has two parts.
Well, four, because there are some characters that don't fit the three rules.
But let's just say two rules.
35/
3/ Later, after I’d become more sophisticated about Chinese and Korean language history, I realized that they are historically the same name: the Mandarin and Korean pronunciations of 鴨綠/鸭绿 meaning ‘duck green’.
Maybe the sign is entirely in Korean, represented in three different written forms: in the Roman alphabet, the Hangul alphabet, and Hantcha (i.e. Chinese characters).
So maybe it’s not trilingual at all, but monolingual and *tri-scriptal*! 🤯
4) The way the Chinese script has been adapted to write Cantonese isn’t so different from the way it’s been adapted to write Korean, Japanese, Vietnamese, and other languages throughout history.
57/
🎉Time for a sinogram quiz!
We’re going to need people with a variety of linguistic abilities and an interest in Chinese characters/kanji/hanca/Hán tự/etc.
This 🧵 is going to be wild, I promise! Lots of languages, lots of crazy-looking characters like this one. 1/
People never say "English writing is so hard! You have to distinguish O from 0, and S from 5, and d from b, and p from q."
Or "English writing is so hard! You have to distinguish 'inveterate' from 'invertebrate'!"