Alien language arrives, but human language is still cool!

The recent sci-fi film Arrival has brought linguistics back into the public eye, with an alien language powerful enough to reshape its speakers’ minds and let them see the future. While linguists off screen are not so sure that alien languages are really like that, our own language capacity is potentially spectacular enough to impress the arriving aliens and inspire a movie on their planet. [1]


Language is something that everyone uses in daily life. It comes so naturally to us that we seldom stop and think about its marvelous intricacy. Seen from a distance, human language is no less a godly design than the physical universe; in fact, it offers us a mental universe. While sci-fi lovers enjoy imagining time travel and parallel universes, these features have always been there in the mechanics of our language. We can talk about things that happened a million years ago and those that might happen in a million years, and we can also propose various possible outcomes in different possible worlds under miscellaneous conditions. Moreover, in a sense, the thousands of different languages spoken in the world are themselves “parallel universes” to one another, where a tiny parametric change may lead to grammatical systems as different as English and Japanese.

Although the word “language” is etymologically “tongue”, the diversity of human language is not restricted to the tongue. While language is part of human nature, rooted in our minds, it can find its way out through various channels (a procedure called “externalization”). Theoretically speaking, every sensory modality can develop its own language form. The auditory modality supports spoken language, the visual modality supports sign language, and the recently discovered haptic language is presumably supported by the somatosensory modality. Yet all these amazingly different language forms eventually lead to the same (or equivalent) meanings in the conceptual-intentional system. We use different languages, but we can understand each other.

There is a recent Cambridge Festival of Ideas short film called Talk with Your Hands: Communicating Across the Sensory Spectrum, which nicely explores the idea of cross-modality language experience. Through a series of interviews, the producers – Craig Pearson, Julio Chenchen Song, Toby Smith – present to the audience how language is experienced in different modalities and briefly introduce the grammatical mechanisms of sign language, which features simultaneous externalization of multiple words, unlike the linear externalization (one word at a time) of spoken language. Although this sensorimotor non-linearity is not as magical as the non-linearity of the alien language in Arrival (which presumably involves a non-linear conceptual-intentional system), it is already powerful enough to endow sign language with a level of flexibility that spoken language can only covet.

As Dr. Matt Davis (neuroscientist) and Dr. Víctor Acedo-Matellán (linguist) point out in the short film, language is the capacity that sets us apart from all other animals, and sign language is not a direct translation or substitute for spoken language, but a completely independent natural language. Indeed, all natural languages are equally well-equipped to express the infinite ideas human beings want to articulate. The Cambridge Shorts communications officer writes: “in just ten minutes, Talk with Your Hands conveys the richness of verbal and non-verbal languages and explores how our senses overlap and merge.”

For the arriving aliens, language is a weapon; for us it is the same. If language is the foundation of human intelligence, then everything that comes with language, including its diversity, flexibility, and creativity, is a direct reflection of intelligence. Can aliens or AIs also possess such a language system? The former, we don’t know; the latter, let’s wait and see.


[1] Picture source:

Singular ‘they’ is the new black

After singular they was declared word of the year for 2015, the word attracted an unprecedentedly strong online reaction. Despite the split opinions seen and heard on and off the Internet, my recent research in Australia found that the use of they as a singular pronoun is gaining approval as a ‘correct’ way to speak.


A total of 277 people across all age groups were asked to judge five sentences as ‘correct’, ‘incorrect’, or ‘I don’t know’, once in a listening task and once in a written task. The results show an interesting contrast between the spoken and written data. While most participants judged singular they as ‘correct’ in spoken discourse, opinion is divided among both native and non-native speakers when it comes to written sentences, as illustrated in Figure 1:


Figure 1: Assessment of Singular They by native and non-native speakers in written discourse

Such an even split across both groups of speakers is perhaps evidence of a linguistic feature in transition, a point where one part of society is less resistant to the change and another part is more so. In other words, singular they stays ‘below the radar’ but is still more or less noticeable.

Interestingly, almost 80% of participants recognised singular they (and its genitive form) as a preference to stay ‘politically correct’. When asked to assess the supposedly correct form of the pronoun in the sentence ‘Each student must write his ID number on the exam’, speakers’ discomfort with the ‘sexist implication’ of the word ‘his’ was prevalent and clearly expressed. Many respondents suggested ‘their’ as a ‘better way to say it’, unless ‘it is a room full of boys’. The use of singular they can then be seen as speakers’ solution to the inherent lack of a singular, gender-neutral third person pronoun in English. The need becomes even more pronounced in the wake of the ongoing social and global effort to advocate for gender inclusion and equality.

It is also quite interesting that speakers’ uncertainty about singular they (manifested in the answer ‘I don’t know’) is most often observed in the middle age groups (18-30 and 31-65), while being completely absent at the two ends of the spectrum (under 17 and 66 plus). How come? I am inclined to hypothesise that speakers who are within their socially active period (working, adult life) are more likely to be flexible in their judgement of what counts as linguistically ‘standard’, possibly due to their ongoing, consistent exposure to various ways of speaking and writing.

As one of my participants neatly summarised, ‘Everyone does that [i.e. using they and its genitive form as singular pronouns], even though they don’t know they do.’  The use of singular they in particular, and the ‘standard form’ of English generally, then perhaps should be seen as a fluid, relative concept rather than a hard and fast set of rules that dictate language use. Many grammar prescriptivists may still disagree, but of course, everyone is entitled to their opinion.

But Chinese IS an SVO language.

Ask Chris: I’m a native speaker of English and currently learning Mandarin but the word order seems to be different from typical SVO languages like English. For example,

‘把书本拿给我’ (give me the books) – SVO should be “拿书本给我”

‘从家里出去’ (leaving from the house) – How about “出去从家里”

‘他在家里睡觉’ (He sleeps at home) – Why is it not “他睡觉在家里”

Many of these sentences in Mandarin put the verb at the end after the object, instead of putting it before the object. So I really wonder if Mandarin Chinese is a real SVO language.

(Attention: This article contains Chinese text. Without proper rendering support, you may see question marks, boxes, or other symbols instead of Chinese characters.)


Chris answers: … Well, I really did not expect that someone whose native language is not Chinese would like to ask me a question. Let’s talk about the issues in a straightforward way, just to make everything clearer. Basically you have two problems in your descriptions – yes, those are your problems. In your first sentence, you mix up the ba structure and the unmarked order of Standard Mandarin Chinese. That is quite common, even for a beginner syntactician who has little or limited knowledge of Standard Chinese; so that is not your fault, and just take it easy. But your second problem is that you do not – or maybe just fail to – differentiate objects and prepositional phrases; that will be a horrible problem. Considering that last year I received a question about the word order of Archaic Chinese and this time I got yours, I would like to provide a somewhat up-to-date summary of the history of word order of Chinese, set in an earlier Chomskyan view called ‘Principles and Parameters’, and drawing on my intuition as a Chinese native speaker.

The most common, unmarked order of the Chinese language, including both Standard Mandarin Chinese and Archaic Chinese, is SVO; that is a historical issue I would like to address before we move to the analysis of Standard Mandarin Chinese. There was once an argument about whether Archaic Chinese had SVO or SOV order. Li and Thompson (1974), for example, hold the opinion that Archaic Chinese should have SOV order, since quite a few of the Sino-Tibetan languages now have that order. Little evidence, however, has been found for that argument; most of the literary records of Archaic Chinese, especially the pre-Qin documents, demonstrate the features of an SVO language rather than an SOV one. Granted, Archaic Chinese has some structures that follow SOV order, but as far as I can tell, none of the structures is fully ‘unmarked’. For example, an unmarked sentence favours nominal structures as its subject and direct object, and no additional particle should be present in the sentence, but those sentences with SOV order either include a particular particle that triggers the SOV structure, or contain a pronoun as their direct object. In a typical unmarked sentence, such as ‘孟子(S)见(V)梁惠王(O)’ (Mencius met (or visited) King Hui of Liang), you can see that both the subject and the object are nouns (proper names), and the order is definitely SVO. The feature of SVO has been inherited by Middle Chinese, and then by Standard Mandarin Chinese today.

As for Standard Chinese, an unmarked sentence, like ‘我今天吃了个苹果’ (I today eat-PAR CLA apple), is also definitely SVO; even for sentences containing a pronoun as the object, like ‘我揍了他’ (I punch-PAR him), the word order is still SVO. Without the presence of a word like ‘把’ (ba) or ‘被’ (bei), and without the movement of the topic, the word order is always SVO; you can never find a sentence presenting an SOV order. This is the common view shared by syntacticians in Europe (Continental and UK); maybe those horrible Americans have some other ideas that I don’t know of.

So that argument answers your final question: yes, Standard Mandarin Chinese is an SVO language.

So how can we interpret your sentences in that way? Before I go further into the field of Chinese, I would like to stray a bit and talk about another language: German.

German main clauses demonstrate a variety of word orders, including SVO, VSO, OVS, SAOV, and even OASV if you pay enough attention to the language (A stands for an auxiliary verb, like ‘did’ or ‘have’ in English). But most commonly it is known as having V2-SOV order; that is, in a main clause, the verb will always be in the second position, while it will be at the end of a subordinate clause. I mention the structure of German, mainly to point out that it is possible for a language to have numerous word orders besides its unmarked order; the world is not made up of English, which only allows a limited number of word orders.

Let’s move back to Chinese then. Example 1 in the question is quite a complicated one involving both a double-object construction and the presence of ba; so it will take me longer and more examples to illustrate the variations in Standard Mandarin Chinese. Firstly, I would like to discuss the double-object construction, which means that the main verb is ditransitive, has three theta-roles, and requires three nouns (or nominal constructions) in a grammatical sentence; the three nouns will be the subject (S), direct object (DO), and indirect object (IO) respectively. For a typical ditransitive verb, like “给” (give), the unmarked word order will be:

(1). 他给了我一本书。
He give-PAR me one-CLA book. – in the form of S-V-IO-DO.

In this case, the word order of Standard Chinese is exactly the same as that of English, which is a typical SVO language: ‘he gave me a book’. The ditransitive verb construction is usually referred to as a VP-shell structure; for those who are interested in how Chomsky and his fellows solve the problem of the theta-role assignment of a ditransitive structure, just type this word into Google and you will get piles of literature.

Then we talk about the application of ba in constructions with a transitive (not ditransitive) verb. The nature of ba is rather complicated and even controversial; here I would like to follow the light verb assumption (happily we have Julio’s post on light verbs; that saves my day), in which the nature of ba is a bit similar (but not equal) to the auxiliary verb in German. A light verb is a verb (of course), but its function is more like that of a particle which leads to a shift of structure within the sentence, e.g. focus, stress, and so on. For a typical transitive verb, like ‘吃’ (eat), the unmarked sentence and the ba-construction sentence express the same meaning:

(2a). 我吃了一个苹果。
I eat-PAR one-CLA apple.  – in the form of SVO.

(2b). 我把一个苹果吃了。
I ba one-CLA apple eat-PAR. – in the form of S-ba-O-V

(2c). *我一个苹果吃了。
*I one-CLA apple eat-PAR. – in the form of SOV. An asterisk indicates that the sentence is ungrammatical to native speakers like me.

Both (2a) and (2b) are grammatical and acceptable to a native speaker; (2c), a pure SOV order, sounds horrible unless uttered with a particular intonation pattern. We can see that the object can precede the verb only when ba is present.

Then we put the two together and turn to example 1 in the question, which is a combination of the double-object and ba-constructions. For the ba-construction of (1), we can list a set for comparison:

(3a). 他给了我一本书。 (a replica of example (1).)
He give-PAR me one-CLA book. – in the form of S-V-IO-DO.

(3b). 他把一本书给了我。
He ba one-CLA book give-PAR me. – in the form of S-ba-DO-V-IO.

(3c). #他给了一本书我。
# He give-PAR one-CLA book me.  – in the form of S-V-DO-IO. I use the hash here because in some Chinese dialects, e.g. Cantonese, the structure is perfectly acceptable, and I also see some of my Hong Kong friends using it in their variety of Standard Chinese; I would like to ignore it here because I’m talking about Standard Chinese, but the sentence is not that ‘standard’.

(3d). *他把我给了一本书。
* He ba me give-PAR one-CLA book. – in the form of S-ba-IO-V-DO.

As for other possible structures, most of them are ungrammatical and unacceptable to native speakers; that includes all the structures in which V is at the very end of the sentence, including but not limited to S-IO-DO-V and S-ba-DO-IO-V. There should be at least one component after the main verb of the sentence for it to be grammatical; in Standard Mandarin Chinese, a double-object construction can never be a pure SOV sentence – which is rather convincing in showing that the DO-V structure in both transitive and ditransitive sentences is a result of triggered movement, rather than a base-generated structure.
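For readers who like to see such judgments laid out as data, here is a minimal Python sketch of the paradigm in (3a) to (3d) plus the two verb-final orders just mentioned. It is an illustration only: the names `JUDGMENTS`, `is_verb_final`, and `grammatical_patterns` are hypothetical, and the two verb-final rows are marked ungrammatical purely on the strength of the claim in the text, not from separate example sentences.

```python
# Acceptability judgments for the double-object patterns discussed above.
# "ok" = grammatical, "#" = dialectal or marginal, "*" = ungrammatical.
# All names here are hypothetical, chosen only for this illustration.
JUDGMENTS = {
    ("S", "V", "IO", "DO"): "ok",        # (3a)
    ("S", "ba", "DO", "V", "IO"): "ok",  # (3b)
    ("S", "V", "DO", "IO"): "#",         # (3c)
    ("S", "ba", "IO", "V", "DO"): "*",   # (3d)
    ("S", "IO", "DO", "V"): "*",         # verb-final, as stated in the text
    ("S", "ba", "DO", "IO", "V"): "*",   # verb-final, as stated in the text
}


def is_verb_final(pattern):
    """Return True if the main verb is the last element of the pattern."""
    return pattern[-1] == "V"


def grammatical_patterns():
    """Return the patterns judged fully grammatical ("ok")."""
    return [p for p, judgment in JUDGMENTS.items() if judgment == "ok"]


for pattern, judgment in JUDGMENTS.items():
    print(f"{judgment:>2}  {'-'.join(pattern)}  verb-final: {is_verb_final(pattern)}")
```

In this small table, no fully grammatical double-object pattern is verb-final, which is all the generalisation above claims; the sketch says nothing about why that should be so.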

For a typical SOV language – unless certain extraposed structures are present (which is too complicated for non-syntacticians) – all objects should be in front of the main verb. At the same time, it is grammatical for an SOV language, like Japanese or German, to exchange the position between IO and DO – a process called ‘scrambling’. Scrambling, as a widely-present feature among OV languages, is impossible in Standard Chinese; this can be another piece of evidence, although the logic is not fully convincing. Sad story.

As for the remaining two examples, ‘从家里出去’ (from home-in go out) and ‘在家里睡觉’ (at home-in sleep), they belong to the problem of prepositional phrases (PP) rather than that of objects. In Archaic Chinese, the position of a PP in relation to a main verb is rather flexible (or in Chomskyan terminology, a “free parameter”), but in Standard Mandarin Chinese, a PP attaching to a VP is more frequently a pre-verbal one. This is definitely interesting – but also more difficult; so if you think the following content may be beyond your grasp (which, actually, is the case for some of my classmates in our linguistics master’s programme), just skip it and jump to the end.

Contemporary syntactic theories, especially the Chomskyan one, assume that any given sentence forms a tree structure, the large tree containing a series of smaller trees, corresponding to phrases. For every phrase, there is a head (the most important word), a complement providing the essential part of the rest of the phrase, and a specifier which is more or less like a modifier. For a VP including a transitive verb, the verb itself is the head, the object is its complement, while the adverbial (e.g. a PP) is its specifier.

In Standard Chinese, most of the time, the specifier is in front of the head, and that structure is less flexible. For instance, an NP (noun phrase) is in the form of AP-N, which is why we say ‘美丽的|姑娘’ (beautiful girl) rather than ‘姑娘|美丽的’ (girl beautiful); this is the same in the construction of the VP, so that we put the PP in front of the VP, and say ‘从家里出去’ rather than ‘出去从家里’, or ‘在晚上|看电视’ (in the evening watch TV) rather than ‘看电视|在晚上’ (watch TV in the evening).

Since we are talking about the specifier but not the complement when we discuss the PP-V construction in Chinese, that structure can never be used to argue for an SOV structure. The example ‘在晚上|看电视’ (in the evening watch TV, in the form of PP-V-O), on the contrary, obviously shows that Standard Chinese is a VO language.
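The head, complement, and specifier terminology above can be sketched as a tiny data structure. The following Python snippet is a simplified illustration, not a real syntactic formalism: the class `Phrase`, the function `linearize`, and its two switches are hypothetical names, and each slot holds a plain string rather than a full sub-tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Phrase:
    """A toy phrase with one head, an optional complement, and an optional specifier."""
    head: str
    complement: Optional[str] = None
    specifier: Optional[str] = None


def linearize(phrase, specifier_first=True, head_first=True):
    """Spell out a phrase as a string under the given ordering choices."""
    if head_first:
        core = [phrase.head, phrase.complement]   # head before complement (VO)
    else:
        core = [phrase.complement, phrase.head]   # complement before head (OV)
    if specifier_first:
        parts = [phrase.specifier] + core
    else:
        parts = core + [phrase.specifier]
    return " ".join(p for p in parts if p is not None)


# The VP from the text: PP specifier, verb head, object complement.
vp = Phrase(head="看", complement="电视", specifier="在晚上")

print(linearize(vp))                    # 在晚上 看 电视  (PP-V-O, the attested order)
print(linearize(vp, head_first=False))  # 在晚上 电视 看  (what an OV setting would give)
```

The point of the two switches is that the position of the specifier and the position of the complement are set independently, so a pre-verbal PP tells us nothing about whether the object precedes or follows the verb.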

I will not further extend my argument into an analysis of Greenberg’s Linguistic Universals, in which he suggests that a VO language always has head-complement order (while Chinese does not); that will be beyond the reach of 99% of my readers, and may even earn me a page in Linguistic Inquiry. I hope that it is clear enough for you to know why Standard Mandarin Chinese is an SVO language rather than an SOV one: basically, what you regard as an OV structure is not fully qualified. Next time try some more delicate examples and you are always welcome to ask me more questions.

Enjoy learning Chinese! All the best, Chris.


You do not need to learn Chinese in order to be a syntactician. Try reading these:

Greenberg, Joseph H. Language universals: With special reference to feature hierarchies. Walter de Gruyter, 2005.

Haegeman, Liliane. Introduction to government and binding theory. 1991.

Li, Charles N., and Sandra A. Thompson. “An Explanation of Word Order Change SVO→SOV.” Foundations of Language, vol. 12, no. 2, 1974, pp. 201–214.

Peyraube, Alain. “On word order in Archaic Chinese.” Cahiers de linguistique-Asie orientale 26.1 (1997): 3-20.

Finnish, Hungarian… and Sumerian?

When you tell non-linguists that you do linguistics, people often like to showcase their lay interest in the subject by throwing out cool facts about languages. Only sometimes the facts are not so factual. A few weeks back, over tea, I was told that

“I mean it’s quite interesting how Finnish and Hungarian are related to Sumerian.”

I managed neither to splutter my tea nor to choke on my scone; I dismissed the claim as an interesting but very obviously false belief, going in the same bin as the likes of ‘some languages make you more intelligent than others’ and ‘modern English was the language of the Roman Empire’.

But a real tea-spluttering, choking-by-scone incident was soon to follow, when the same claim cropped up not as a harmless fantasy but as state-of-the-art research by the Finnish assyriologist Simo Parpola.

In his brand new, two-volume Etymological Dictionary of the Sumerian language, Parpola argues that Sumerian, the ancient language of southern Mesopotamia, is not, in fact, an isolate without any known linguistic relatives, as is conventionally held; rather, it is related to the Uralic languages, spread across northern Eurasia.

Could Finno-Ugric speakers now take ancestral pride in cuneiform? Image credit: hapal.

Eleven years of comparing Sumerian to several major language families revealed, according to Parpola, that Sumerian and the Uralic languages have more in common than the Finno-Ugric languages (a subgroup of the greater Uralic family) among themselves in their sounds, basic vocabulary, pronouns, and numerals. For example, the Sumerian ajja/aj(a) ‘father, grandfather’ resembles the Uralic forms äijä, ajja, aja, and jäj with similar meanings, while engur or angur ‘primeval sea, home to the water deity’ is akin to onkura ‘cave’.

Not surprisingly, the research has caused an uproar in Finno-Ugric linguistic circles, with linguists attacking both Parpola’s methodology and his professionalism as a linguist (what does an assyriologist know about the scientific methods of linguistics?). For example, the Sumerian-Uralic comparisons use whole language families, but in historical linguistics comparison sets are usually set up between individual languages. Juha Janhunen, professor of East Asian languages and cultures at the University of Helsinki, summarizes the linguists’ horror at Parpola’s findings: saying that Sumerian is a closer relative of Finnish than the Finno-Ugric Sami is like saying that humans are not, in fact, that closely related to apes but find themselves rather somewhere between the horse and the donkey.

Human relatives – are horses or monkeys closer? Image credit: Ashley Van Haeften.


Parpola is not alone in his conviction, though. The idea that the humble Uralic languages have grander ancestors has been around since the mid-19th century when the cuneiform script was first studied. It was soon noted that Sumerian was agglutinating – sticking meaningful bits of language together into longer words rather than using separate words to express, for example, location – just like Hungarian. Combine that with a national impetus: at the time, Hungarians were a minority in the mighty Austria-Hungary, and what nicer way to gain some kudos than establishing grand linguistic relatives?

What’s more, with a renewed interest in nationalism, alternatives to the standard Finno-Ugric story still live strong in Hungary: after the national conservative Prime Minister Viktor Orbán came to power in 2010, more funding has been directed to research exploring options such as Parpola’s.

That’s not to say Parpola’s findings are in any way influenced by nationalistic motives. But are his results a victim of linguistic prejudice, wrongly criticized by linguists stuck in their cozy standard theory? Or is Parpola after all just a misinformed assyriologist? If there is even a grain of truth in Parpola’s arguments, much of linguistic history will have to be rewritten – and I will never dismiss my non-linguist friends’ factoids as funny misconceptions again.

An Olympic olá!

I guess I’m not the only one to have succumbed to a bit of addictive late-night Olympics watching this weekend. The British Council has taken advantage of all this Rio enthusiasm to encourage us to learn Portuguese, as part of its campaign this year to get more Brits learning languages. Its survey found that only 2/5 of its sample of around 2,000 UK adults knew that Portuguese was the language spoken in Brazil. Thanks to some cousins who grew up there, I would have passed that one, but I’m not so sure about all the other South American countries (and they call me a linguist!). Of course, Spanish is a safe bet (and indeed, it is an official language in Argentina, Bolivia, Chile, Colombia, Ecuador, Paraguay, Peru, Uruguay and Venezuela), but what about Suriname and Guyana? That would be Dutch and English. And then there’s French Guiana (clue’s in the name).


But hang on, back in Brazil, Portuguese is only half the story. There are in fact 216 languages spoken in Brazil, and many of them – 201 in fact – are indigenous, spoken there long before the Portuguese arrived in 1500. Take for example Apalaí, Arara, Bororo, Guarani, Nheengatu, Terena, Tucano, and Xavante. But there are also scatterings of other European languages, especially German (and a German variety, Pomeranian) and Italian, and in some places these, and indigenous languages, now have co-official status.

Well, you might be thinking, I don’t mind picking up a few phrases through the viewing marathon of the next couple of weeks, but if I’m going to actually make a good stab at learning a language, is Portuguese really one to go for? According to some other research coming from the British Council, it actually comes in at number six in a list of ‘languages of the future’ – ones that will be most useful for the UK in the coming years. And Portugal is a popular holiday destination too (but remember that the Portuguese spoken in Brazil and the Portuguese spoken in Portugal are distinct varieties, with differences in pronunciation, lexicon, and grammar). Besides, the benefits of multilingualism seem to be everyone’s favourite topic these days, so perhaps we don’t need to angst so much about which language we learn but just get on with learning a language. (More seriously, you may have seen both ‘bilingualism-is-the-cure-for-all-ills’ champions and ‘there-ain’t-no-difference-between-mono-and-bilinguals-after-all’ brigades, but this, as Thomas Bak at Edinburgh reminds us, reflects how many complicated factors are involved in multilingualism, as in the life of any human, and is a reason to pursue more scientific research rather than throw black-and-white eggs at each other. However, what can be said with some confidence is that, while speaking more than one language is the norm for a majority of the world’s population, where it is a choice, it often brings great social insight, cultural enrichment, and personal satisfaction.)

CC Cidade_Candidata

But back to Portuguese. Perhaps you’re not in the mood to trip over to Portugal and Brazil any time soon. But I was reminded of a wonderful infographic showing the second most common language (after English) at each tube stop in London. Looking at that, you can see that a trip to Clapham, Stockwell, or Willesden Junction would also give you an opportunity to try out your Portuguese. And then I’ve just read in this month’s Babel about a company that offers, among others, a ‘Brazil & Portugal Tour’… in London! So maybe the British Council’s suggestion isn’t such a bad idea after all.

“By the media criticizes you”?! Bei, ba, and light verbs in Chinese

The languages we speak are never without little surprises that remind us of the complexity and flexibility of our minds. For instance, when I was rewatching an old episode (originally aired in 2005) of my favorite TV show, 100% Entertainment, the other day, I heard one of the hostesses, Barbie Hsu, say:

(1) … hai    bei    mei-ti     ba     ni       ma     de   gou-xue-lin-tou.

———and   BEI    media   BA     you   scold  DE  dog-blood-pour-head

———“…and then you also get harshly criticized by the media.”

The sentence sounds alright, except that it includes both bei and ba. What’s wrong with that? Well, in Mandarin Chinese bei is used in passive sentences to introduce the active subject, and ba functions like a transitive/accusative marker on the object. Their uses are illustrated below (PFV = Perfective, ACC = Accusative):

(2)  a.   Gou-gou       chi        le         ping-guo.

———-puppy           eat       PFV     apple

———-“The puppy ate apple(s).”

       b.  Ping-guo         bei               gou-gou           chi       le.

———-apple              BEI               puppy             eat       PFV

———-“The apple(s) was eaten by the puppy.”

       c.  Gou-gou        ba              ping-guo          chi       le.

———-puppy           BA              apple              eat       PFV

———-“The puppy ate the apple(s).”

(2a) is the neutral way of reporting this event, where the subject is ‘the puppy’ and the object is ‘apple(s)’. In (2b), ‘apple(s)’ becomes the syntactic subject, while ‘the puppy’ is introduced by bei (similar to English by), thus rendering a ‘passive’ sentence. In (2c), the subject is again ‘the puppy’, but the object ‘apple(s)’ takes ba, which explicitly clarifies it as the affected patient of ‘eat’ and entails an agent subject (the eater in this case). That is, when the object is introduced by ba, the verb is unambiguously transitive and active.

So, taking these into consideration, Barbie Hsu’s sentence in (1) is literally something like:

(3)        “and then by the media harshly criticizes you.”

As you can see, it doesn’t work in English. But why is it okay in Chinese? Before answering this question, let’s look at some similar examples, which all suggest that the coexistence of bei and ba is not only okay, but also productive.

(4)    a.     Xiao-yang        bei           lao-hu   ba      tou        yao      xia-lai    le.

————-little-lamb       BEI           tiger      BA      head    bite      down   PFV

————-“The little lamb got its head bitten off by the tiger.”

         b.    Feiwen    bei         lao-shi     ba      man-hua-shu     mo-shou           le.

————-Feiwen   BEI         teacher   BA      comic book       confiscate        PFV

————-“Feiwen had her comic book confiscated by the teacher.”

         c.    Lianlian        bei       xiao-dao         ba       shou     ge-po                 le.

————-Lianlian        BEI     small-knife      BA      hand    cut-broken       PFV

————-“Lianlian got her hand cut by the small knife.”

This pattern is reminiscent of a construction in Japanese called the ‘suffering passive’, as in (5) (TOP = Topic, PASS = Passive, PST = Past, GEN = Genitive).

(5)     a.   Watashi-wa    imooto-ni                   tokee-o            kowas-are-ta.

————I-TOP              younger sister-by     watch-ACC     break-PASS-PST

———–“I got my watch broken by my younger sister.”

          b.  Watashi-wa     tonari-no        hito-ni         ashi-o         hum-are-ta.

————I-TOP              next-GEN       person-by   foot-ACC    step-PASS-PST

———–“I got my foot stepped on by the person standing next to me.”

Again we see the coexistence of an oblique (i.e. non-subject) agent and an accusative patient. Furthermore, in Japanese the passive voice is morphologically marked (-are-). Note that ‘I’ in (5ab) is not a subject, but the discourse topic. Intuitively this also seems to be true for the Chinese sentences in (4), whose English translations should be rendered more accurately as:

(6)    a.   “As for the little lamb, BEI the tiger bit off its head.”

——-b.  “As for Feiwen, BEI the teacher confiscated her comic book.”

——-c.  “As for Lianlian, BEI the small knife cut her hand.”

I have let BEI stay as such, because, despite the similarity, it is not an exact counterpart of English by (as used in passive constructions). Bei is historically verbal (meaning ‘to cover’), which is still reflected in some words and phrases today, as in (7).

(7)    a.  bei-zai, bei-nan “be covered by/suffer from disaster”

      b.  ze bei tian xia/wan shi “benefits cover the entire world/many generations”

Apart from the verbal meaning, bei also has a nominal meaning ‘cover’, which is well preserved today, as in bei-zi ‘quilt’, mian-bei ‘cotton quilt’, among others. Of course, in the ‘passive’ construction, bei no longer has these literal meanings. Nevertheless, it is plausible that it still retains some verbal properties. In other words, what matters here for bei is not the literal meaning, but the syntactic category. Belonging to the verbal category, bei can still join the verbal predicate quite freely and does not have to do this as a complement or adjunct (as the English by-phrase does).

In fact, the patient-introducing ba has a similar status in Chinese. It also originated as a verb (meaning ‘to hold’) and has developed a use as a semantically bleached verbal element. Actually, the verbal use of ba today is even clearer than that of bei, as in (8).

(8)    ba-jiu “hold the alcohol”, ba-men “hold the door”, ba-quan “hold the power”, ba-chi “hold and keep”, ba-shou “hold and guard”, ba-wan “hold and play with”, etc.

Elements like bei and ba are sometimes called light verbs, i.e. they are light in meaning but still belong to the verbal category and perform some verbal functions. Bei is a light verb that introduces the agent or instrument argument, while ba is a light verb that introduces the patient argument into the predicate. They can coexist in Chinese simply because they both exist in the language, as in (9a).

(9)    a.  Chinese:  Hai    bei-mei-ti      ba-ni          ma de gou-xue-lin-tou.

                     and    AGENT-media     PATIENT-you    scold harshly

       b.  English:  And then you[PATIENT] also get harshly criticized by the media[AGENT].

By comparison, English does not have argument-introducing light verbs like bei and ba. So when the verb is inflected for the passive (i.e. when it lacks the accusative-case-assigning layer, which can be conceived of as a null light verb), the original object (‘you’) must be licensed in some other way. In this case it becomes the ‘subject’, and the original subject (‘the media’) is in turn licensed in the prepositional by-phrase, as in (9b). In sum, we can say that bei and by have similar semantic functions but distinct syntactic categories, and that Chinese does not really have an English-type passive construction.

Now you may wonder where the subject (or more exactly the ‘topic’) is in (9a). Well, it is simply omitted because it is clear from the context. When you are scolded, the default topic is clearly YOU! Indeed, (9a) can be recovered as (11), which is exactly what Barbie Hsu means.

(11)   Ni[TOPIC]    hai    bei    mei-ti    ba    ni     ma de gou-xue-lin-tou.

       you          and    BEI    media     BA    you    scold harshly

Why Chinese systematically uses topics as part of the sentence whereas English does not is a whole other story. A relevant fact is that if we omit the topic elements in the Japanese examples in (5), the sentences are also well-formed, as in (12).

(12)   a.  Imooto-ni tokee-o kowas-are-ta.

           “(My) watch was broken by (my) younger sister.”

       b.  Tonari-no hito-ni ashi-o hum-are-ta.

           “(My) foot got stepped on by the next person.”

Of course, bei and ba are not the only two light verbs in Chinese; there are many more, such as yong (introducing an instrument), gei (a casual alternative to bei), jiang (a formal alternative to ba), and so on. Maybe if you join me in watching TV, you will notice that Barbie Hsu and her sister Dee Hsu frequently use other light verbs as well. Language is our most faithful companion, and we can surely discover many marvelous facts about it if we pay a bit more attention and… study some syntax!

Five languages from Spain you never knew existed


Spain, known as the land of sol, siestas and sangría, is less well known for the diversity of its linguistic heritage. Though most people could probably identify Basque and Catalan as languages spoken in Spain, the Iberian Peninsula is home to a number of minority languages and dialects you’ve probably never heard of.

1. Galician
Even though it has 2.4 million native speakers and is Spain’s third most spoken language, it’s surprising how many people have not heard of Galician. That may be because Galicia’s most famous residents – including current Spanish PM Mariano Rajoy, the 20th century writer Ramón del Valle-Inclán, and Spain’s late dictator Francisco Franco – are known for being Spanish, not Galician, speakers. Galician and its neighbour across the border, Portuguese, were originally one and the same language, Galician-Portuguese, a highly prestigious medieval language famous for its lyric poetry. Although Galician then became a low prestige language for many centuries, today it is co-official with Spanish in Galicia and has its own publicly-funded television channel.

2. Aragonese
Aragon, the land of Henry VIII’s first wife and the 18th century painter Francisco de Goya, is also home to the luenga aragonesa, or Aragonese language, which descends from the now extinct medieval language Navarro-Aragonese from North-East Spain. Aragonese has a core of native speakers in Aragon’s remote Pyrenean villages, but is understood by many more people in the surrounding areas, and is mutually intelligible with neighbouring languages such as Castilian Spanish and Astur-Leonese. Aragonese is protected by local laws, and has its own language academy, but, like many of Spain’s minority languages, is still considered endangered by UNESCO.

3. Judeo-Spanish
Judeo-Spanish, also known as Ladino, is a language from Spain that hasn’t been spoken in Spain since 1492, when the Jewish population was expelled from the country by the Spanish monarchs. Though since 2015 their descendants have been able to apply for dual Spanish citizenship, Judeo-Spanish is now mostly spoken in Israel, Turkey and Greece. Because the last time it was used in Spain was over 500 years ago, Judeo-Spanish is a linguistic time capsule, and sounds more similar to Medieval Spanish than modern Spanish. It’s also the only Spanish language to be written in Hebrew script.

Bilingual Spanish-Leonese roadsign (Photo: Iván Martínez Lobo)

4. Leonese
One half of the Astur-Leonese language branch, Leonese descends from the everyday Latin spoken in the geographic area that would become the medieval Kingdom of León. The kingdom’s capital, also called León, or Llión in Leonese, was founded as a military camp and settled by the Roman Seventh Legion. León/Llión, which means ‘lion’ in today’s language, actually comes from the Latin name of the capital’s founders, legio septima gemina, meaning ‘the twin seventh legion’. Despite a flourishing medieval literature, history has not been kind to the llingua llionesa, which is now a UNESCO endangered language that, unlike its other half Asturian, has no official status in Spain.

5. Aranese
Less well known than its sister dialects of Gascon and Occitan in France, Aranese is spoken in the Valley of Aran (Val d’Aran), one of only two areas of Spain on the Northern side of the Pyrenees. Even though it only has around 3,000 native speakers and is used in a small geographical area, Aranese has co-official status in its home region, Catalonia, and is taught as a compulsory subject in schools in the Val d’Aran. You can even pick up some Aranese yourself thanks to the University of Barcelona’s multilingual conversation guide.

This blog post was previously published at the Huffington Post.

Fiction favourites for die-hard linguists

With summer supposedly at its height (calling all semanticists: is the fact that it is July enough to meet the definition of ‘summer’, even if rain, hail, wind, and multiple layers of clothing are featured?) it is time to put down that book on allophony in Gujarati, the processing of possessives in Chhattisgarhi, or whatever linguistic page-turner you’re dipping into. Quite often though, a dedicated linguist cannot stop making language-related observations even when switching off from full-blown research-mode. As a consequence, linguistics literature is generously sprinkled with references to novels, the authors of which were at the time of writing gloriously unaware of their work turning into scientific data. I bring to you the top three works of fiction for linguists.

First up we have Lewis Carroll’s classic Alice’s Adventures in Wonderland – an inspiration for film makers, fantasy lovers, and keen readers of all ages, but also a pragmatist’s wet dream. As Alice plunges down the rabbit hole, she ends up in a world defying the rules of physics and also in something of a nonsensical linguistic wonderland. The Mad Hatter’s tea party is a celebration of not only unbirthdays but also of linguistic rule bending and pragmatic acrobatics. Take this exchange between the March Hare, Alice and the Hatter:

‘Take some more tea,’ the March Hare said to Alice, very earnestly.
‘I’ve had nothing yet,’ Alice replied in an offended tone, ‘so I can’t take more.’
‘You mean you can’t take less,’ said the Hatter: ‘it’s very easy to take more than nothing.’

The reason for Alice’s confusion is that we tend to take ‘more’ to imply that something has already happened; however, as the Hatter points out, technically more is more even if it’s just more than nothing. Pragmatics brain pain anyone?

Having a very non-pragmatic tea party

If you aren’t feeling the strain of mad tea party communication, perhaps you would enjoy some even more strenuous intellectual effort in the form of James Joyce’s Finnegans Wake. The epic stream of consciousness is written in a mixture of real English words, neologisms, and portmanteau, or blended, words. As such, it’s one for morphologists and phonologists; by putting bits and pieces of the right sound combinations together and throwing in some actual endings, you can get something apparently nonsensical, yet very much English-like.

“Loud, heap miseries upon us yet entwine our arts with laughters low!”

Joyce’s tour de force really is a literary gem, but somewhat on the challenging side of summer reads. As my friend Wikipedia kindly puts it, “Finnegans Wake remains largely unread by the general public.”

Very much the opposite is the case with – *drumroll* *gasp* – The Lord of the Rings. Okay, I know many of us got into the trilogy by staring into the dreamy film-version eyes of Legolas, drooling at the hunky figure of Boromir, or (despite female characters being few and far between, my inner feminist would like to point out) admiring the airy wardrobe of the mysterious Galadriel. But as die-hard fans will know (and they will probably roll their eyes at me pointing out the obvious), Tolkien was something of a language geek, and this shows throughout the linguistic landscape of Middle-earth.

“Really enjoying the Sean Bea… sorry, linguistic aspects.” Credit: Jason Parrish.

One of the author’s passions was Finnish, as he wanted to read the Finnish national epic Kalevala in the original. I don’t know if Tolkien ever managed to conquer Kalevala without the aid of translations and dictionaries, but his Elvish language of Quenya was certainly inspired by the sounds of Finnish: Mindon Eldalieva (‘Lofty Tower of Elvish-people’) and Oron Oiolosse (‘Ever Snow-white Peak’) resemble Finnish just as Finnegans Wake resembles English. There’s also a healthy dose of Old English squeezed into proper names: Saruman derives from the root searu- (‘treachery’ or ‘cunning’), while Mordor is rather morbidly based on morthor (‘murder’).

And on that cheerful note, I wish you happy linguisticky reading!