What a lot of relatives you have!

English has a lot of relatives. I don’t mean languages to which it is related, but rather relative clauses. I’m only going to focus here on some so-called restrictive relative clauses. An example is given in (1) (the relative clause is ‘that ate grandma’).

(1) The wolf that ate grandma was in bed.

In (1), the relative clause helps us to identify which wolf we are referring to, i.e. out of all the wolves in context, we are referring to the one that ate grandma. In other words, the relative clause in (1) restricts the referent of the noun modified by the relative clause, in this case wolf.

There are quite a few types of relative clause which can be used to restrict the referent of a noun. Some of them look quite similar to one another but they behave in slightly different ways as we will see.

First of all, there are relative clauses introduced by relative pronouns (who or which) and those introduced by that. Let’s call them wh-relatives and that-relatives respectively.

(2)   a.  The wolf that ate grandma was in bed.

b. The wolf which ate grandma was in bed.

The noun modified by a wh-relative or a that-relative can correspond to a number of different positions inside the relative clause. In (2), for example, the noun wolf corresponds to the subject of ate. However, it could correspond to the object, like in (3), or the object of a preposition, like in (4), as well.

(3)   a.  The wolf that we saw was in bed.

b.  The wolf which we saw was in bed.

(4)   a.  The wolf that Red Riding Hood talked to was in bed.

b.  The wolf which Red Riding Hood talked to was in bed.

c.  The wolf to which Red Riding Hood talked was in bed.

Some people would say (4b) is not correct because it has a stranded preposition, and that (4c) is the correct version. However, we are interested in what English speakers actually do, not what some people think they should do. Interestingly, if we use a that-relative, like in (4a), we have no choice but to strand the preposition! (5) is not even acceptable to English grammar pedants! (* means unacceptable/ungrammatical).

(5)  *The wolf to that Red Riding Hood talked was in bed.

Big Bad Wolf

English also has restrictive relative clauses introduced by neither a relative pronoun nor that. Let’s call these zero-relatives because there is nothing (zero) visible/audible to introduce them. The noun modified by a zero-relative can correspond to an object or the object of a preposition in a relative clause. Some examples are given in (6).

(6)   a.  The wolf we saw was in bed.

b.  The wolf Red Riding Hood talked to was in bed.

So far, zero-relatives look just like wh-relatives and that-relatives except that the relative pronoun or that is missing. However, there is another difference. We saw in (2) that the noun modified by a wh-relative or a that-relative can correspond to the subject of the relative clause. However, this is not possible when the noun is modified by a zero-relative.

(7)  *The wolf ate grandma was in bed.

In (7), the intended meaning is the one where wolf corresponds to the subject of ate. However, (7) is unacceptable/ungrammatical. To express this meaning, we would need to use a wh-relative or a that-relative instead.

We have seen that a noun modified by a zero-relative cannot correspond to the subject of a relative clause. There are other restrictive relative clauses where the modified noun can only correspond to the subject. These are the so-called reduced relatives.

(8)   a.  The wolf eating grandma has such big ears, eyes and teeth.

b.  The person eaten by the wolf was grandma.

They are called reduced because they seem to be reduced versions of wh-relatives or that-relatives.

(9)   a.  The wolf which/that is eating grandma has such big ears, eyes and teeth.

b.  The person who/that was eaten by the wolf was grandma.

However, various pieces of evidence suggest that the examples in (8) are not the results of bits of (9) being deleted. For example, there are acceptable reduced relatives with no acceptable ‘full’ counterpart. Therefore, reduced relatives are not literally reductions of full relatives.

(10)  a.  The creature resembling grandma is a wolf.

b. *The creature which/that is resembling grandma is a wolf.

Reduced relatives in English are formed using the participle forms of the verb: either the present participle, e.g. eating in (8a), or the passive participle, e.g. eaten in (8b). Even though the passive participle looks like the past participle in English, the evidence tells us that reduced relatives can be formed using the passive participle, not the past participle.

(11)  a.  The wolf has eaten grandma.

b. *The wolf eaten grandma is in bed.

In (11a), eaten is a past participle (not a passive participle). If reduced relatives were formed using the past participle and if the noun modified by a reduced relative can only correspond to the subject of the relative clause, we would expect (11b) to be acceptable. However, it isn’t. This, among other things, tells us that it is the passive participle that is used to form this type of reduced relative.
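For readers who like things tabulated, the pattern so far can be written down as a small lookup table. This is only a rough sketch of the judgements reported in examples (2) to (11); the labels (`wh`, `stranded_prep_object` and so on) and the function name are my own shorthand, not standard terminology.

```python
# Which position inside the relative clause the modified noun may correspond to,
# going only by the examples in this post. All labels are illustrative shorthand.
ALLOWED = {
    "wh":      {"subject", "object", "stranded_prep_object", "pied_piped_prep_object"},
    "that":    {"subject", "object", "stranded_prep_object"},
    "zero":    {"object", "stranded_prep_object"},
    "reduced": {"subject"},
}

def is_acceptable(relative_type: str, position: str) -> bool:
    """Return True if the post's examples treat this combination as acceptable."""
    return position in ALLOWED[relative_type]

print(is_acceptable("that", "pied_piped_prep_object"))  # example (5): False
print(is_acceptable("zero", "subject"))                 # example (7): False
print(is_acceptable("reduced", "subject"))              # example (8): True
```

The table makes the asymmetry easy to see: zero-relatives and reduced relatives divide the positions between them, while wh-relatives cover everything.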

There is a lot more to say, and we haven’t even mentioned all the types of relative clause that English has to offer! But that must wait for later. If I say any more at present, I fear you might start to envy grandma.

(No grandmas were harmed in the writing of this blogpost… well, one was eaten, but the rest are fine)

Domains within sentences

One idea that has emerged in modern linguistics is that sentences can be divided into different parts or “domains”, each with its own separate function.

At the core of the sentence are the verb and its “participants” – the nouns or pronouns associated with it. This is called the classification domain, where basic properties of the sentence are classified. However, there’s nothing here about information like when the event took place, or even if it happened at all. You might like to think of the content of this domain as something more like the logical representation love(Lucy, Chris), meaning (roughly) “an event of loving with participants Lucy and Chris” – the representation says nothing about whether the event takes place in the past, the present, the future or not at all.


Next we get the anchoring domain, where the event is anchored in the world in some way. In English this domain precedes the classification domain and involves things like tense (often shown by auxiliaries like did) and negation (e.g. n’t, not). Subjects also move to occupy a position in this domain.

 Lucy didn’t love Chris

Some languages don’t use time/tense to anchor their sentences in the world, but other things, like location. For example, the Yagua language (spoken in Peru) has a suffix mu which shows that an event took place downriver relative to the place of speaking. So naadarããyããmuyada means “they two danced around downriver”. Interestingly, all languages seem to require anchoring of some sort.

The next domain, which precedes the other two in English, is called the linking domain. This can, for example, contain elements which link the clause to other clauses: e.g. words like that which mark a clause as a subordinate clause (embedded within a larger sentence):

(I believe) that Lucy didn’t love Chris

Question elements like why or what which link the sentence to the wider discourse also come in this domain, and hence occur toward the start of the sentence in English.
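One way to picture the three domains is as layers wrapped around one another. The snippet below is a toy sketch only: the class name `Clause`, its fields and the bracket notation are all made up for illustration, and it ignores details such as the subject moving into the anchoring domain.

```python
from dataclasses import dataclass

@dataclass
class Clause:
    verb: str                 # classification domain: the verb ...
    participants: tuple       # ... and its participants
    tense: str = ""           # anchoring domain
    negated: bool = False     # anchoring domain
    linker: str = ""          # linking domain, e.g. "that"

    def classification(self) -> str:
        # The bare event, with no time or truth information.
        return f"{self.verb}({', '.join(self.participants)})"

    def anchoring(self) -> str:
        # Wrap the event in tense and negation.
        marks = [m for m in (self.tense, "NEG" if self.negated else "") if m]
        return "[" + " ".join(marks + [self.classification()]) + "]"

    def linking(self) -> str:
        # Wrap the anchored event in whatever links it to other clauses.
        parts = ([self.linker] if self.linker else []) + [self.anchoring()]
        return "[" + " ".join(parts) + "]"

clause = Clause("love", ("Lucy", "Chris"), tense="PAST", negated=True, linker="that")
print(clause.classification())  # love(Lucy, Chris)
print(clause.anchoring())       # [PAST NEG love(Lucy, Chris)]
print(clause.linking())         # [that [PAST NEG love(Lucy, Chris)]]
```

Reading the last line from the outside in gives the English order: linking element first, then the anchoring material, then the core event.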

I think it’s very interesting that sentences can be divided up into different domains in this way and believe it has the potential to tell us a great deal about how the human mind works.

Adieu to French? That’s no fait accompli.

Jeremy Paxman’s opinion piece in the FT last week, in which he called the French language ‘useless’, has, unsurprisingly, caused something of a furore.

Articles elsewhere covering Paxman’s piece picked up on such quotable lines as:
‘It is time to realise that in many parts of the world, being expected to learn French is positively bad for you’.
‘The outcome of the struggle is clear: English is the language of science, technology, travel, entertainment and sport. To be a citizen of the world it is the one language that you must have.’

Now, when you come to look at the whole piece (unfortunately accessible only via FT subscription),1 most of it focusses on language policy in La Francophonie, the group of countries in which French is spoken, a hangover of its colonial past. Now that’s mostly too political, historical and economic for a linguist in training like me to get my teeth into. But I would like to comment on a couple of parts of Paxman’s argument, to give the view from linguistics.

A temptation to be avoided?

Firstly, one of the reasons he gives for stopping the teaching of French in Francophone countries is the dominance – coup de grâce, even – of English. That’s the only useful language to know, being “the language of science, technology, travel, entertainment and sport.” But wait, not so fast, Mr Paxman. Granted, English has more first language and second language speakers than French (300-400 million first language and up to 1bn second language, compared to 80 million and 220 million), and granted that it is widely used, especially in academia. But I wonder whether it would come as a surprise to know that less than half of internet content is in English, or that around 6bn people, over 80% of the world’s population, do not speak English at all? If cross-cultural communication is something we care about, then English is not the only language worth knowing.

Secondly, the application for us Brits seems to be ‘don’t bother with French’. Well, okay, Paxman does make it a bit more nuanced than that: “If you are a native English speaker, by all means learn Chinese or Arabic or Spanish. If you must, study French, because it is a beautiful language. But let us have no truck with suggestions that it is much worth learning as a medium of communication.” Thankfully, he’s not advocating not learning any foreign languages (although you might think the paean to the usefulness of English strongly implies it), and, personally, I might well agree that, given a choice, it would be good to have more people learning Mandarin, Urdu, Farsi or Russian (to take a handful currently required by GCHQ), rather than French. But, given the intertwined sociolinguistic history we share with our neighbours, learning French can be a fascinating way into language learning. Learning one foreign language equips you with metalinguistic knowledge and cognitive strategies that help you when learning another, so as long as French remains the only option, sadly, for some at school in the UK, it should not be discouraged. Not to mention the many uses it does still have in business and diplomacy (to take one example, the UK is one of the top 6 foreign investors in Morocco, where French is the business language).

The really unfortunate thing about Paxman’s opinion piece – and of course, he is entitled to an opinion – is that it’s full of pithy pull-outable lines that have the potential to cause much more damage out of context, the worst offender being: ‘the real problem with French is that it is a useless language.’ If you’re calling any language useless, you have to ask ‘for what?’ and ‘for whom?’ It may be that some languages are politically or culturally more strategic to learn for different people at different times, but no language, while still alive, is ever useless – for its speakers, however few or many, it is their means of communication, and therefore incredibly useful.

1. COMMENT Voilà – a winner in the battle of global tongues; Opinion
By Jeremy Paxman, 8 April 2016, Financial Times

Topicalization in Chinese: A game of efficiency and compromise

I was sitting in a Chinese restaurant with my friend Lucas in Paris.

“Wǒ zhèzhǒng shūcài zuì xǐhuān le (I, this kind of vegetable, like the best)!” I couldn’t help yelling out my excitement on seeing the appetizing hot-pot vegetables on the table.

Wait! What did I say? A soft but firm voice came up in my head. Yes, I had just uttered a somewhat “weird” sentence in Mandarin Chinese (my native language). It’s weird because Mandarin is assumed to be an SVO language. That is, the object usually comes after the verb. For example, “I like Julio” in Mandarin is wǒ xǐhuān Julio (I-like-Julio) instead of *wǒ Julio xǐhuān (I-Julio-like) (star=ungrammaticality).

However, after pondering for a while, I decided to accept my weird utterance, because I realized this was one of those “ineffable” situations. There was simply no way to express my excitement and obey the grammatical rules at the same time! Actually (1) is the standard way of saying “I like this kind of vegetable the best” in Mandarin, but I can hardly think of any scenario where I would really use it without sounding too textbook-ish.

(1) Wǒ     zuì      xǐhuān  zhèzhǒng    shūcài          le!

I              most   like       this-kind    vegetable    LE

“I like this kind of vegetable the best!”

So what happened to make poor me utter weird things in a Paris restaurant? Was it because I was too hungry? Not necessarily. Similar sentences are produced by Chinese speakers all the time! For example (ASP=Aspect marker, SFP=Sentence final particle),

(2)   a.  Nǐ tóu xǐ le ma? (you-head-wash-ASP-SFP; have you washed your hair?)

 b.  Wǒ zuòyè xiěwán le. (I-homework-finish-ASP; I have finished my homework.)

 c.  Xiǎohóng qiánbāo diū le. (Xiaohong-wallet-lose-ASP; Xiaohong lost her wallet.)

Actually, linguists wouldn’t find such sentences too weird, because what’s involved here is not really SOV ordering, but rather topicalization (=making something the topic of a sentence). So far so good. But what struck me on that hot-pot day was how frequently we actually use topicalization in our daily life. The answer is “a looooooot”! Nowadays it has become a language routine rather than a stylistic alternation.

Yet you may wonder: what’s the price of topicalization? Well, good question! The price is that sometimes we get confused by ourselves! For example, since elementary school I’ve been wondering how to say the following sentence in a nicer way:

(3) Wǒmā      xiǎoshíhòu    zǒngshì      dǎ     wǒ.

my-mom       as a child    always       beat  me

“When I/my mom was a child, my mom/she always beat me.”

My mom always beat me when I was a child! (by Bokai Huang)[1]

Fortunately, this isn’t true for me, but after so many years I still don’t understand why my compatriots would produce sentences like this. Maybe this is another compromise between expressiveness and grammaticality (in a loose sense), as all the clearer and unambiguous versions of (3) simply sound equally bad (if not worse), as in (4).

(4)   a.  ?Xiǎoshíhòu wǒmā zǒngshì dǎ wǒ. (as a child-my mom-always-beat-me)

b.  ?Wǒ xiǎoshíhòu wǒmā zǒngshì dǎ wǒ. (I-as a child-my mom-always-beat-me)

So why is it so difficult to say what we mean? Because (3) not only involves topicalization (this time it is the subject that gets topicalized), but also has an embedded null subject (Oops, Chinese is one of those radical pro-drop languages, where things like subject and object can be happily and wildly omitted). The problematic chunk xiǎoshíhòu is actually part of a phrase (dāng/zài) XX xiǎoshíhòu “(when) XX was a child”, e.g.

(5) a. Míngyuè xiǎoshíhòu xǐhuān hē chá.

“When Mingyue was a child, she liked drinking tea.”

 b. Qíng’er xiǎoshíhòu yǎng guò yìzhī xiǎogǒu.

“When Qing’er was a child, she had a pet puppy.”

 c. Liánlian xiǎoshíhòu zǒngshì kǎo yìbǎi fēn.

“When Lianlian was a child, she always got 100 marks.”

So, (3) can have either (6a) or (6b) as its underlying structure (parentheses=being dropped or deleted).

(6)   a.  [Topic wǒmā   [A (zài wǒ) xiǎoshíhòu    [B zǒngshì    [vP (wǒmā)   dǎ wǒ ]]]]

my mom                          when I  as a child       always         my mom     beat me

“My mom, when I was a child, always beat me.”

b.  [A wǒmā      xiǎoshíhòu [B zǒngshì  [vP (wǒmā)        dǎ wǒ ]]]

my mom            as a child        always       my mom      beat me

“When my mom was a child, she always beat me.”

(Technical details like displacement are omitted. Simply treat A/B as two chunks adjoined to the verbal core vP “(my mom) beat me”, which assumes the basic word order SVO.)

Of course, (6b) is against most people’s real-world knowledge, because when “my mom” was a child, “I” probably didn’t exist at all! But we’re living in a curious world, and one of the most fascinating characteristics of natural languages is precisely their capacity to express even the least likely things. Therefore, although (6b) is pragmatically marked, it’s grammatically well-formed.

So, with all the imperfections of topicalization (as in my sudden enlightenment in the hot-pot restaurant and the imaginary world where poor kids are abused by their child-moms), why do we still love it so much?

Well, like I said, it’s a compromise between expressiveness and grammaticality (still in a loose sense). In real-life communication, language first and foremost serves to express meanings and emotions. So, who wins in such a game of efficiency and compromise? Mostly expressiveness, especially in colloquial language. This is also one of the biggest differences between colloquial language and “ideal” (or less ideally, written/textbook) language. For example, in the latter register, (3) may well be rendered in a much nicer way as (7).

(7)   Zài   wǒ   xiǎodeshíhòu,  māma  zǒngshì  bùfēn qīnghóngzàobái de   dǎ      wǒ.

when       I      as a child         mom    always   indiscriminately                beat   me

“When I was a child, my mom always beat me without clear reasons.”

(7) is not only nicer from a grammatical perspective, but also more natural on a narrative level. However, real life isn’t story-telling, and speaking like (7) all the time can be hard work (though probably not for literature lovers). Hearing such sentences constantly can also be exhausting: they’re simply not suited to the colloquial register.

Abstracting away from the register issue, a more technical problem facing linguists is what forms part of the ideal language (I-language) and what reflects real-life compromises. The former aspects are significant in revealing the essence of our language instinct, while the latter offer much less evidence to this end. In linguistics jargon, this is a question of Competence vs. Performance. But as we have seen, the boundary between the two is often blurred in the data we have access to (it’s a pity we can’t directly see through speakers’ minds).

Last but not least, the vegetable I was excited about in the hot-pot restaurant was “needle mushroom” (jīnzhēngū)! It’s the best, especially with beef!

Needle mushroom with beef (Jīn Zhēn Gū Féi Niú)[2]

Picture sources:

[1] http://weibo.com/huangzhigaojian?from=profile&wvr=6

[2] http://261925957.blog.sohu.com/307124194.html

So what is it you do?

“So what is your PhD about?”

A pause in the conversation, a heavy silence, and eager anticipation of an easy-to-grasp answer.

“Discourse-configurationality in Finnish and Japanese and its repercussions for the Minimalist architecture of syntax. Y’know.”

The exact formulation of my research topic is not exactly conducive to small talk. More often than not, it makes the conversation engine cough and jerk, finally coming to a halt at levels of iciness comparable to the initial interaction between Mr Darcy and Elizabeth Bennet in Pride and Prejudice.

Talking about my research – as icy as Mr Darcy. norika21.

To avoid the premature death of all small talk – and to boost my cool and hip student factor – I have discovered that the way to go is to keep things nice, simple, and very much digestible to the uninitiated. Let’s try again.

“Well, I look at Finnish and Japanese.”

“Oh that’s a very interesting choice of languages. How did you come up with that?”

Well, let me tell you.

Finnish and Japanese are very much historically unrelated, but they both show some curious phenomena, among them relatively free word order. In Japanese, the verb must come last in the sentence but otherwise constituents are basically free in their ordering; Finnish is somewhat more constrained but words can still be moved around much more freely than, say, in English. To illustrate, in English The dog ate the cat and The cat ate the dog mean very different things. In Japanese, however, changing the order of the subject and the object preserves the state of affairs:

Neko-wa inu-o tabemashita.
cat-top dog-acc ate
“The cat ate the dog.”

Inu-o neko-wa tabemashita.
dog-acc cat-top ate
“The cat ate the dog.”

(-wa is a topic marker that marks the phrase the sentence is about; -o is an accusative marker marking the object. Similar grammatical functions will come up in the other examples as well; all you need to know is that having these is part of the reason why the sentences above can have the same basic meaning even when the word order changes.) And the same holds in Finnish:

Kissa söi koira-n.
cat ate dog-acc
“The cat ate the dog.”

Koira-n söi kissa.
dog-acc ate cat
“The cat ate the dog.”
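To see why the markers do the work that word order does in English, here is a toy sketch using the two Japanese sentences above. It is purely illustrative: the function name `roles` and the little marker table are my own inventions, and the sketch only knows the two markers glossed in this post.

```python
# Toy analyser: reads roles off the markers on each word, ignoring word order.
# Only the two Japanese markers glossed in this post are covered.
ROLE_BY_MARKER = {"wa": "topic", "o": "object"}

def roles(sentence: str) -> dict:
    """Map each role to the word that carries the corresponding marker."""
    found = {}
    for word in sentence.rstrip(".").split():
        stem, _, marker = word.partition("-")
        if marker in ROLE_BY_MARKER:
            found[ROLE_BY_MARKER[marker]] = stem.lower()
        else:
            # Unmarked word: in these examples that is always the verb.
            found["verb"] = word.lower()
    return found

first = roles("Neko-wa inu-o tabemashita.")
second = roles("Inu-o neko-wa tabemashita.")
print(first)            # {'topic': 'neko', 'object': 'inu', 'verb': 'tabemashita'}
print(first == second)  # True
```

Both orders come out with the same roles, which is the sense in which swapping the subject and the object “preserves the state of affairs”.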

Prepare to eat! Or to be eaten..? A_Peach.

“Ah okay… So what exactly are you doing with this?”

A bit of theoretical machinery first. In syntactic theory, there is a notion of movement. Think of a wh-question in English (these are questions formed with so-called wh-words such as who, which, where, how (I know, I know, no wh in the spelling there), and so on):

What did Easter Bunny hide?

The question word what serves more than one function here: on the one hand, in the sentence-initial position it alerts the listener to the fact that the sentence to follow is a question, and on the other, it is the object of hide. To capture this, it is assumed that what in fact starts off in a position after hide, so that at some level of representation, the sentence looks like

Easter Bunny hid what.

Interestingly, this is exactly what you hear in echo questions:

A: Easter Bunny hid bottles of liqueur.

B: Easter Bunny hid what?!?

To cut many theoretical corners, the idea is that what moves up in the structure to where we hear it, but it is also represented silently at its original position. What makes it move is assumed to be a feature higher up in the sentence: in this case, a so-called wh-feature, which, if checked by moving a wh-word to it, makes the sentence a question.

Easter Bunny is WHO?!? Eric Mueller.

“Okay… So what about Finnish and Japanese?”

I said that there is a wh-feature that triggers the movement of the wh-phrase in English. But, as in life, nothing is ever nice and simple in linguistics either. Some linguists argue that purely pragmatic notions (basically things that don’t affect the truth of a statement) can’t have corresponding formal features in the syntax. Now, whether something is a question or not is obviously not only a matter of pragmatics, so having wh-features is not a problem. However, in Finnish you seem to be able to move phrases much like wh-phrases in English but for purposes of contrast. To illustrate:

Sofia nai prinsessa-n.
Sofia married princess-acc
“Sofia married a princess.”

Prinsessa-n Sofia nai.
princess-acc Sofia married
“It was a princess Sofia married (and not a prince).”

The latter utterance has a contrastive reading unlike the former, but this difference is difficult to pin down in non-pragmatic terms. The question that has to be asked, then, is whether this movement is in fact different from that in the case of wh-phrases, and if not, whether postulating a feature for contrast is necessary.

A case of contrast: marrying a princess, NOT a prince! Ross Hawkes.

“I think I just about get this… Is Japanese the same then?”

I wish. Japanese offers a different sort of complication to linguistic theory. It is often assumed that movement doesn’t just happen without a reason: it has to have some sort of interpretive effect, semantic or pragmatic. Japanese has a phenomenon called scrambling (fancy, I know), where nearly any phrase can be moved nearly anywhere in the sentence, or even out of it. Have a look at these examples:

Zen’in-ga sensei-ga syukudai-o dasu to omowanakatta (yo)
all-nom teacher-nom homework-acc assign that think part
“All did not think that the teacher would assign homework.”

Syukudai-o zen’in-ga sensei-ga dasu to omowanakatta (yo)
homework-acc all-nom teacher-nom assign that think part
“Homework, all did not think that the teacher would assign.”

The object syukudai-o ‘homework’ starts off in the that-clause, as in the first example, and ends up in the main clause, as in the second one. In cases like this where the moved phrase crosses a clause boundary, it’s argued that there is no difference in meaning. So, what has to be done is to try to tease apart even slight differences in interpretation. If any appear, the questions will be much the same as with Finnish contrast; if not – well, that’ll be a more complicated story of reassessing our theoretical assumptions.

“Gosh, I never thought you could achieve so much by looking at such distant languages, and that linguists are such cool people!”

Language change as waves, contagions and hierarchies

In historical linguistics, we pay a lot of attention to the mechanisms of language change in terms of languages as systems. We try to explain how a change may first have arisen by looking at other facts about that language. For example, we might explain the change in the language of many English speakers whereby ‘th’ is pronounced like ‘f’ (saying ‘fink’ for ‘think’, etc.) by pointing out that these two sounds are acoustically similar and that this may have led to them being confused by children learning the language. I wrote about this sort of explanation in a previous blog post.

But there’s a second layer of explanation to be done. The very idea of ‘languages as systems’ is an abstraction. There’s no such thing as ‘English’, a single entity: instead, each speaker with some level of English proficiency (perhaps 840,000,000 people according to Wikipedia) produces language which has a lot in common but some differences. So for English as a whole to change—or just the English of the UK, or the English of New York, or even the English of a single village—the newly minted pronunciation (or word, or phrase, or piece of grammar) has to spread from the first person who produced it to other people.

In historical linguistics and in sociolinguistics we distinguish two different flavours of this process of spread: ‘transmission’, where the new form is learnt by young children as they acquire the language for the first time, and ‘diffusion’, where the new form is passed among adults.

Early stages of a change spreading by contagious diffusion/wave model

Late stages of a change spreading by contagious diffusion/wave model

Our ideas about how diffusion works have traditionally been based on historical dialect studies. Huge survey-style dialectology projects in the nineteenth and twentieth centuries created maps of differences in traditional dialects, especially in German- and English-speaking Europe. These showed that new forms appeared to have spread continuously outwards in patterns that resembled the ripples produced by a stone dropped into water. A recent new form would be found in a single, connected area. An older form might have spread everywhere except for a few small regions which stood out as conservative islands. This idea of change spreading outwards continuously is sometimes described as the ‘wave’ model.

However, later in the twentieth century, studies of ongoing changes which were still diffusing actively through the population found a different pattern. Here it was found that, instead of spreading continuously across the map, changes tended to start in the largest city in a region and then proceed to ‘jump’ from city to city without ever being found in the intervening countryside. Having spread to progressively smaller cities, the change would then start to spread out from these to the surrounding rural areas. This observation led to a revised suggestion that there were two possible patterns of diffusion: the ‘contagious diffusion’ observed in historical studies, where a change spread continuously across space, and ‘hierarchical diffusion’, where the change spread down a ‘hierarchy’ of increasingly smaller settlements.

Early stages of a change spreading by hierarchical diffusion

Later stages of a change spreading by hierarchical diffusion

Finally, in much more recent research, a third, rarer pattern has been identified. Changes can apparently sometimes first spread throughout a rural region, then into smaller towns, and only then finally into cities. This pattern, the mirror image of hierarchical diffusion, has been labelled ‘contra-hierarchical diffusion’.

So why do different changes diffuse in different ways? Perhaps we need another reminder not to think exclusively in terms of big abstractions. I’ve written here about changes being found in particular locations and about changes spreading across space. But language doesn’t actually exist in physical space. Really what we’re talking about is not changes spreading to particular places, but changes spreading to the language of people who live in particular places.

Changes can clearly only spread between people when those people talk to one another. So when changes spread continuously across space, that must reflect that people are more likely to know and talk to people who live near to them. Really, what we’re seeing is that changes spread continuously through social networks—and those social networks, for very obvious practical reasons, mostly reflect the physical reality of where people live and work.

Once we remember this, hierarchical diffusion also becomes easy to explain. If we compare the modern era to any historical period we find very different patterns of population movement and communication. With public transport and cars people habitually travel much further to work and study. They also relocate more often and move much greater distances when they do. With these factors and electronic communications, they keep in more regular touch with people living far further away than ever before. And cities are crucially important to all these processes: people commute in and out of cities much more than between rural areas and they are more likely to migrate to cities for work. All this means that people’s social networks have much less to do with the geography of continuous space than ever before. Most of us are communicating regularly with people who live much, much further away from us than our ancestors ever did, and such long-distance contacts particularly connect cities.

Given all that, we really shouldn’t be surprised to find that changes tend to spread first between cities and only later to the surrounding countryside. In fact, this isn’t a different process to that of contagious diffusion at all! Both are really just the process of changes spreading through people’s social networks.
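The claim that both patterns fall out of one process can be illustrated with a very crude simulation. Everything in it is an assumption made for the sake of illustration: the place names are hypothetical, each place stands in for the people living there, and a change is taken to pass at every step to every place in direct contact with one that already has it.

```python
def spread(contacts: dict, start: str, steps: int) -> list:
    """Return, step by step, the set of places the change has reached.

    At every step the change passes to every place in direct contact with a
    place that already has it (a deliberately crude, deterministic rule).
    """
    reached = {start}
    history = [set(reached)]
    for _ in range(steps):
        reached = reached | {n for place in reached for n in contacts[place]}
        history.append(set(reached))
    return history

def chain(places: list) -> dict:
    """Build contacts between geographic neighbours only."""
    contacts = {p: set() for p in places}
    for a, b in zip(places, places[1:]):
        contacts[a].add(b)
        contacts[b].add(a)
    return contacts

# Hypothetical places along a road: two cities with four villages in between.
places = ["CityA", "v1", "v2", "v3", "v4", "CityB"]

local_only = chain(places)                  # people only talk to their neighbours
with_city_link = chain(places)
with_city_link["CityA"].add("CityB")        # commuting, migration, phone calls
with_city_link["CityB"].add("CityA")

print(sorted(spread(local_only, "CityA", 1)[1]))      # ['CityA', 'v1']
print(sorted(spread(with_city_link, "CityA", 1)[1]))  # ['CityA', 'CityB', 'v1']
```

The spreading rule is identical in both runs; only the network differs. With neighbour-to-neighbour contacts the change creeps along the road like a wave, while a single long-distance link between the cities makes it jump to the second city before reaching the villages in between.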

So what about contra-hierarchical diffusion? This is a little harder to explain. The best explanation here is probably to do with the social meaning that speakers ascribe to changes. In regions where there is significant inward migration, especially into urban areas, local speakers may actively champion distinctively local ways of speaking in order to differentiate themselves from newcomers. As cities have historically been involved in hierarchical diffusion processes, it is rural regions that are most likely to still preserve distinctively local forms. As a result, a drive to speak in a more local way (and thus express that one is a ‘real’ local person and not an outsider) will tend to cause rural forms to spread to urban areas.

Silently the context talks

Although I have been away from theoretical semantics and philosophy of language for quite a while, some fascinating phenomena of human language use in this area still attract me now and then, and they always lead me back to the time when I was still considering how people talk on different occasions. This time the story started when I was watching a Japanese news programme the other day (for those who are curious, it is the Monday version of News Zero by Nippon TV, and the newscaster is Sho Sakurai). I was a fan of that newscaster even before I started watching the programme; to give some brief background information, he grew up in Tokyo, speaks Standard Japanese and is now in his thirties. I have previously watched some interviews with him as well as some TV programmes his group hosts, in which he usually uses boku and ore when talking to the senior hosts and to other members of the group, respectively. Considering the delicate system of Japanese first person pronouns (Ide 1982), I would say that these two first person pronouns are very common for a person of his age: when talking to seniors, boku is usually used to show humbleness and politeness, while ore is often used with juniors and people around the same age, in order to emphasise intimacy as well as masculinity.

Therefore, I was somewhat shocked when I heard Mr. Sakurai using watashi to refer to himself in the news programme – both in the VCR section as a narrator, and in the news programme as a commentator. Compared with boku and ore, watashi is a rather formal pronoun with a gender neutral feature, and I never expected him to utter this pronoun on TV. Even more surprisingly, after the news programme finished, I found a clip of a radio programme by JAL (an in-flight music programme) recorded by the group in 2014, in which he introduced himself to the audience using watakushi, one of the most formal pronouns, usually characterised by its use by members of the royal family and noble celebrities. In a short period of time, the same person made use of four different first person pronouns, and it seems that the most significant trigger of these usages is the occasion of broadcasting – or, to use the technical term, the context.

Context is one of the longstanding topics in pragmatics and the philosophy of language: a number of definitions of “context” have been proposed and debates have continued for decades, with the different definitions emphasising different things in the analysis of utterance meaning. Indexicals, the linguistic elements whose referent may change in accordance with changes in the surrounding environment, were a major aspect in the initial discussions of context. To give a simple example, the meaning of “I am here today” will change when it is said by different people, at different locations, or at different moments. For an utterance containing indexical expressions, before we proceed to understand the meaning of the utterance, we must resolve what the referents of the indexical expressions are, and this is where context starts to talk.

One of the most influential works on indexicals and context, David Kaplan’s Demonstratives (1989), suggests that the context of utterances can be reduced to a set of parameters. Each parameter provides a specific part of information to construct the referent of an indexical expression; for instance, a time parameter provides information for temporal indexicals like “now” and “today”, a location parameter assigns a referent to “here”, and an agent parameter helps fix the referent of “I”. All the indexical expressions have fixed characters (for the distinction between character and content in the sense of Kaplan’s argument, please see here), and one major role of context is to select the correct referent in the context according to the characters of the indexical expressions. The rest of the semantic composition and the derivation of implicature all rely on the first step, so context begins its work even before we understand the meaning of an utterance.
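To make the parametric idea concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions, not Kaplan’s own formalism: the names Context, CHARACTERS and resolve are invented for this post, and the context is reduced to just the three parameters mentioned above. Each indexical’s character is modelled as a rule that takes a context and returns a referent.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Context:
    """A context reduced to three parameters (a hypothetical simplification)."""
    agent: str      # who is speaking
    location: str   # where the utterance takes place
    day: date       # when the utterance takes place


# Each indexical has a fixed character: a rule from a context to a referent.
CHARACTERS = {
    "I": lambda c: c.agent,
    "here": lambda c: c.location,
    "today": lambda c: c.day.isoformat(),
}


def resolve(words, context):
    """Replace every indexical in `words` with its referent in `context`."""
    return [CHARACTERS[w](context) if w in CHARACTERS else w for w in words]


sentence = ["I", "am", "here", "today"]
c1 = Context(agent="Alice", location="Cambridge", day=date(2016, 2, 22))
c2 = Context(agent="Bob", location="Tokyo", day=date(2016, 2, 23))

print(resolve(sentence, c1))
print(resolve(sentence, c2))
```

The same four words yield different referents in the two invented contexts, while the rules in CHARACTERS never change: that is the sense in which the character is fixed and the content varies.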

Kaplan provides a metaphysical view of context with a clear illustration of one of the most important roles of context. That does not mean that context can only resolve the problem of indexicals, though. In real language use, context can affect aspects of utterance meaning and manner other than the reference of indexical terms, and, if we take into consideration all the influences contexts have on utterance meaning and manner, the content of context will not be limited to the parametric framework promoted by Kaplan. In other words, Kaplan’s framework is not sufficient to cover all contextual influence on language, even though it is a rather comprehensive and well-organised framework. We can observe this influence in the case I mentioned at the beginning of this post, which is a typical example of context influencing honorifics (for a brief introduction to cross-linguistic honorifics, please see section 2.2.5 of Levinson 1983): all four Japanese first person pronouns are used by one single person, and their referents are all the same when they are interpreted as indexicals in Kaplan’s framework, but the different surrounding environments lead to different choices of pronoun. It seems that the contextual parameters of speaker, time and location do not play an important role here, but the different choices of first person pronoun must still come from the context; to be more specific, from the occasion of the utterance. Although on different occasions the audience will be slightly different, which indeed affects the parameter of interlocutor, we can still imagine the following scene: when talking to his colleagues and mentioning himself in the news programme, Mr. Sakurai will use watashi, but after the news finishes, he will switch to boku again – and I actually noticed such switches when one of his senior colleagues was invited to his TV show.

The choice of honorifics and the general selection of lexical items also vary when a topic is presented in different genres (e.g. spoken vs. written, literary vs. non-literary). The occasion of utterance, as an essential part of conversational context, hovers above our heads and controls the words we use.

It is not only first person pronouns, or even the choice of lexical items, that can be influenced by the general context. We can compute different implicatures (the “hidden meaning” of an utterance) in different contexts even if the same sentence is uttered, and in some extreme cases, an explicit meaning can even be cancelled. Although Kaplan’s parametric framework may fail to account for these phenomena, we can see the pervasive influence of context. I noted an example involving metaphor when I was still working on the theory of metaphor: among the Bororo Indians of Brazil, the male participants of a ritual uttered pa e-do nabure (“we are parrots”) while they were dressed in colourful feathers, and when they were interviewed it seemed that the tribe members believed they were parrots at that time (for a comprehensive discussion of the case study, please see chapter 1 of Leezenberg 2001). This example was once used by Durkheim and Mauss (1963) to argue that members of a primitive culture are not able to distinguish between men and animals, nor between literal and figurative language. The logic is as follows: generally, the meanings derived from metaphorical utterances are explicit and not cancellable, and there should not be any obstacle for a language user to accept a metaphorical interpretation if there is one available; therefore, if a language user does not accept the metaphorical interpretation, s/he lacks the ability to understand metaphors. However, if we take the special occasion of the utterance into consideration, the conflict between accepting “we are parrots” as a literal utterance and having the ability to understand figurative language simply disappears: one can still hold the two beliefs together, and the only motivation for interpreting “we are parrots” in a literal way is the scene of the ritual, which “de-metaphorises” the whole utterance at that particular moment.

If we make a comprehensive list of the different layers of contextual information, we will find that more components are involved in language processing and the derivation of utterance meanings. Even if the information encoded in the utterance is not sufficient for us to derive the statement made by the speaker, we can still borrow information from the context: from information in our memory, from the co-text that was uttered by the speaker a moment ago, from the environment in which we hold the conversation, from the culture in which we form the behaviour of conversation, and so on. If one extracts a sentence from its context, it may be confusing and even misleading; in fact, this is a trick widely used by some tabloids to attract readers’ attention. When we are speaking to each other, it is not only we who are talking: the context is talking too, silently.


For more information, please check the following articles:

Durkheim, Emile, and Marcel Mauss. 1963. Primitive Classification, trans. by Rodney Needham (Routledge)

Ide, Sachiko. 1982. ‘Japanese Sociolinguistics: Politeness and Women’s Language’, Lingua, 57.2: 357–85

Kaplan, David. 1989. ‘Demonstratives’, in Themes from Kaplan, ed. by Joseph Almog, John Perry and Howard Wettstein (Oxford: Oxford University Press), pp. 481–563

Leezenberg, Michel. 2001. Contexts of Metaphor (Amsterdam: Elsevier)

Levinson, Stephen C. 1983. Pragmatics (Cambridge: Cambridge University Press)

Lots of words and things you can do with them

There have already been a few posts on this blog relating to corpus linguistics. To recap, a corpus is a (usually very sizeable) collection of texts which can be useful for certain types of linguistic analysis.

Admittedly, the school of linguistic thought which follows in the footsteps of Noam Chomsky has tended to place more importance on linguistic competence (what speakers know about what is and isn’t possible in a language) than performance (what they actually do with language). That is, more importance is given to the questions like “Would this sentence be grammatical?” than “Has anyone ever actually used this sentence?” The fact that nobody has ever previously produced the sentence There is a cacophonous allosaurus in the echoing corridor under the model of King’s College Chapel which is made entirely of string is considered of secondary interest to the fact that someone legitimately could produce it if they so desired.

But of course all a corpus can tell us is what sentences people do produce; it reveals nothing directly about what is and isn’t grammatical. (Indeed, corpora may even include a good number of sentences that everybody would agree are definitely not grammatical!) But that doesn’t mean they tell us nothing whatsoever about competence and grammaticality. For example, the fact that the phrase working away occurs 51 times in the 100 million word British National Corpus but arriving away doesn’t occur at all may suggest sentences like Lucy was working away are grammatical whereas ones like Lucy was arriving away aren’t – something which can be confirmed in other ways. In the context of historical linguistics, where we don’t have access to native speakers to ask directly what is and isn’t allowed in their language variety, this sort of frequency analysis becomes a major source of evidence.
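For readers who like to see the mechanics, here is a toy Python sketch of this kind of frequency count. The three-sentence corpus is invented purely for illustration (it is not BNC data), and count_phrases is a hypothetical helper rather than part of any corpus toolkit; real corpus software also handles punctuation, tagging and much more.

```python
from collections import Counter

# A tiny invented corpus; a real study would use something like the BNC.
corpus = [
    "Lucy was working away at her desk",
    "they were working away all night",
    "Chris was arriving late again",
]


def count_phrases(sentences, n=2):
    """Count every sequence of n adjacent words (lower-cased) in the sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


counts = count_phrases(corpus)
print(counts["working away"])   # attested in the toy corpus
print(counts["arriving away"])  # unattested: a Counter gives 0 for missing keys
```

A count of zero in three sentences proves nothing, of course; the point of a 100 million word corpus is that a zero starts to become suggestive.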

100 million words may sound like a lot, but actually it can be quite limiting: a lot of stuff that we might be interested in just doesn’t turn up that often. For instance, most people agree that outswim (in a sentence like Lucy outswam Chris in the race back to the beach) is a real word, but it only occurs once in the entire BNC. Nowadays we can get around this problem to a certain extent by using Google to search the World Wide Web, giving us access to around 50 billion webpages, and many more words than that (outswim comes up 223,000 times). But there are problems with using Google, too, for example that it might not give us a very balanced mix of different discourse types (as traditional corpora might aim to do), that the Web contains a lot of material produced by non-native speakers of English, or that some of the things a search throws up might not even be produced by humans at all!

Another resource provided by Google is the Ngram Viewer. (An “Ngram” is a sequence of N items, such as words: e.g. “this is a short sentence” is a 5-gram.) The Google Ngram Viewer is based on a huge corpus of books going back hundreds of years; I personally have spent far longer than I really should have done playing around on it. It can be used to demonstrate things about how language changes over time, for instance the relative frequency of the word has and its older equivalent hath:

We can also use the Ngram Viewer as a source of information on other cultural trends, through their influence on language. The following graph of the frequency of the word railway, for example, seems to correlate with the changing role of railways throughout history: really starting to take off in the 1840s, and then going into decline in the twentieth century with the invention of the private motor car as a rival form of transport:

Not all of this is necessarily of much direct interest to someone who wants to focus only on linguistic competence, of course. But language isn’t just an abstract thing in our brains; it’s something used by real people in daily life, and the ways in which it is used are as valid an object of study as the make-up of our mental grammars. And corpora can be very useful indeed in telling us more about the ways in which languages are used.

Can you C what I mean?

A (belated) happy mother language day! If you missed it yesterday, you can catch up on the what and the why here.

One family of languages that could never be counted as mother tongues are programming languages. Yet various US states are considering allowing coding classes in schools to count alongside Spanish, Chinese or Italian lessons towards foreign language learning requirements. Last week, as a bill with this kind of suggestion was being debated in Florida, the popular linguistics writer Gretchen McCulloch was asked how natural languages differ from programming languages (and so why this is a bad idea).

Here, with quite a bit of help from my software engineer husband, I consider some more differences, as well as similarities, between programming languages and natural languages.

1. First up, syntactic ambiguity. As Gretchen McCulloch mentioned, natural languages like English are often syntactically ambiguous. What do we mean by this? Take the following examples:

  • A boy climbed every tree.
    > There was a boy and that boy climbed every tree (i.e., one boy did lots of climbing).
    > For every tree, there was a boy that climbed it (but not necessarily the same one).
  • I’m not going to give a talk in London on Thursday
    … I’m going to attend a talk
    … I’m going to give a talk in Brighton
    … I’m going to give a talk in London on Friday
  • The girl saw a man with a telescope

That is, there is more than one possible mapping from the surface form to the meaning of the utterance. Now, in natural languages, the context, as well as prosodic cues like stress in speech, allow us to disambiguate the intended meaning fairly easily. In contrast, as Gretchen McCulloch says, “formal languages don’t want you to do that.” Indeed, most programming languages have a perfect form-function mapping between syntax and semantics. So, more properly, they don’t allow you to do that. However, most programming languages do allow grammatical structures which are, on the face of it, ambiguous. Consider the following sentence of English:

If it’s raining tomorrow, then if I need to go shopping, I’ll take the car, otherwise I’ll go on my bike.

Admittedly, it’s fairly unlikely that someone would construct a sentence like that in spontaneous speech. But assuming they did, then the listener hits the problem of how the ‘otherwise’ clause resolves – is it attached to ‘if it’s raining tomorrow’, or to ‘then if I need to go shopping’? In other words, what happens when it’s raining but I don’t need to go shopping, or if I need to go shopping but the sun is shining? Programming languages, lacking stress and pitch, resolve these syntactical issues by precisely defining how the sentence is interpreted, with languages typically resolving the “dangling else” by attaching it to the second “if”.

As well as in the syntax, some programming languages include whole other classes of ambiguity, such as features of Haskell (type inference), C++ (template resolution), and Java. Unlike a human listener who uses context to work out what the speaker meant, the compiler simply throws an error when it meets such an ambiguity; the programmer has to specify how the structure is meant to be disambiguated.
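The two readings of the rainy-day sentence can be spelled out as a small Python sketch. Python is used here only as an illustration, and the function names are invented: because Python marks structure by indentation, the writer is forced to choose one reading, whereas in languages such as C or Java an unbraced ‘else’ is attached to the nearest ‘if’. Returning None stands for cases the sentence says nothing about.

```python
def nearest_if(raining, shopping):
    """'Otherwise' attaches to the inner 'if' (the usual dangling-else rule)."""
    if raining:
        if shopping:
            return "car"
        else:
            return "bike"
    return None  # the sentence says nothing about dry days


def outer_if(raining, shopping):
    """'Otherwise' attaches to the outer 'if'."""
    if raining:
        if shopping:
            return "car"
        return None  # raining, but no shopping: nothing is said
    else:
        return "bike"


for raining in (True, False):
    for shopping in (True, False):
        print(raining, shopping, nearest_if(raining, shopping), outer_if(raining, shopping))
```

The two functions agree only when it is raining and I need to go shopping; in every other case they give different answers, which is exactly the ambiguity a language definition has to settle.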

2. Secondly, and more briefly, implicated meaning. And of a particular sort: in natural languages, speakers can convey meaning not only through what they say, but also in how they say it – the forms that they use. For example, saying ‘Might I possibly ask you to close the window?’ conveys not only a request but also the fact that the speaker is being polite and respectful. Similarly, if I tell you that ‘yesterday Bob was driving along when, suddenly, he caused the car to stop’, you wonder if he pulled the handbrake or hit a pothole (or even a tree), otherwise I would have told you ‘he stopped’.

In programming languages, just like in natural languages, there are usage conventions. However, these are for the benefit of the human reader, not for the computer. A software engineer might look at some code and infer something about its style, what kind of experience the programmer has, and so on, but this isn’t part of the communicative act – the compiler, which plays the part of the interlocutor, doesn’t care about any of those things.

3. Thirdly, linguistic change. This is a characteristic of natural language that all speakers are aware of. Often, this comes in the form of language pundits who bemoan the use of like as a quotative or the singular gender-neutral use of ‘they’, or the many other ways English (or any other language) is thought to be going down the drain. Language change is inevitable, and happens not only at the level of word meanings, but also sounds and syntactic constructions. It happens gradually over time as children acquire language from limited input, and as speakers use language and interact with speakers of different varieties and languages.

Language change happens in programming languages too. However, the kind that most closely parallels natural language change is change in usage, not in the grammar or lexicon: for instance, programmers might notice that a particular construction that was allowed by a language but not really used very often, is actually more useful than they initially thought, and start employing it more. Changes to the grammar or lexicon, though, are decided by committee (for instance, Java 8 now allows the kind of ambiguities we were talking about earlier) – after all, when you only have 15 words in your language, changing the meaning of one is a pretty big deal. And of course, such conscious en masse decisions are something very rare and usually ineffective in natural languages.

You can think of many other ways that Java differs from Javanese, Python from Tok Pisin, and Swift from Spanish – and I may revisit the theme in a later post. But the fact that we’re never going to celebrate C on International Mother Tongue Day perhaps points to the most fundamental difference that means sacrificing natural language learning for coding isn’t going to be a wise move.

Further joys of case and agreement!

In a previous post, I described how the case-marking of the direct object in Kashmiri depends on the person of the subject and the object. This means that in order to determine case on the object, we have to know the result of agreement between the subject and the verb, as well as the agreement between the object and the verb.

In this post, we explore a different kind of interaction between agreement and case-marking. Rather than agreement determining case-marking, we are going to look at the effects of case-marking on agreement, in a few languages that are relatives of Kashmiri, namely Hindi and Nepali.

These are so-called “split ergative” languages: if you’ve been following this blog, you’re all familiar with the term ergativity, but here’s a little recap anyway. In accusative case systems, like English, the object of a transitive clause is case-marked. In (1), the object appears in what we can call object case (them rather than they). The subject is in subject case, whether there is an object or not.

(1) a. She sees.
    b. She sees them.

In ergative languages, it is the transitive subject that gets case-marking, and not the object. In ergative fake-English, (1) would be as in (1′), where the object in (1′b) is in the same case as the subject in (1′a), and the subject in (1′b) gets another case.

(1′) Ergative fake-English
     a. She sees.
     b. Them sees she.

In split ergative languages, like Hindi and Nepali (and Kashmiri), only some constructions are ergative: when the verb is in the perfective aspect (roughly, indicating a completed action), the clause is ergative. When the verb is imperfective, the clause is accusative. This is shown in the following Hindi examples.

(2) Hindi
    a. Rahul   kitaab    paṛh-taa          thaa.
       R.MASC  book.FEM  read-hab.MASC.SG  be.MASC.SG
       ‘Rahul used to read a/the book.’
    b. Rahul-ne  kitaab    paṛh-ii   thii.
       R-ERG     book.FEM  read-FEM  be.FEM.SG
       ‘Rahul had read the book.’    (Bhatt 2005: 759)

So far, so cool. Let’s have a closer look at these examples, now. The ergative case-marker in Hindi is -ne, which we see in (2b). We can also see that the object is not case-marked in either sentence.

We can also check what the verb does: in Hindi, verbs can agree with arguments in person, number and gender. The glosses of (2) show us that in (2a), the verb agrees with the masculine subject, while in (2b), it agrees with the feminine object. How come? The difference seems to be that the subject is case-marked in (2b), but not in (2a). Can we test for this? You bet we can!

(3) Hindi
    Mona-ne       is    kitaab-ko     paṛh-aa.
    Mona.FEM-ERG  this  book.FEM-ACC  read-MASC.SG
    ‘Mona had read this book.’        (Bhatt 2005: 768)

In (3), both the subject and the object are case-marked and both are feminine. So what happens with the verb? It shows masculine agreement! This kind of agreement arises as a default, when the verb does not know what else to agree with because all arguments have case-marking!

To summarise: it looks like the verb in Hindi agrees with arguments that do not have case-marking, and otherwise shows third person singular masculine agreement.

The closely related language Nepali shows a similar pattern, but differs in a very interesting way. Look at the examples in (4):

(4) Nepali
    a. ma       yas   pasal-mā  patrikā    kin-ch-u.
       1SG.NOM  this  store-IN  newspaper  buy-NONPAST-1SG
       ‘I buy the newspaper in this store.’
    b. maile    yas   pasal-mā patrikā    kin-ẽ.
       1SG.ERG  this  store-IN newspaper  buy-PAST.1SG
       ‘I bought the newspaper in this store.’
       (Bickel & Yādava 2000: 348)

Case-marking is similar to Hindi: in (4a), the subject is unmarked and in (4b), the subject has ergative case. This difference, as in Hindi, depends on tense and aspect. What happens with agreement? The verb agrees with the subject in both examples, independently of its case-marking! In this sense, Nepali and Hindi differ: in Nepali, the verb can agree with ergative subjects, but in Hindi it cannot.

One way of characterising this difference is to say that Nepali and Hindi differ in their agreement alignment. Both languages have (split) ergative case-marking (or alignment): the transitive subject gets case-marking.

In Hindi, the verb agrees with the subject in intransitive clauses, and with the (unmarked) object in transitive clauses (because of the subject’s case). We can call this ergative agreement alignment.

In Nepali, the verb agrees with the subject in both intransitives and transitives, independently of the subject’s ergative case. This agreement pattern resembles the English one shown in (1): the verb always agrees with the subject. We can call this accusative agreement alignment.

We can then group English, Hindi, and Nepali as follows: English has accusative case and agreement alignment, and Hindi has ergative case and agreement alignment. But Nepali mixes alignments: it has ergative case alignment, but accusative agreement alignment.

The attentive reader will notice that there could be another way of mixing alignments: accusative case alignment (like English), but ergative agreement alignment (like Hindi). As Bobaljik (2008) argues, however, no such language seems to exist! This means that languages don’t just randomly vary in their case and their agreement patterns, but that this variation is highly systematic.

Bobaljik also proposes a way to capture this. He suggests that agreement is always determined by case-marking, but that languages differ in where they make the cut (e.g. Hindi doesn’t allow agreement with ergatives, but Nepali does). Bobaljik captures the general pattern by suggesting that agreement proceeds along a hierarchy:

(6) no case-marking > ergative/accusative case > dative case

Bobaljik’s generalisation is that if a language allows agreement with arguments on one level, it will also allow agreement with elements higher in (6). For Nepali, this means that if ergatives can agree, so can arguments without case-marking.

And now for the highlight! On the assumption that the verb checks the subject first and the object second (see note 2), this generalisation derives the lack of languages with accusative case and ergative agreement alignment. Consider again (1), from English, repeated here. (1a) shows us that arguments with unmarked case can agree with the verb.

(1) a. She sees.
    b. She sees them.

(1”) She see them.

For a language to be of the non-existing type, it would have to have case-marking like in (1), but agreement with the object, as in (1”). But since (6) suggests that subjects without case-marking must be able to agree, there is no way to skip the subject and agree with the object instead, as in (1”). So it is impossible to derive an accusative case-marking pattern and ergative agreement.
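The reasoning can be checked with a small Python sketch. This is only a toy model under the assumptions stated in this post (the hierarchy in (6), a language-specific cut-off, and the verb checking the subject before the object); the names LEVEL, CUTOFF and agreement_target are invented here and are not Bobaljik’s own formalisation.

```python
# Case hierarchy from (6), most accessible first. Ergative and accusative share a level.
LEVEL = {"unmarked": 0, "ergative": 1, "accusative": 1, "dative": 2}

# Assumed cut-off points: the lowest level each language still allows agreement with.
CUTOFF = {"Hindi": 0, "Nepali": 1}


def agreement_target(language, subject_case, object_case=None):
    """Return which argument the verb agrees with, checking the subject first."""
    arguments = [("subject", subject_case)]
    if object_case is not None:
        arguments.append(("object", object_case))
    for role, case in arguments:
        if LEVEL[case] <= CUTOFF[language]:
            return role
    return "default"  # no accessible argument, as with the masculine singular in (3)


print(agreement_target("Hindi", "unmarked", "unmarked"))    # pattern of (2a)
print(agreement_target("Hindi", "ergative", "unmarked"))    # pattern of (2b)
print(agreement_target("Hindi", "ergative", "accusative"))  # pattern of (3)
print(agreement_target("Nepali", "ergative", "unmarked"))   # pattern of (4b)

# Accusative case pattern: unmarked subject, accusative object.
for language in CUTOFF:
    print(language, agreement_target(language, "unmarked", "accusative"))
```

Whatever the cut-off, an unmarked subject sits at the top of the hierarchy and is checked first, so the last loop can only ever return the subject. In this toy model there is simply no setting that yields accusative case with object agreement.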

What is the moral of this story? Languages show all kinds of variation in their syntax, morphology and phonology, but the patterns shown here tell us that this variation is not without limits.


1. This does not necessarily mean that Hindi isn’t ergative after all — think of an English sentence like John likes Mary. Here, there is no case-marking either, but if we look at (1), we can tell that English does have object case. Similar reasoning holds for Hindi.

2. More technically, the idea is that there is a functional element, Tense, that is high in the syntactic structure, higher than both the subject and the object. It agrees downwards and finds the subject first — which can only be skipped if its case is opaque for agreement.


The main idea here, namely case-marking influencing agreement, is from Bobaljik (2008):

Bobaljik, Jonathan David (2008). ‘Where’s Phi?: Agreement as a Postsyntactic Operation’. In: Phi theory: Phi-Features across Modules and Interfaces. Ed. by Daniel Harbour, David Adger and Susana Béjar. Oxford: Oxford University Press, 295–328.

The Hindi data are from Bhatt (2005):

Bhatt, Rajesh (2005). ‘Long Distance Agreement in Hindi-Urdu’. Natural Language & Linguistic Theory 23.4, 757–807. doi: 10.1007/s11049-004-4136-0.

The Nepali data are from Bickel & Yādava (2000):

Bickel, Balthasar and Yogendra P. Yādava (2000). ‘A fresh look at grammatical relations in Indo-Aryan’. Lingua 110.5. doi: 10.1016/S0024-3841(99)00048-0.

Li (2007) explains some of the difficulties in classifying split ergativity:

Li, Chao (2007). ‘Split ergativity and split intransitivity in Nepali’. Lingua 117.8, 1462–1482. doi: 10.1016/j.lingua.2006.09.002.