What does your accent say about you?

Don’t worry, this is not going to be a judgmental blog post. I really, really enjoy different varieties of native and non-native English – although in rare cases I have been heard teasing friends about their ways of speaking. Instead, I hope it will be the kind of blog post that inspires reflection, while imparting some of this overly enthusiastic sociolinguist’s fondness for pronunciation patterns.

I guess I should start by explaining what I mean by accent. I chose this everyday term to cover, roughly speaking, the part of linguistic variation which isn’t covered by grammar or word choice (sociophonetic variation, in linguists’ terms). Although the way you pronounce your words might seem insignificant, such variation can actually convey quite a bit of information about you. In order to untangle the most important ways your accent can differ from others’, I’m going to divide accent differences into three broad types, depending on what aspect of communication is leaving those traces in the way you speak.

Cellos and violins

Let’s start with what is arguably the most basic source of phonetic variation: differences in our physiology. Our bodies, mouths and throats are different from each other, and this affects the sounds they are able to produce, rather like the differences between cellos and violins. This affects multiple levels of the way we speak, but a tangible example is the differences in our speech organs that are caused by sexual dimorphism. Men generally have larger vocal folds than women, and, like the strings on cellos and violins, this affects the pitch range we are able to produce: because they are generally larger, men’s vocal folds vibrate at lower frequencies than women’s, which leads to a deeper pitch. Age generally changes the vocal folds, making them less flexible, which is why older people frequently sound more hoarse or creaky. A similar effect can happen if you have a cold or smoke over longer periods of time, both of which can create changes to the structure of the vocal folds. Although these differences are only probabilistic (some men have high-pitched voices), most people find they’re able to guess the approximate sex and age of a voice.


Secondly, your accent is influenced by your social circumstances. Social phonetic variation originates from associations between accent features and groups of people, in the same way as someone saying “yo” makes you think of rap culture. A well-known type of social variation is geographical pronunciation patterns – you sound like someone from Yorkshire because you use a number of accent features which people associate with Yorkshire speakers. Interestingly, this kind of variation is likely to affect what people think of you. The BBC Voices project recorded 34 accents of English and asked about 5000 British listeners to judge how attractive and/or prestigious they sounded. The researchers found that accents associated with stereotypes of power, like American English and German English, ranked high for prestige but low for attractiveness, whereas e.g. Southern Irish English and Caribbean English were ranked low for prestige but high for attractiveness. Awareness of such links can also be utilised for sociolinguistic ends. For example, in Japan a trend has been reported for women, who naturally have high-pitched voices, to make their voices even higher to come across as more feminine. Similarly, some homosexual men speak at a higher pitch range, thus associating themselves with a less stereotypical kind of masculinity. Which accent features are used in this way partly depends on their noticeability. Some accent features are highly noticeable, like the Uptalk or HRT (high rising terminal) intonation pattern, and these can be used – or left out – as part of a conscious strategy. Other accent features that are less noticeable are used as part of a long-standing speech habit, but can nonetheless be used by listeners to unpack links between you and social groups.


The last type of accent variation that I’m going to cover arises from the context of the conversation itself. Conversational context, such as what you’re talking about and who you’re talking to, also affects the way you sound. Contextual accent variation can be part of a long-standing habit which gets activated by certain situations, like the way you talk differently when you’re in a formal situation such as a court of justice, or the difference between talking with a close friend and talking with someone you’ve just met. It can also be conditioned by the immediate situation, for instance when someone has mentioned Wales and you do a poor imitation of a Welsh accent. Emotions, such as suddenly feeling happy or angry during a conversation, can also affect the way you sound. Many people report they can hear if a person is smiling, even if they can’t see them. As with social variation, contextual variation constrains and/or enriches the other kinds of accent variation – it might be the case that you identify as a hip-hopper and generally try to sound like a black American (= social variation), but if you’re taking an IELTS test, you’ll probably try to sound as standard as possible.

As an experiment, next time you’re speaking with someone on the phone, put on your detective hat and try to identify how much you would be able to infer about them from their voice alone. You’d be surprised at how much subtle cues such as vowels and consonants, voice quality or rhythm really say about people. And about you.

Manchurian Chinese-Japanese Pidgin: a language born in wartime

Attention: This article contains Chinese and Japanese text. Without proper rendering support, you may see question marks, boxes, or other symbols instead of Chinese characters, Japanese kanji and kana.


When we talk about pidgin languages, we may all have some typical examples in mind, from a stereotypical pidgin like Chinese Pidgin English to highly creolised languages such as Tok Pisin. In case you have no idea what a pidgin language is, it is usually characterised as a simplified but conventionalised language without any native speakers. Although pidgins may differ in some aspects of grammar, they share the basic process by which they emerge and develop: in simple terms, they come from constant language contact, and a major purpose is to construct a lingua franca among people speaking different mother tongues. Most of the time, this process is quite “peaceful” – the word “pidgin” itself, which is the (mis)pronunciation of the English word “business”, already provides some evidence: these languages often originated from a series of commercial activities between different countries.

But in the world of pidgin languages, a peaceful development is not always the case. It is not only commercial activity that brings different languages into contact and gives birth to a new pidgin language; invasion and war can do so too. In the dust of history, we can trace some dead pidgins to their dirty background, and by studying these languages, we can reveal some fragments of the past. In this post, I would like to introduce one such pidgin, “Manchurian Chinese-Japanese Pidgin”, which is dead nowadays but was alive and well during the Manchukuo era (1932-1945) and the Second Sino-Japanese War (1937-1945). Born in the war, this language has a complicated history of language policies and negative evaluation, but it’s worth closer investigation.


A brief history of Sino-Japanese language contact

Although historically the Japanese language has received a series of influences from archaic Chinese, I would like to skip that part and focus only on the situation of modern Japan, i.e. after the Meiji Restoration (1868). Since China was one of the biggest military targets of Japan at that time, Chinese was an important part of the training of the military forces. When the Imperial Japanese Army Academy was founded in 1881, Chinese was already one of the well-established courses; specialised textbooks were published during the First Sino-Japanese War to help Japanese soldiers fight and communicate in China (Ando, 1988).

Right after the Russo-Japanese War in 1905, Japan began to extend its influence into north-eastern China: while the South Manchurian Railway was being built, a large group of Japanese soldiers settled down along the railway, and their language started to mix with the local Chinese dialect. This mixed language, itself a pidgin, is called “Railway Mandarin” (沿線官話) in Standard Chinese; it is the embryo of Manchurian Chinese-Japanese Pidgin.

The establishment of Manchukuo in 1932 was a signal that Japan had started its colonial action in China; the Manchurian government took over the north-eastern part of China and Japanese immigrants rushed in. After the establishment of Manchukuo, language contact between Chinese and Japanese became more frequent, and some policies regarding the use of languages were issued by the Japanese government. Chinese residents in Manchukuo and other Japanese-occupied areas were forced to study Japanese in school, and top-performing students could go to Japan for higher education (Kawashima, 2006). At the same time, the Japanese military continued its invasion into other parts of China, and they needed to learn Chinese in order to command the local Chinese people.

A complicated grammar of Manchurian Chinese-Japanese Pidgin

The Chinese-Japanese pidgin language that developed in the Manchukuo era and the Second Sino-Japanese War has a number of names, because it is referred to in different historical and linguistic texts. Here I partly follow the suggestion of Sakurai (2012, “Manchurian Pidgin Chinese”), and adopt a neutral name, “Manchurian Chinese-Japanese Pidgin” (MCJP), because that name clearly shows the main features of the language: it is a pidgin language of Chinese and Japanese; it clearly shows features of both Chinese and Japanese; the language has not been creolised; and its “hometown” was Manchukuo. Some other names are also used in historical records: the name accepted by most Chinese scholars is Xiehe Yu (協和語, pronounced as Kyowa-go in Japanese, literally “Concord Language”; see Sakurai 2012 for a review), while Ando (1988) refers to one particular MCJP variant as Military Chinese Language (兵隊シナ語, sometimes 兵隊中国語).

MCJP is marked by its complex origins, numerous variants and complicated structures. As I explained in the previous section, both Japanese native speakers and Chinese native speakers needed to use a pidgin language to communicate; however, due to their distinct patterns of language exposure, they ended up constructing several variants of the pidgin. We can divide this pidgin into two main branches: one of them uses Chinese as its target language (hence MCJP-Chinese), and the other uses Japanese instead (hence MCJP-Japanese). According to historical documents, these two variants stabilised in the early 1940s and were widely used in communication between Japanese soldiers and Chinese people (Ando, 1988; Sakurai, 2012). Thus, they are not simply the product of language learning and foreigner-talk. Because of the similar function and status they had in the language environment of Manchukuo, recent research tends to treat them as different variants of a single pidgin (Sakurai, 2012; Zhang, 2012), even though at first glance they look pretty different from each other.

Young people who grew up in China may still remember some remarkable scenes in TV programmes about the Second Sino-Japanese War: usually, there is a commander dressed in the yellow-green Japanese military uniform, with a little round moustache, shouting at the hero or heroine who is fighting against the Japanese, “你的, 良心, 大大的坏了” (“you, the conscience went bad badly”, which is a weird variant of “你没有良心” – “you have no conscience”). Such strings are easily recognisable by Chinese people who have little knowledge of Japanese, but they do not sound like “authentic Chinese sentences” to native speakers – you seldom hear a Chinese native speaker put her main verb at the end of a sentence without any auxiliary verb like ba or bei, unless she intends to achieve certain rhetorical effects. Those lines are vivid examples of MCJP-Chinese, which was often used by Japanese soldiers; in Ando’s (1988) analysis it is called Military Chinese Language.

Chart 1: The syntactic structures of MCJP-Chinese and Standard Chinese.

MCJP-Chinese       你     的              良心          大大的    坏        了
Gloss              you    topic-marker    conscience    badly     go bad    PAR
Structure          Topic (subject)        (Object)      Main verb

Standard Chinese   你     没有            良心
Gloss              you    have no         conscience
Structure          Subject  Main verb     Object

From the comparison above, we can see that the sentence structure of MCJP-Chinese mainly follows Japanese: it has an SOV word order, and the particle “的” becomes part of the topic, functioning much like a topic-marker, whereas Chinese does not have any overt topic-marker (Zhang, 2012). Practically, however, the target language of MCJP-Chinese is Chinese; the main vocabulary and the pronunciation of words are rough replications of Chinese words and sounds, as shown in Chart 2.

Chart 2: The pronunciation of MCJP-Chinese expressions and of the corresponding Standard Chinese expressions (examples from Ando, 1988)

MCJP expression   Pronunciation    Chinese expression   Pronunciation    Meaning
ニーデ            /ni:.de/         你的                 /ni.tə/          you or your
カンホージ        /kan.ho:.ʥi/     干活计               /kan.xuɔ.ʨi/     labour, work
プシン            /pɯ.ɕing/        不行                 /pu.ɕing/        not good
テンホー          /teŋ.ho:/        顶好                 /tiŋ.xao/        excellent
ファイラ          /ɸɯai.ɾa/        坏了                 /xuai.lə/        go bad

The other variant of MCJP, whose target language is Japanese, was used by Chinese native speakers in the Japanese-occupied areas, especially in Manchukuo. Those Chinese people were forced to learn Japanese in school, but the level of language education varied from place to place, and Chinese was still the dominant language in their daily communication with other Chinese people; this situation led to the development of a “new type” of Japanese, which includes code-mixing and some special grammar. It was once regarded as a “low-level” Japanese variant by Japanese native speakers, but recent studies show that it is in fact a pidgin that has been ignored (Zhang, 2012).

One prominent characteristic of MCJP-Japanese is the use of the auxiliary verb aru (Hiragana ある; Katakana アル). In Standard Japanese, aru should only be used after transitive verbs, to indicate the meaning of state modification, but in MCJP-Japanese it replaced other auxiliary verbs and lost its original meaning. Interestingly, not only Chinese people in the Japanese-occupied areas but also Japanese soldiers, whose mother tongue was Japanese, adopted this structure in conversation with Chinese people:
The local Chinese: 二十銭安いあるか?(“Is twenty cents cheap enough?” Standard Japanese: 二十銭安いですか?)
The Japanese soldier: うわ!高いあるな!(“Oh! Too expensive!” Standard Japanese: うわ!高いですな!)
(Example from Zhang, 2012)

As well as this extension of aru, a number of particles in Standard Japanese are dropped in MCJP-Japanese, especially the case-markers like “が” (ga, nominative marker), “を” (wo, accusative marker) and even “の” (no, possessive marker) (Zhang, 2012).

Although we can generally divide MCJP into two different variants, there is no clear boundary between them. In the actual language material, we find that a speaker can switch between the two variants of MCJP and even mix them in one sentence, no matter what his native language is. From the perspective of sociolinguistics, MCJP was a lingua franca used in Manchukuo and other Japanese-occupied areas in China during the Second Sino-Japanese War.

Nowadays, even though MCJP disappeared several decades ago, its influence is still obvious. In some fictional works by Chinese and Japanese authors based on the history of that war, we can find traces of MCJP. For example, the aru of MCJP-Japanese has become a (wrong) stereotype in Japanese comics to show that a Chinese character is speaking Japanese, while the iconic MCJP-Chinese line “你的, 良心, 大大的坏了” has likewise become a stereotype of a Japanese person speaking Chinese.

Although the war was a terrible event, the language that it gave rise to is precious to linguists in China and Japan, because it opens a door to understanding the development of a pidgin language.


References (I am terribly sorry that all of them are in Japanese; there is no reliable English resource about this language, which is a great pity):

Ando, H. (1988). Chinese and Modern Japan (中国語と近代日本). Tokyo: Iwanami Shoten.

Kawashima, S. (2006). War-time system and Japanese language – Japanese research (戰時體制與日本語·日本研究). Proceeding in International Symposium of Transformation of Modern Japanese Society, Taiwan: Academia Sinica.

Sakurai, T. (2012). Manchuria Pidgin Chinese and Kyowa-go (満州ピジン中国語と協和語). Meikai Japanese 17:2. Retrieved from http://www.urayasu.meikai.ac.jp/japanese/meikainihongo/17/sakurai.pdf

Zhang, S. (2012). The language contact in Manchukuo: Realities of language contact shown in new materials (「満洲国」における言語接触-新資料に見られる言語接触の実態).  Retrieved from https://glim-re.glim.gakushuin.ac.jp/bitstream/10959/2750/1/jinbun_10_51_68.pdf

Duality of Patterning: Some musings on one design feature of language

A few weeks ago there was a two-part programme on BBC entitled Talk to the Animals presented by Lucy Cooke. As you might imagine, it was about ‘cracking the animal code’ – finding out what animals are communicating with each other and how they are doing so. It was a great programme and got me thinking again about the differences between humans and other animals in terms of the way we communicate.

Our most stand-out method of communication is, of course, language. And our language is used to communicate just about anything and everything we can think of. Whilst animal communication typically concerns food, danger and mating, human communication goes way beyond these things. The more interesting question for me, however, is not so much what we are communicating, but how we are communicating it. How do we package the information we wish to convey and how do we structure it? How is language designed such that it allows us to do these things in the first place?

This question is huge and, surprise, surprise, unanswered. Therefore, I’m simply going to muse on one of the most significant design features of language that has been identified – duality of patterning.

Every human language has a system by which meaningless sounds are combined to make meaningful units (these can be thought of as words), and every human language has a separate system which combines these meaningful units into phrases and sentences (the same applies to Sign Languages). This means that a language can have a reasonably small number of meaningless elements from which it can generate a very large number of distinct words. Furthermore, this very large number of distinct words can be combined to form an even larger number of distinct sentences (in fact, an infinite number of sentences). The capacity of human language to take discrete elements from one level and combine them to make discrete units at another level is what Charles F. Hockett called duality of patterning (Hockett 1960).

It is an immensely efficient way of doing things. Imagine what language would be like if this were not the case. To be meaningful at all, the elements of language would have to be meaningful in and of themselves. Since there would be no way of combining them, we could only express as many things as we have words for. The shapes of these words would be chaotic as well since there would be no way of combining smaller meaningless elements into words.
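To get a feel for the numbers, here is a minimal Python sketch of the two levels of combination. The six-sound inventory and the length limits are made up purely for illustration, and the count ignores the restrictions real languages place on which sounds may sit next to each other, so it is an upper bound for a toy system rather than a model of any actual language.

```python
from itertools import product

# Hypothetical toy inventory: these six symbols stand in for meaningless
# sound units and are not the phonemes of any real language.
SOUNDS = ["p", "t", "k", "a", "i", "u"]


def count_forms(inventory_size: int, max_length: int) -> int:
    """Count all sequences of 1..max_length units drawn from the inventory."""
    return sum(inventory_size ** n for n in range(1, max_length + 1))


def all_forms(inventory, length):
    """Enumerate every sequence of exactly `length` units."""
    return ["".join(combo) for combo in product(inventory, repeat=length)]


if __name__ == "__main__":
    # First level: meaningless sounds combine into word-sized forms.
    word_forms = count_forms(len(SOUNDS), 4)
    print("possible forms of up to 4 sounds:", word_forms)

    # Second level: word-sized forms combine into longer strings.
    print("possible strings of up to 3 forms:", count_forms(word_forms, 3))

    # A peek at the first few two-sound forms.
    print(all_forms(SOUNDS, 2)[:5])
```

With six sounds and forms of up to four sounds, the first level already yields 1,554 possible forms; feeding those into the second level pushes the count into the billions, even for strings of at most three forms.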

A number of authors have suggested that a system for combining meaningless elements into meaningful words does exist in other animals, e.g. humpback whales and chaffinches (see Hurford 2007), but a system for combining meaningful words into phrases and sentences appears to be much rarer and possibly unique to humans. Why should humans have two combinatorial systems at their disposal? Or could it be that the two systems are fundamentally the same but appear different purely because of the nature of the elements they manipulate? This suggests that studying the similarities and differences between phonology and syntax will shed light on the underpinnings of our combinatorial abilities (see Nevins (2010) who argues that the operation Agree is found in both syntax and phonology). Comparing these with the abilities of other animals may then shed light on the evolution of language itself.


Hockett, C. F. (1960). The Origin of Speech. Scientific American, 203(3), 89–96.

Hurford, J. R. (2007). The Origins of Meaning: Language in the Light of Evolution. Oxford: Oxford University Press.

Nevins, A. (2010). Locality in Vowel Harmony. Cambridge, MA: MIT Press.

Are you a Belieber?

As I sat on the edge of my sofa on Saturday night watching Doctor Who and trying to acclimatise myself to a slightly softened version of Malcolm Tucker as the new identity of everyone’s favourite Time Lord, I wondered whether I could call myself a Whovian. A Whovian, as you may know, is someone who self-identifies as part of the Doctor Who fanbase. It is one of a seemingly endless set of terms that have been created to describe one’s particular fandom affiliation.

Nicknames for groups of fans have been around for a long time. For example, the name Whovian was first used in the 1980s, when fans created a fan club newsletter called the Whovian Times. For years we have heard football fans identifying themselves as part of the Toon Army or as a Gooner (Newcastle United fans and Arsenal fans, respectively). However, there seems to have been a recent explosion of fan nicknames in a host of areas: music (Beliebers, Directioners, Swifties), TV (Sherlockians, Gleeks), books (Ringers, Tributes, Twihards), films (Trekkies) and even celebrities (Cumberbitches, Pine Nuts). I want to consider some issues in this post: Why do we feel the need to create these nicknames? Why do some fan groups have nicknames whilst others do not? What makes a good fandom nickname and how are these created?

Firstly, why are these nicknames coined? I think there are four core reasons:

  1. To identify an ingroup. With the rise of the internet, people are exposed nowadays to media from across the world. Young people growing up with this access to global content may be rejecting typical labels such as nationality, religion or political affiliation in favour of associating themselves with personal interests. It is notable that the suffixes often used to create these fangroup names appear to come from those used for nationality or local identity (such as –[i]an used in American and Argentinian or –er used in Londoner and Westerner). By choosing one’s own label and ingroup, you are aligning yourself with a particular community that shares similar values. The names of these fan groups can act as a shibboleth. Although it may be easy to discern that a Belieber is a Justin Bieber fan, would you necessarily know that Smilers are Miley Cyrus fans? These names can create exclusivity where only those ‘in the know’ can be part of the group.
  2. To identify a community. Early fangroup names seem to stem from films or TV shows that held conventions. The names Warsies (Star Wars fans) and Trekkies/Trekkers (Star Trek fans) formed before the age of the online forum. The fact that people actually met in person and created a community around these brands is what helped create their monikers. Now that there are online forums and blogs for almost anything one can imagine, it has allowed communities to form in the virtual world. From these communities, fan group names have been coined. It is arguable that to be deserving of the fan group nickname, one must engage with these communities either online or in person. I might be a huge fan of Doctor Who, but having never visited any fan sites or attended any conventions, I probably could not consider myself a Whovian. So strong are the communities for some of these groups that there are dating websites entirely based around one’s particular fan community (for example, www.whovianlove.com).
  3. To create layers of fandom. A personal admission: I have three One Direction songs on iTunes and I know some of the band members’ names. You could perhaps say I am a One Direction fan. You would probably not say I am a Directioner. A Directioner is more than just someone who has some One Direction music on their iPod. It is someone who has memorised all the lyrics, knows where Harry Styles was born, has queued up for hours to buy tickets to their shows and so on. Nicknames for fan groups provide the superlative on a scale of commitment. It is possible to imagine someone saying: “She might listen to Justin Bieber, but she’s not a Belieber like me.”
  4. To defy haters. It seems that early fan group nicknames (such as Belieber or Directioner) were a means of unifying fans and standing up against those people who criticised the objects of fans’ affections. Perhaps it is the case that the more divisive the thing in question, the more likely it is to have a fan group nickname.

This brings me on to another question – why do some fan groups have nicknames and some do not? Some brands are enormously popular and yet do not have a fan group nickname. For example, Oprah Winfrey is arguably the most powerful woman in America. She has immense influence, is allegedly worth $2.9 billion and has over 25 million followers on Twitter. However, her many millions of fans do not have a nickname. I think this is due to two of the reasons mentioned above. As nicknames may stem from brands being divisive, there must be a feeling that the brand needs defending. Oprah is not criticised enough for her fans to rally together under one name. Secondly, to create an ingroup, a brand must be in some way exclusive. Oprah is too ubiquitous and popular to really be the source of an ingroup and therefore a fan name. Other huge fanbases that do not have a clear nickname include fans of Game of Thrones (or more generally the book series A Song of Ice and Fire) and fans of Harry Potter (notably some people call this group Pottheads but this started as a derogatory term and does not unite the fanbase). In these cases I suggest the final reason is at play again. With their phenomenal popularity, one cannot affiliate oneself with these brands as an ingroup due to the sheer size of the fanbase. However, one may choose to affiliate with certain characters or groups within the brands. For example, fans may side with the Lannisters or the Starks in the Game of Thrones fandom and Gryffindor or Slytherin in Harry Potter. Indeed, some of these sub-sections do have fan group nicknames. For example, the group of Harry Potter fans who wish that Hermione had chosen Harry instead of Ron call themselves Harmonians!

So, how are these nicknames formed? One way nicknames appear is that the artists select them themselves. This is not a new occurrence: George Harrison called the superfans of The Beatles who gathered outside the Apple Corps building Apple Scruffs. In 2009 Lady Gaga dubbed her fans her Little Monsters (after her album The Fame Monster) and Ke$ha called her fans Animals (after her album Animal). Often, however, the communities develop the nicknames themselves. Sometimes they have a selection of names that they ask the celebrity to pick from (for example, Ed Sheeran picked Sheerios from some fan-suggested possibilities). Sometimes the fan group names are selected and the artist in question does not necessarily approve of the choice (for example, Benedict Cumberbatch would prefer his fans called themselves Cumberbabes or the Cumber Collective, but they have dubbed themselves his Cumberbitches). When fans do select nicknames, it may be that a number abound for a while until one wins out (as with Ringers, which won out over a number of other possibilities for Lord of the Rings fans, such as LOTRians). In the case of fans of the Hunger Games series, they rather democratically had an online vote to choose their fan name.

So then, how does one create a good fan group nickname? The easiest way is to take the name of the object of your affection and add a suffix. The most popular appear to be –ers, –ies and –ians. Notably fanbases seem to steer away from the suffix –phile (the suffix that means ‘to have a fondness for’), perhaps due to unfavourable connotations from the use of this suffix in unsavoury words such as paedophile and necrophile. A second option is to create amusing portmanteaus, such as Gleeks (from Glee + geek), Twihards (from Twilight + die-hard), Bey Hive (from Beyoncé + bee hive), Fanilow (from fan + Barry Manilow) and, finally, for men who like a retro kids’ TV show, Brony (from brother + My Little Pony). As mentioned earlier, the fan group nickname can act as a shibboleth and therefore some groups may choose something that is slightly more obscure so that only ‘real fans’ will understand the meaning. For example, the Hunger Games’ fans chose Tributes as their nickname (a term used in the books to describe a certain heroic group to which most of the main characters belong). Similarly, Miley Cyrus fans are called Smilers, originating from the fact that Miley was nicknamed Smiley when she was a child. Bruce Springsteen fans call themselves Bruce Tramps due to one of his song titles and Katy Perry fans named themselves KatyCats due to their idol’s love of cats.
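For the programmatically inclined, the two recipes above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the function names add_suffix and blend are my own inventions, the suffix recipe is plain concatenation, and the blend only handles portmanteaus where the two words share some letters (as in Gleek or Fanilow). Clipped blends such as Twihards or Brony would need a different rule.

```python
def add_suffix(name: str, suffix: str) -> str:
    """Suffix recipe: attach a suffix such as 'ers', 'ies' or 'ians' to a name."""
    return name + suffix


def blend(first: str, second: str) -> str:
    """Overlap recipe: join two words at the longest ending of `first`
    that also occurs somewhere in `second` (hypothetical helper)."""
    lower_first, lower_second = first.lower(), second.lower()
    for size in range(len(first), 0, -1):
        ending = lower_first[-size:]
        position = lower_second.find(ending)
        if position != -1:
            # Keep all of `first`, then whatever follows the shared letters.
            return first + second[position + size:]
    # No shared letters at all: fall back to plain concatenation.
    return first + second


if __name__ == "__main__":
    print(add_suffix("Direction", "ers"))   # suffix recipe
    print(add_suffix("Sherlock", "ians"))
    print(blend("Glee", "geek"))            # overlap on "ee"
    print(blend("Fan", "Manilow"))          # overlap on "an"
```

Running it prints Directioners, Sherlockians, Gleek and Fanilow. Anything more inventive (Belieber, Bey Hive) is wordplay that a fan will always do better than a function.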

Due to inherent narcissism I could not help but consider what my fans would call themselves if I ever gained celebrity. I think that Rowena would only have to drop its first syllable to become a passable fan group name. Therefore, I can only hope that I maintain obscurity to save any group from ever having to declare themselves Weeners.

Québec, Language, and Identity

At the end of last month, I attended the 36th Annual Meeting of the Cognitive Science Society in my hometown of Québec City, Canada. I was working as a student volunteer and found that several of my colleagues, from all corners of the world, were surprised that many residents of Québec City spoke little or no English. This has got me thinking about Québec’s linguistic situation and the fact that it remains strikingly misunderstood outside the province. What follows is my attempt to draw a brief portrait of Québec’s linguistic state of play. Let me begin by saying that I am perfectly aware that these remarks are tinted by my own personal experience growing up in Québec and would require a more careful examination than the one I can provide here. Nevertheless, I think they are important for discussing the Québécois case.

Until the 1960s, the francophone majority in Québec was relatively poorly educated and worked mainly on the production lines, whilst the anglophone minority was more likely to occupy professional and managerial roles. English was dominant in Québec both economically and socially, and Québecers found themselves with little leverage. This situation is denounced in Michèle Lalonde’s now famous 1968 poem Speak White. In 1960, La Révolution Tranquille (The Quiet Revolution) laid the groundwork for the social and economic emancipation of Québecers. This pivotal period of change in Québec’s history was marked by the secularisation of the state and the development of state institutions (e.g. public healthcare and education). These social advancements provided fertile ground for the rise of a new French-speaking middle class that could now aspire to hold higher positions in society. As this new class rose, establishing French as Québec’s official language became a priority. In 1977, La Charte de la langue française (Loi 101) (The Charter of the French Language (Bill 101)) defined French as the official language of the province: “the French language, the distinctive language of a people that is in the majority French-speaking, is the instrument by which that people has articulated its identity” (Preamble, Charter of the French Language). To this day, Bill 101 remains central to linguistic legislation in Québec, from education policy to signposting.

It is clear that Québecers’ attitude towards the question of language is a complicated one, perhaps unsurprisingly given the obsequiousness they were long expected to adopt in such matters. Discussions of language often seem to awaken delicate sensibilities relating to identity, culture, and politics. But what is the other side of the story? What is the current linguistic situation in Québec? A report published last May by Québec’s University of Public Administration raises important issues regarding the teaching of English as a second language (ESL) in the province. The report concludes that despite efforts from Québec’s government to improve ESL teaching, much remains to be done to increase the number of teaching hours pupils receive and to ensure that improvements extend to all regions of the province, not just the city centres. Evidence cited by the authors of the report suggests that 1200 hours of study are required to reach a basic level of proficiency in a second language. However, the report states that pupils in Québec currently receive on average only 800 hours of ESL teaching during primary and secondary school education, a third less than that threshold. The report highlights the crucial importance of bilingualism as a means for innovation and economic prosperity in the province as well as for the competitiveness of its workforce in a global economy.

There was a time when the francophone majority in Québec seemed terminally threatened, and much progress has been made to endow the province with the fundamental linguistic legislation and rights it needs to firmly establish itself as a French-speaking province within an English-speaking country; a nation within a nation, if you will. It is not necessary to go very far back in Québec’s history to understand why we have, as a nation, fiercely defended our language to combat oppression. However, I believe there are real risks attached to equating language with identity, not least the risk of depriving new generations of an adequate level of proficiency in English for fear of acculturation. The danger of preventing new generations from becoming more competitive and active in an ever-expanding world is lurking. If there is anything that imperils Québec, it is the risk of wasting another generation on sterile debates. Whilst Québecers need to learn about and protect their linguistic heritage, they need not fear English as the threat it once was but can embrace it as the instrument that could give them the upper hand.


Éditeur officiel du Québec (August 2014). Charter of the French Language. Retrieved from http://www2.publicationsduquebec.gouv.qc.ca/dynamicSearch/telecharge.php?type=2&file=/C_11/C11_A.html

Centre de recherche et d’expertise en evaluation (CREXE) (May 2014). Recherche évaluative sur l’intervention gouvernementale en matière d’enseignement de l’anglais, langue seconde, au Québec. Retrieved from http://crexe.enap.ca/cerberus/files/nouvelles/documents/AnglaisIntensif_ENAP_Rapport3.pdf

Lalonde, Michèle (1974). Speak White. Montréal: L’Hexagone.



Prepositions and national identity

Last week I attended the 5th Sociolinguistics Summer School in Dublin, Ireland. Being, as it were, directed primarily at early-career researchers, the talks offered a good overview of what young sociolinguists (that’s linguists interested in the relationships between social and linguistic variation) are up to these days. There was a pretty impressive number of papers and posters on new media – Twitter seems to be a fairly fashionable research topic among our lot – and, being in Ireland, the summer school had attracted quite a few talks on minority languages and dialects such as Irish Gaelic, Catalan and, yes, you guessed it, Australian Aboriginal English. Among the many interesting talks, one in particular has had me thinking this week: the final one, presented by Anne Marie Devlin of University College Cork.


Her talk was entitled “Prepositions on the battlefront: ‘В’ and ‘На’ as indices of socio-political identity in the current conflict between Ukraine and Russia” and, as the title indicates, it focused on the socio-political role of language in Ukraine. According to Anne Marie, the current socio-political conflict is now shaping the ways language is being used in Ukraine, most notably resulting in Russian being given preference in different social spheres, including the sociolinguistic landscape: Ukrainian-language signs are being removed and replaced with signs in Russian. More subtly, though, her talk demonstrated that small cues like the use of prepositions can be just as powerful a tool for signalling socio-political opinion. Speakers of Russian have access to two different prepositions collocating with the word “Ukraine”: “v”, which roughly corresponds to the word “in” in English, and “na”, which means something akin to English “on”. The “in” preposition is used to refer to nation states, whereas the “on” preposition is used with counties or islands, that is, parts of a larger nation state. In this way, through the consistent use of one of these prepositions, a Russian speaker can signal her attitude to Ukraine’s national status. And indeed, after combing through a number of newspapers, letters and online forums, Anne Marie concluded that preposition use in both Ukrainian and Russian media strongly correlates with the political opinions expressed. Writers in favour of an independent Ukraine would almost exclusively use the “in” preposition, and vice versa.

Nuuk (Anna)

This got me thinking. As a Dane, I’ve noticed, but never really given much thought to, a similar sociolinguistic situation at home, which hinges on the political relationship between Denmark and Greenland. For you non-Danes out there, let me explain. Because of its colonial history, Greenland is an autonomous country within the Danish realm. This is a strange in-between state of affairs – it has home rule, but it’s still economically, and to some degree politically, dependent on Denmark. Now, Danes have a similar set of prepositions to the Russo-Ukrainian ones. So the question is: how do we refer to Greenland? My own intuition is to use the “on” preposition, but the “in” variant doesn’t sound too bad either. On the other hand, my stepfather, who is very close friends with a Greenlandic couple, consistently uses the “in” variant. I also recall having a discussion with an Icelandic colleague on a similar matter. Iceland received its independence from Denmark in 1918 and became a republic in 1944. However, lots of Danes still use the “on” preposition when referring to the country, to the irritation of (it would seem) a number of linguistically savvy Icelanders. Protesting this trend, my colleague argued that he associated the Danish “på Island” (“on Iceland”) with derogatory views of the country. In other words, preposition use seems to be able to trigger similar sociolinguistic effects in Danish, even within the currently calm Scandinavian political climate. And just as interestingly, the status of these prepositions as sociolinguistic markers seems to have completely escaped the attention of Danes. This is why I like early-career conference presentations – they can be real eye-openers!

Denmark map

To end these musings, let me bring them closer to home. As a non-native speaker of English, I’m not as sensitive to linguistic differences in this language as I am in Danish. So, English speakers, this is where I ask for your opinions. Are prepositions used in similar ways in English? Do you use “in” or “on” with the Solomon Islands? Jamaica? The Channel Islands? The Hebrides?

I don’t know about you, but I’ll be paying closer attention to preposition use in media coverage from now on.


On the (in)completeness of language and thought

The relationship between language and thought is a fascinating field for investigation because, however effortlessly we seem to think and speak, a closer look reveals that the interaction between the two is not as simple as we may take it to be. In a previous post I tried to show that, despite the fact that language and thought are tightly intertwined, they do not overlap; some evidence for this comes from considering ways in which we conceive and communicate thoughts without using words, just as dance partners communicate their next move using body signals alone. In this post I talk about the relationship between language and thought from the point of view of my own research topic: how are we able to convey complete thoughts using sentences that are incomplete from a syntactic point of view? Since I don’t have a full answer to this problem (yet!), I will talk more generally about the mismatch between language and thought as far as the completeness or incompleteness of each is concerned; a mismatch which does not get in the way of efficient communication.

Roughly speaking, when we talk about syntactically complete sentences, we mean the ones that involve at least one predicate/verb, e.g. ‘I’m sitting in the sun’, ‘John is British’, ‘Anna was late last night’, etc. Such simple complete sentences have traditionally been the primary unit of analysis for linguistic theory. However, the main difference between such sentences and the ones we use in actual conversations is that the latter never occur in isolation. They normally occur in sequences of sentences, and are placed in a certain context, consisting of a time, place, topic of conversation, a person we are addressing, etc. (for more about the notion of context see Finkbeiner et al. 2012). This way, each sentence that occurs in a conversation can build on previous ones, as well as on information that is already given in the context. Thus, as speakers, we do not have to make explicit every single aspect of the meaning we want to communicate because we can trust that some information is already present in the context (and, as such, known to our interlocutors).

There are many ways in which the interaction of utterances with context (linguistic and extra-linguistic) can save us from having to be explicit about every single aspect of meaning we want to convey. For example, if I have already been talking to a friend about my housemate, I can then afford to say ‘She is going to Paris tomorrow’, without having to explicitly identify the woman that ‘she’ refers to. Similarly, if I utter ‘I’m ready’, it must be clear from the context what it is I am ready for, otherwise my sentence would not be meaningful (see Bach 1994, 2001). In general, each sentence we use is, roughly speaking, supposed to express a thought that we want to share with our interlocutors. But due to the interaction of sentences with context it is possible for communication to be achieved, even if there’s no one-to-one correspondence between the sentences we use and the units of thought we want to communicate.

Let’s look at some examples of complete and incomplete language and thought. To do this, we’ll need to use a unit of measurement for each. For language this unit will be the sentence; for thought it will be the proposition (‘proposition’ is, roughly speaking, a term used by philosophers of language to talk about ‘units’ of thought. You can read more about it here). In our everyday conversations, we can see all possible combinations of completeness/incompleteness of units of language and units of thought: we use complete sentences to convey complete thoughts, incomplete sentences to convey complete thoughts, incomplete sentences to convey incomplete thoughts, and so on. These combinations are shown in the table below, which evaluates the utterance in the 3rd column with regard to the [+/- complete] feature. By ‘language’ I refer to the sentence explicitly pronounced, and by ‘thought’ I refer to the message conveyed by that sentence alone.

     Language      Thought       Utterance

  1. [+complete]   [+complete]   ‘Germany won the 2014 World Cup’
  2. [+complete]   [-complete]   ‘Everybody went to the beach yesterday’
  3. [-complete]   [+complete]   [context: doorbell rings] ‘Probably the pizza guy’
  4. [-complete]   [-complete]   ‘The essay was on a complicated topic, but I found it interesting so…’


Cases 1 and 2 are linguistically complete because they involve at least one verb each (won, went). Case 1 also conveys a complete and determinate thought. Case 2, however, does not convey a complete thought because some additional piece of information is required to make ‘everybody’ meaningful, i.e., ‘everybody’ needs to be restricted to the specific group of people the speaker is talking about, because the sentence cannot really mean that everybody in the world went to the beach yesterday. This additional piece of information, e.g. ‘Everybody [from our group of friends/in my family, etc.]’, need not be explicitly uttered because it is normally recoverable when the utterance is placed in context. But it is, strictly speaking, not contained in the thought conveyed by the sentence alone, hence the [-complete] feature in the ‘thought’ column.

Moving on to case 3: although it is syntactically incomplete, given that it does not contain any verbs, we can say that it conveys a complete and determinate thought, because it can only mean something along the lines of ‘[The person at the door is] probably the pizza guy’. [The person at the door is] is recoverable on the basis of the contextual information that the doorbell is ringing, the world-knowledge that pizzas are often delivered at doors, etc.

Case 4 is also linguistically incomplete, despite containing two verbs, because it is explicitly left open-ended (in English, sentences are not meant to end in a connective such as ‘so’). However, it is less clear whether it conveys a complete thought or not. At first glance, it seems that case 4 does not convey a complete proposition in the way that case 1 or case 3 does, because it would not be as straightforward to add the completion in brackets. At the same time, deciding that case 4 does not convey a complete proposition at all might not be fair either, because there is an intuitive sense in which 4 is meaningful (and, if something has meaning, then it arguably conveys certain thoughts/propositions). Thus, an intermediate solution would be to say that 4 conveys the complete proposition ‘The essay was on a complicated topic, but I found it interesting’, i.e. the thought conveyed by the part that comes before the open-endedness, and that, in addition, the open-ended part conveys a much vaguer (and arguably incomplete) aspect of meaning along the lines of ‘You can easily infer from what I said that there were pros and cons with regard to the essay topic’, ‘The fact that I found it interesting made the complicated topic easier for me’, or even ‘I leave the conclusion of what I said up to you, because I’m ambivalent with regard to the essay topic’, etc. In a way, open-endedness expresses not a proposition but an attitude towards a proposition (for more on this see here).

Abstracting away from the details, however, what we are left with is four cases which, if used in an appropriate context, would be perfectly interpretable by any average speaker of English; moreover, no average speaker of English would judge them as ungrammatical or infelicitous. The fact that these sentences will eventually lead to successful communication basically means that each of them will ultimately convey a complete thought, regardless of how complete or not the components of language or thought involved were initially. This is possible because language is in constant interaction with its context of use which is responsible for completing the incompleteness of either language or thought, and which allows for sentences and propositions to be meaningful, even if incomplete. Given how complex the interaction between language and thought is, isn’t it fascinating how effortlessly we perform the complicated task of communication?


  • Bach, K. 1994. ‘Semantic slack: What is said and more’. In: S. L. Tsochatzidis (ed.). Foundations of speech act theory. Philosophical and linguistic perspectives. London and New York: Routledge. 267-291.
  • Bach, K. 2001. ‘You don’t say?’ Synthese 128. 15-44.
  • Finkbeiner, Rita, Jörg Meibauer and Petra B. Schumacher. 2012. What is context? Linguistic approaches and challenges. Amsterdam: John Benjamins.

Eleni Savva

The myth of the myth of language complexity

It’s common enough for people to think about languages in terms of relative complexity. I often hear people claim that a language—not infrequently their own language, or a language which they are learning—is particularly complex and difficult to learn due to its large vocabulary, morphological irregularities, or tricky pronunciation. It does seem intuitively obvious that some languages must just be more complex than others. Yet one of the first propositions that many undergrads are exposed to when they begin to study linguistics is that this is actually a myth.

A key tenet of formal linguistics and sociolinguistics for much of the 20th century was that of equicomplexity. This is the idea that all languages are equally effective and powerful means of communication, and, by somewhat shaky extension, that all languages are equally complex. Equicomplexity arose not really from any data-driven research, but from ideological discussions around prescriptivism and descriptivism. You’ll remember from an earlier post on this blog (http://www.icge.co.uk/languagesciencesblog/?p=25) that prescriptivism is the belief that there is a ‘correct’ way to speak, and that to speak in other ways is somehow deficient, while descriptivism is an attitude of open interest towards the ways in which language is used without attaching any value judgements to them. Linguistics, particularly sociolinguistics, holds descriptivism as a core component of its approach, yet throughout much of history prescriptivism has been the mainstream viewpoint.

The battle against prescriptivism, in many ways still largely unsuccessful, has perhaps necessitated holding simple, powerful ideological positions. Faced with educators who believe that the varieties spoken by their non-white or working class pupils are intrinsically inferior to the standard (calling them ‘illogical’, ‘crude’, ‘rough’, ‘ugly’ or just ‘incorrect’), there seems to be little space to have a sophisticated conversation about the nature of complexity and expressive power. Such views are clearly proxies for racism and classism and serve to perpetuate the grievous structural inequalities that typify western societies. They are best battled with clear maxims, cleanly expressed: All languages are equally powerful tools of communication. All languages are equally deserving of respect. There is no such thing as a simple language.

So, it’s obvious that equicomplexity took its place in the canon of linguistic assumptions for good reason. However, in recent years and not without controversy, scholars have begun to unpick it. Few linguists would argue with the fundamental ideological position underlying the statement that ‘all [natively learned] languages are equally powerful means of communication’, but many have begun to question the leap to the idea that all languages must therefore be equally complex.

It’s clear that in any particular area of grammar, languages can be more or less complex. So, English, with two distinct surface forms of each regular noun, is obviously simpler in this respect than Finnish, with perhaps 26. Mandarin, which distinguishes between 19 and 26 different consonants (depending on how you count them), is clearly more complicated in this respect than New Zealand Māori with 10 consonants but less complicated than Adyghe, with over 50. Given this, to maintain that all languages are equally complex overall, one must assume that when one area of grammar gets more complicated, others get simpler to compensate. This has been the implicit assumption underlying equicomplexity for several decades.

The problem is that this turns out not to be true. If it were, then whatever our measure of complexity is (and that’s a whole nother blog post), we should find that in a big sample of languages there is a negative correlation between complexity in one area of grammar and complexity in another. Yet in reality, studies like Maddieson (2006; 2007) and Shosted (2006) show, if anything, a weak positive correlation between complexity in different areas of grammar: languages with more complicated phonology are more, not less, likely to have complicated morphology.
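To make the logic of that test concrete, here is a minimal sketch in Python of what the compensation hypothesis predicts and what a weak positive correlation looks like. The complexity scores below are entirely invented for illustration and are not taken from Maddieson or Shosted, and Pearson's r is used only as one plausible measure of correlation, not necessarily the statistic those studies rely on.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical complexity scores for five imaginary languages (invented data).
phonology = [1, 2, 3, 4, 5]

# What equicomplexity-by-compensation predicts: more complex phonology
# goes hand in hand with simpler morphology.
morphology_if_compensating = [5, 4, 3, 2, 1]

# A pattern with no trade-off: if anything, the two rise together slightly.
morphology_no_tradeoff = [2, 4, 3, 1, 5]

print(pearson(phonology, morphology_if_compensating))  # strongly negative
print(pearson(phonology, morphology_no_tradeoff))      # weakly positive
```

If compensation were real, a large sample of languages should look like the first comparison; the second comparison is the shape of result that undermines it.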

So where does that leave equicomplexity? Well, if we accept these findings then we pretty much have to abandon the idea that all languages are equally complex. It was never backed up by evidence in the first place, and these findings seem to represent some pretty conclusive counter-evidence. It doesn’t, of course, mean that we should abandon the claims that all natively-learned languages are equally powerful means of communication and that all languages are equally deserving of respect. These remain important ideological positions. However, if we can reject canonical equicomplexity, lots of exciting new avenues of research open up to us: Why are some languages more complex than others? How much of language complexity is built into the innate language faculty, and how much is cultural elaboration? What social conditions cause languages to become simpler and what cause them to become more complex? It’s in this latter area that my own research is focused.

A pertinent addendum to all of this has to do with the nature and experience of complexity. When, as I mentioned at the beginning, I hear people talking about how complicated different languages are, they’re almost always interested in the point of view of adult learners. They’re interested in whether they will have to put in more or less effort to learn another language, and in how much effort non-native speakers of their own language have had to make.

The reality is that this ‘ease of learning’ is only partially related to ‘complexity’ in the abstract. The biggest factor which will make another language easy or difficult to learn is not complexity but how closely related it is to your own native language(s) and any other languages you speak. Native speakers of English will find Norwegian or French comparatively easy to learn, as (for different reasons) each shares a great deal of vocabulary and many structural similarities with English; native speakers of Cantonese may not. Native speakers of languages which do not distinguish tones (e.g. most, though not all, European languages) may find particular difficulty in learning languages which do (most languages of sub-Saharan Africa, the Chinese languages and related languages, as well as many others).

Having taken this into account, then, yes, morphological and phonological complexity will tend to make for a harder learning process. There is simply a lot more verbal morphology to memorise for a student of Spanish than for a student of Mandarin, and this will take time. Similarly a learner of Hawai’ian won’t have to spend very much energy at all on learning the different consonants they need to be able to pronounce compared with a learner of Halkomelem or another Salishan language, and a student of Danish must learn to distinguish far more vowel qualities than a student of Standard Arabic.

At the end, we have a rather mixed picture. Clearly, in descriptive, neutral terms, some languages are much more complex than others. From a practical point of view for most users of language, though, this has little real relevance. Their experience of language complexity will mostly come down to their own language backgrounds—and even where it doesn’t, it will always be possible to identify particularly complex structures and features of some sort in any language.

Maddieson, Ian. 2006. Correlating phonological complexity: Data and validation. Linguistic Typology 10. 106–123. doi:10.1515/LINGTY.2006.004.
Maddieson, Ian. 2007. Issues of phonological complexity: Statistical analysis of the relationship between syllable structures, segment inventories and tone contrasts. In M.-J. Solé, P. Beddor & M. Ohala (eds.), Experimental Approaches to Phonology, 93–103. Oxford: Oxford University Press.
Shosted, Ryan K. 2006. Correlating complexity: A typological approach. Linguistic Typology 10. 1–40. doi:10.1515/LINGTY.2006.001.

LanguageS in China

Attention: This article contains Chinese text. Without proper rendering support, you may see question marks, boxes, or other symbols instead of Chinese characters.

It all began with a question while I was in a cab from the Cambridge railway station to my college. The driver, after asking where I come from and what my field of study is, asked me a quite simple yet difficult question that kept me busy for the rest of my trip: “so, how many languages are there in China?”

Most people I have met, even Chinese people themselves, do not have a clear idea about the linguistic situation and diversity in China. After all, there is a language named after the country, the so-called “Chinese language”, which is also the lingua franca in China. This description, however, is far from accurate with regards to the real situation of languages spoken in China – China is not a monolingual country, although it is monolingual in some areas. The definition of Chinese language is more complicated than you can imagine, even though everyone knows that the national language of China is called “Standard Chinese”.

In this post, I focus on several myths about the languages in China, and show that neither “Chinese language” nor “languages in China” is a simple concept.

How many languages are there in China?
In total, 298 languages are currently spoken natively in China; some are national and regional lingua francas with millions or even hundreds of millions of speakers, while others are used by only a few thousand people in small counties (Lewis, Simons and Fennig, 2014). This number does not include languages spoken by immigrants, such as English, Arabic or Yoruba; however, it does include some languages spoken by ethnic minorities in China which are official languages of other countries, such as Russian, Uzbek and Korean. (There are ethnic minorities of Russian, Uzbek and Korean origins in China whose native languages are recognised among the languages of China.)

Do all the languages in China use Chinese characters?
This is definitely not the case; or, to be more precise, the Chinese language is the only language in China that uses Chinese characters nowadays. Most of the commonly used languages in China have their own written forms, like Tibetan, Mongolian and Uyghur (which uses an Arabic-based script); some languages, like Zhuang, once used Chinese characters for documentation, but these have gradually been replaced by Latin characters.

Is there an official language of China?
China does not have a confirmed “official language” – I have double-checked the Constitution, and there is not a single article addressing the official language of the country. However, China does have a standard language: according to Article 2 of the Law of the People’s Republic of China on the Standard Spoken and Written Chinese Language (2000), the spoken form of the standard Chinese language is Putonghua and the written form is Standardised Chinese Characters.

In actual use, however, the language policy is more flexible; especially in the areas where ethnic minorities reside, languages other than Standard Chinese are used in both informal and institutional contexts. A good example comes from the Renminbi, the currency of China: if we carefully examine a bank note, we will find that it is more similar to a Swiss franc note than to a pound sterling note – it is multilingual. A number of languages appear on the note: Chinese (in the form of pinyin), Mongolian, Tibetan, Uyghur and Zhuang. Apart from Chinese, the other four languages are important minority languages in China, and some of them have obtained institutional status in the provinces where they are mostly spoken; for instance, Tibetan is an official language in Tibet, parts of Qinghai and some areas in Gansu.


So what is “Chinese language”?
The term “Chinese language”, or Hanyu (汉语), is a loosely defined concept. In linguistics, the name refers to a group of linguistic varieties that come from one single ancient origin; the vocabulary and sentence structure of these varieties are generally the same. These linguistic varieties can be classified into seven large subgroups: Mandarin, Wu, Yue (Cantonese), Min, Gan, Xiang and Kejia (Hakka). Here is a family tree of the Chinese languages proposed by You (2000), showing the history and development of these different subgroups.


Due to geographical factors, some varieties of the Chinese language have been isolated from others, and this isolation has led to changes in the way these varieties sound; for example, a native speaker of Shaoxing Chinese may find episodes of TV series in Wenzhou Chinese difficult to follow if she watches them without subtitles, although the distance between the two cities is only a bit more than 300 km (which is a rather short distance by Chinese standards). This phenomenon is quite common in Southern China, and is called “different pronunciations within five kilometers”.

In traditional linguistic research on Chinese language, these subgroups are labelled “dialects of Chinese language”. I prefer to avoid the term “dialect” because it is not the case that all these linguistic varieties are mutually intelligible, which is the criterion that some Western sociolinguists might use to define “dialects” of the same language.

So you mean we can’t contrast “Chinese” with “Cantonese”?
Yes, this is indeed the case. Cantonese is a member of the Chinese language group, so it is a branch of the Chinese language; it does not make sense to say “I can speak Chinese and Cantonese” – to Chinese people this sounds equivalent to “I can speak English and London English”. However, we can still contrast “Mandarin” and “Cantonese”, or “Standard Chinese” and “Cantonese”, because these terms refer to different varieties of the Chinese language.

But what is Mandarin Chinese? Is there any difference between Mandarin and Putonghua?
Mandarin is a subgroup of the Chinese language that is widely spoken in Northern and South-western China; in Chinese, we call it Guanhua (官话), which means “the (Chinese) language spoken by officials”. Varieties of Mandarin do not have a unified pronunciation, but usually native speakers of different varieties of Mandarin can roughly understand each other.

The spoken form of contemporary standard Chinese is Putonghua, whose phonological system is based on Northern Mandarin, and, more specifically, on the varieties spoken in and around Beijing. A simple way to describe the relationship between Mandarin and Putonghua is that Putonghua is a member of the Mandarin group of languages, while Mandarin is a member of the group of Chinese languages. Nowadays, Putonghua is the most representative form of the Chinese language, and when we talk about “learning to speak Chinese”, we always refer to Putonghua.
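For readers who think in data structures, the nesting described above can be sketched as a small tree in Python. This is only an illustration: the `PARENT` table lists a handful of the varieties mentioned in this post, the labels are my own shorthand, and the rule in `can_contrast` (two names can be contrasted only if neither contains the other) is my reading of the argument, not an established linguistic test.

```python
# Each variety points to the larger group it belongs to (illustrative subset only).
PARENT = {
    "Mandarin": "Chinese",
    "Yue (Cantonese)": "Chinese",
    "Wu": "Chinese",
    "Putonghua": "Mandarin",
}


def ancestors(variety):
    """Return the chain of larger groups that contain the given variety."""
    chain = []
    while variety in PARENT:
        variety = PARENT[variety]
        chain.append(variety)
    return chain


def can_contrast(a, b):
    """Two names can be contrasted only if neither one contains the other."""
    return a != b and a not in ancestors(b) and b not in ancestors(a)


print(ancestors("Putonghua"))                         # ['Mandarin', 'Chinese']
print(can_contrast("Chinese", "Yue (Cantonese)"))     # False
print(can_contrast("Mandarin", "Yue (Cantonese)"))    # True
print(can_contrast("Putonghua", "Yue (Cantonese)"))   # True
```

Under this sketch, “Chinese and Cantonese” fails for the same reason “English and London English” does: one name sits above the other in the tree.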

This was only a sample of the questions that I have been asked to answer over the years, being both a linguistics student and Chinese. I could go on about the languages in China for hours, but I’m afraid I should stop here due to space and time limitations. If you are interested in learning more about the development and categorisation of varieties of the Chinese language, I sincerely recommend Jerry Norman’s Chinese – it is a wonderful introduction to this ancient and beautiful language which will be interesting even for speakers of ‘Chinese languages’ themselves.


Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig (eds.). (2014). Ethnologue: Languages of the World, Seventeenth edition. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com.

Norman, J. (1988). Chinese. Cambridge: Cambridge University Press.

The Law of the People’s Republic of China on the Standard Spoken and Written Chinese Language. 2000. The People’s Republic of China. 

You, R. (2000). Chinese Dialectology. Shanghai: Shanghai Education Publishing.


Reconstruction in relative clauses

Me again, with more stuff about relative clauses! In my defence, I have been working on reconstruction in relative clauses quite a bit recently, so this represents one way of desaturating my brain. That is not to imply that it is a tedious topic – far from it. Reconstruction effects in relative clauses give us a fascinating clue about how these constructions are built and how our interpretive faculties ‘read’ such structures. I have tried to avoid technicalities and jargon as much as possible, and to keep this blog entry a reasonable length whilst also getting to the core of some very deep questions in current syntactic theory. So, let’s get started.

We’ll start by considering the following data. If two elements have the same subscript, they refer to the same individual; if the subscripts are different, they refer to different individuals. An asterisk (*) means that the sentence is ungrammatical.

(1)  a.  Sam_x likes the picture of himself_x.
     b.  *Sam_x likes the picture of him_x.
     c.  Sam_x thinks that Rosie likes the picture of him_x.

In (1a), himself must refer to Sam. In (1b), him cannot refer to Sam and must instead refer to some other singular male individual. (Some speakers find (1b) acceptable (Reinhart & Reuland, 1993), but I and most other people I have asked do not.) (1c) is ambiguous: him can either refer to Sam (as shown by the subscripts) or to some other singular male individual. The pattern in (1) is traditionally captured by the Binding Conditions, more precisely Conditions A and B (Chomsky, 1981). The Binding Conditions are quite technical, so I won’t go into them here. What is important is the pattern in (1).
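For readers who like to see a pattern spelled out mechanically, here is a minimal Python sketch of (1). It is only a toy: it assumes that all that matters is whether the antecedent is the subject of the same clause as himself or him, which is a drastic simplification of Conditions A and B, and the function name and arguments are made up for illustration.

```python
# Toy model of the pattern in (1). The "same clause" simplification and all
# names used here are illustrative assumptions, not the Binding Conditions.

def can_corefer(form, same_clause):
    """Return True if `form` may corefer with a candidate antecedent.

    form        -- "reflexive" (himself) or "pronoun" (him)
    same_clause -- True if the antecedent is the subject of the smallest
                   clause containing the form, False if it is higher up
    """
    if form == "reflexive":
        return same_clause        # roughly Condition A: needs a local antecedent
    if form == "pronoun":
        return not same_clause    # roughly Condition B: no local antecedent
    raise ValueError(f"unknown form: {form}")


print(can_corefer("reflexive", same_clause=True))   # (1a): himself and Sam
print(can_corefer("pronoun", same_clause=True))     # (1b): him and Sam
print(can_corefer("pronoun", same_clause=False))    # (1c): him and Sam
```

The three calls correspond to (1a), (1b) and (1c): only the middle one comes out as impossible, matching the asterisk on (1b).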

What happens if we relativise picture of X, i.e. modify picture of X with a relative clause?

(2)  a.  The picture of himself_x that Sam_x likes is quite flattering.
     b.  ?/*The picture of him_x that Sam_x likes is quite flattering.
     c.  The picture of him_x that Sam_x thinks that Rosie likes is quite flattering.

As we can see, the pattern in (2) is essentially the same as in (1). This suggests that we are interpreting the head of the relative clause, i.e. picture of himself, in the object position of like, since then (2) can be interpreted in the same way as (1). This in turn suggests that the head of the relative clause originated inside the relative clause and was moved to the position in which it is pronounced. However, when it comes to interpreting (rather than pronouncing) the structure, we ‘reconstruct’ the movement and interpret the head of the relative clause in its original position (see Bianchi, 1999; Kayne, 1994; Schachter, 1973; Vergnaud, 1974). For example, (2a) is interpreted as (3), where the copy in angle brackets is the one being interpreted. Note that this copy is not pronounced.

(3)  The picture of himself_x that Sam_x likes <(the) picture of himself_x> is quite flattering.

The the inside the angle brackets is in parentheses because, technically, the determiner the does not reconstruct with the head of the relative clause picture of himself (see Bianchi, 2000; Cinque, 2013; Kayne, 1994; and Williamson, 1987 on the so-called indefiniteness effect on the copy internal to the relative clause). Reconstruction thus captures the similarities between (1) and (2) in a straightforward way.

In (2), the head of the relative clause served as the subject of the main clause. What happens when it serves as the direct object of the main clause?

(4)  a.  *Mrs. Cotton_y hates the picture of himself_x that Sam_x likes.
     b.  ?/*Mrs. Cotton_y hates the picture of him_x that Sam_x likes.
     c.  Mrs. Cotton_y hates the picture of him_x that Sam_x thinks that Rosie likes.

If the head of the relative clause is picture of him, the pattern is the same as in (1) and (2), which suggests that reconstruction has taken place. However, (4a) is ungrammatical for all the speakers that I have asked (a significant result, given what is usually said in the literature). This is unexpected, especially if reconstruction is available in (4b) and (4c). If reconstruction were available, picture of himself should be able to reconstruct to the direct object position of likes inside the relative clause, where himself could co-refer with Sam, just like in (3). However, the only interpretation available in (4a) is the ungrammatical one where himself is trying to co-refer with Mrs. Cotton, suggesting that reconstruction is impossible.

The difference between (4a) and (2a) lies in whether there is an element in the main clause that himself could get its reference from. In (2a), there is no such element, so picture of himself is forced to reconstruct so that himself gets a reference. In (4a), there is an element, albeit an unsuitable one. This suggests that the Binding Condition which allows himself to get its reference from another element applies blindly and automatically: himself gets bound to Mrs. Cotton straight away, which prevents reconstruction from occurring. Later on, when it is time to interpret the binding relation, we discover that we were wrong to have bound himself to Mrs. Cotton, but by this time it is too late to perform reconstruction. This suggests that interpretation of syntactic structure only happens after all syntactic operations have finished. If it didn’t, we might expect that we could repair the mistake in (4a) by reconstruction. However, this is not what we find.
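The ordering argument above can be sketched as a two-step procedure in Python. This is a toy illustration only, not an implementation of any published analysis: the feature dictionaries, the function names and the idea of passing in at most one candidate per clause are all assumptions made for the sake of the example.

```python
# Toy sketch of "bind blindly first, interpret afterwards".
# All names and the feature representation are hypothetical.

def bind_reflexive(main_clause_np, relative_clause_np):
    """Syntax: choose an antecedent for 'himself' without checking features."""
    if main_clause_np is not None:
        # A potential binder sits in the main clause, so binding happens at
        # once and the head never reconstructs into the relative clause.
        return main_clause_np
    # No binder in the main clause: the head reconstructs into the relative
    # clause and 'himself' finds its antecedent there.
    return relative_clause_np


def interpret(antecedent):
    """Interpretation: only now check that the antecedent suits 'himself'."""
    return antecedent["gender"] == "masc" and antecedent["number"] == "sg"


sam = {"name": "Sam", "gender": "masc", "number": "sg"}
mrs_cotton = {"name": "Mrs. Cotton", "gender": "fem", "number": "sg"}

# (2a): no candidate in the main clause, so reconstruction supplies Sam.
print(interpret(bind_reflexive(None, sam)))
# (4a): Mrs. Cotton is bound blindly, and the mismatch is discovered too late.
print(interpret(bind_reflexive(mrs_cotton, sam)))
```

The point of splitting the procedure into two functions is that interpret never gets the chance to send the structure back for a second attempt, which is what the ungrammaticality of (4a) suggests.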

The same effect is also found in other constructions. Based on Browning (1987: 162-165), Brody (1995: 92) shows that (5) is acceptable, suggesting that picture of himself has reconstructed to the direct object position of buy (the example is slightly adapted).

(5)  This picture of himself_x is easy to make John_x buy.

However, reconstruction is blocked if there is a potential element that himself could get its reference from, even if it turns out later to be unsuitable (Brody, 1995: 92).

(6)  *Mary_y expected those pictures of himself_x to be easy to make John_x buy.

We have only scratched the surface of reconstruction in relative clauses here (there are more reconstruction effects and more subtleties that I have been working on but which would take too long to lay out here). What we have concluded is that reconstruction is generally available in relative clauses (at least in English). This tells us that relative clauses are constructed with a copy of the head of the relative clause inside the relative clause itself. The problem is how to choose which copies to interpret. It seems that there are structural conditions which force certain copies to be interpreted, i.e. the choice is not completely free. Explaining what these conditions are can thus provide a fascinating clue about how the human mind works (and how it doesn’t).

If you’re keen to find out more, Sportiche (2006) gives a good overview of reconstruction effects and Fox (2000) develops a nice account of how interpretation interacts with syntactic structure.


Bianchi, V. (1999). Consequences of Antisymmetry: Headed Relative Clauses. Berlin/New York: Mouton de Gruyter.

Bianchi, V. (2000). The raising analysis of relative clauses: a reply to Borsley. Linguistic Inquiry, 31(1), 123–140.

Brody, M. (1995). Lexico-Logical Form: A Radically Minimalist Theory. Cambridge, MA: MIT Press.

Browning, M. (1987). Null Operator Constructions. PhD dissertation, MIT.

Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris.

Cinque, G. (2013). Typological Studies: Word Order and Relative Clauses. New York/London: Routledge.

Fox, D. (2000). Economy and Semantic Interpretation. Cambridge, MA: MIT Press.

Kayne, R. S. (1994). The Antisymmetry of Syntax. Cambridge, MA: MIT Press.

Schachter, P. (1973). Focus and relativization. Language, 49(1), 19–46.

Sportiche, D. (2006). Reconstruction, Binding, and Scope. In M. Everaert & H. van Riemsdijk (Eds.), The Blackwell Companion to Syntax. Volume IV (pp. 35–93). Oxford: Blackwell.

Vergnaud, J.-R. (1974). French relative clauses. Doctoral dissertation, MIT.

Williamson, J. S. (1987). An Indefiniteness Restriction for Relative Clauses in Lakhota. In E. J. Reuland & A. G. B. ter Meulen (Eds.), The Representation of (In)definiteness (pp. 168–190). Cambridge, MA: MIT Press.