Finnish, Hungarian… and Sumerian?

When you tell non-linguists that you do linguistics, people often like to showcase their lay interest in the subject by throwing out cool facts about languages. Only sometimes the facts are not so factual. A few weeks back, over tea, I was told that

“I mean it’s quite interesting how Finnish and Hungarian are related to Sumerian.”

I managed neither to splutter my tea nor to choke on my scone; I dismissed the claim as an interesting but very obviously false belief, going in the same bin with the likes of ‘some languages make you more intelligent than others’ and ‘modern English was the language of the Roman Empire’.

But a real tea-spluttering, choking-by-scone incident was soon to follow, when the same claim cropped up not as a harmless fantasy but as state-of-the-art research by the Finnish assyriologist Simo Parpola.

In his brand new, two-volume Etymological Dictionary of the Sumerian language, Parpola argues that Sumerian, the ancient language of southern Mesopotamia, is not, in fact, an isolate without any known linguistic relatives, as is conventionally held; rather, it is related to the Uralic languages, spread across northern Eurasia.

Could Finno-Ugric speakers now take ancestral pride in cuneiform? Image credit: hapal.

Eleven years of comparing Sumerian to several major language families revealed, according to Parpola, that Sumerian and the Uralic languages have more in common than the Finno-Ugric languages (a subgroup of the greater Uralic family) among themselves in their sounds, basic vocabulary, pronouns, and numerals. For example, the Sumerian ajja/aj(a) ‘father, grandfather’ resembles the Uralic forms äijä, ajja, aja, and jäj with similar meanings, while engur or angur ‘primeval sea, home to the water deity’ is akin to onkura ‘cave’.

Not surprisingly, the research has caused an uproar in Finno-Ugric linguistic circles, with linguists attacking both Parpola’s methodology and his professionalism as a linguist (what does an assyriologist know about the scientific methods of linguistics?). For example, the Sumerian-Uralic comparisons use whole language families, but in historical linguistics comparison sets are usually set up between individual languages. Juha Janhunen, professor of East Asian languages and cultures at the University of Helsinki, summarizes the linguists’ horror at Parpola’s findings: saying that Sumerian is a closer relative of Finnish than the Finno-Ugric Sami is like saying that humans are not, in fact, that closely related to apes but find themselves rather somewhere between the horse and the donkey.

Human relatives – are horses or monkeys closer? Image credit: Ashley Van Haeften.

Parpola is not alone in his conviction, though. The idea that the humble Uralic languages have grander ancestors has been around since the mid-19th century when the cuneiform script was first studied. It was soon noted that Sumerian was agglutinating – sticking meaningful bits of language together into longer words rather than using separate words to express, for example, location – just like Hungarian. Combine that with a national impetus: at the time, Hungarians were a minority in the mighty Austria-Hungary, and what nicer way to gain some kudos than establishing grand linguistic relatives?

What’s more, with a renewed interest in nationalism, alternatives to the standard Finno-Ugric story still live strong in Hungary: since the national conservative Prime Minister Viktor Orbán came to power in 2010, more funding has been directed to research exploring options such as Parpola’s.

That’s not to say Parpola’s findings are in any way influenced by nationalistic motives. But are his results a victim of linguistic prejudice, wrongly criticized by linguists stuck in their cozy standard theory? Or is Parpola after all just a misinformed assyriologist? If there is even a grain of truth in Parpola’s arguments, much of linguistic history will have to be rewritten – and I will never dismiss my non-linguist friends’ factoids as funny misconceptions again.

An Olympic olá!

I guess I’m not the only one to have succumbed to a bit of addictive late-night Olympics watching this weekend. The British Council has taken advantage of all this Rio enthusiasm to encourage us to learn Portuguese, as part of its campaign this year to get more Brits learning languages. Its survey found that only 2/5 of its sample of around 2,000 UK adults knew that Portuguese was the language spoken in Brazil. Thanks to some cousins who grew up there, I would have passed that one, but I’m not so sure about all the other South American countries (and they call me a linguist!). Of course, Spanish is a safe bet (and indeed, it is an official language in Argentina, Bolivia, Chile, Colombia, Ecuador, Paraguay, Peru, Uruguay and Venezuela), but what about Suriname and Guyana? That would be Dutch and English. And then there’s French Guiana (clue’s in the name).


But hang on, back in Brazil, Portuguese is only half the story. There are in fact 216 languages spoken in Brazil, and many of them – 201 in fact – are indigenous, spoken there long before the Portuguese arrived in 1500. Take for example Apalaí, Arara, Bororo, Guarani, Nheengatu, Terena, Tucano, and Xavante. But there are also scatterings of other European languages, especially German (and a German variety, Pomeranian) and Italian, and in some places these, and indigenous languages, now have co-official status.

Well, you might be thinking, I don’t mind picking up a few phrases through the viewing marathon of the next couple of weeks, but if I’m going to actually make a good stab at learning a language, is Portuguese really one to go for? According to some other research coming from the British Council, it actually comes in at number six in a list of ‘languages of the future’ – ones that will be most useful for the UK in the coming years. And Portugal is a popular holiday destination too (but remember that the Portuguese spoken in Brazil and the Portuguese spoken in Portugal are their own varieties, with differences in pronunciation, lexicon, and grammar). Besides, the benefits of multilingualism seem to be everyone’s favourite topic these days, so perhaps we don’t need to angst so much about which language we learn but just get on with learning a language (more seriously, you may have seen both ‘bilingualism-is-the-cure-for-all-ills’ champions and ‘there-ain’t-no-difference-between-mono-and-bilinguals-after-all’ brigades, but this, as Thomas Bak at Edinburgh reminds us, reflects how many complicated factors are involved in multilingualism, in the life of any human, and is a reason to pursue more scientific research, rather than throw black-and-white eggs at each other. However, what can be said with some confidence is that, while speaking more than one language is the norm for a majority of the world’s population, where it is a choice, it often brings great social insight, cultural enrichment, and personal satisfaction.)

CC Cidade_Candidata

But back to Portuguese. Perhaps you’re not in the mood to trip over to Portugal and Brazil any time soon. But I was reminded of a wonderful infographic showing the second most common language (after English) at each tube stop in London. Looking at that, you can see that a trip to Clapham, Stockwell, or Willesden Junction would also give you an opportunity to try out your Portuguese. And then I’ve just read in this month’s Babel about a company that offers, among others, a ‘Brazil & Portugal Tour’… in London! So maybe the British Council’s suggestion isn’t such a bad idea after all.

“By the media criticizes you”?! Bei, ba, and light verbs in Chinese

The languages we speak are never without little surprises that remind us of the complexity and flexibility of our minds. For instance, when I was rewatching an old episode (originally aired in 2005) of my favorite TV show, 100% Entertainment, the other day, I heard one of the hostesses, Barbie Hsu, say:

(1) … hai    bei    mei-ti     ba     ni       ma     de   gou-xue-lin-tou.

———and   BEI    media   BA     you   scold  DE  dog-blood-pour-head

———“…and then you also get harshly criticized by the media.”

The sentence sounds all right, except that it includes both bei and ba. What’s wrong with that? Well, in Mandarin Chinese bei is used in passive sentences to introduce the agent (the subject of the corresponding active sentence), and ba functions like a transitive/accusative marker on the object. Their uses are illustrated below (PFV = Perfective, ACC = Accusative):

(2)  a.   Gou-gou       chi        le         ping-guo.

———-puppy           eat       PFV     apple

———-“The puppy ate apple(s).”

       b.  Ping-guo         bei               gou-gou           chi       le.

———-apple              BEI               puppy             eat       PFV

———-“The apple(s) was eaten by the puppy.”

       c.  Gou-gou        ba              ping-guo          chi       le.

———-puppy           BA              apple              eat       PFV

———-“The puppy ate the apple(s).”

(2a) is the neutral way of reporting this event, where the subject is ‘the puppy’ and the object is ‘apple(s)’. In (2b), ‘apple(s)’ becomes the syntactic subject, while ‘the puppy’ is introduced by bei (similar to English by), thus rendering a ‘passive’ sentence. In (2c), the subject is again ‘the puppy’, but the object ‘apple(s)’ takes ba, which explicitly clarifies it as the affected patient of ‘eat’ and entails an agent subject (the eater in this case). That is, when the object is introduced by ba, the verb is unambiguously transitive and active.

So, taking these into consideration, Barbie Hsu’s sentence in (1) is literally something like:

(3)        “and then by the media harshly criticizes you.”

As you can see, it doesn’t work in English. But why is it okay in Chinese? Before answering this question, let’s look at some similar examples, which all suggest that the coexistence of bei and ba is not only okay, but also productive.

(4)    a.     Xiao-yang        bei           lao-hu   ba      tou        yao      xia-lai    le.

————-little-lamb       BEI           tiger      BA      head    bite      down   PFV

————-“The little lamb got its head bitten off by the tiger.”

         b.    Feiwen    bei         lao-shi     ba      man-hua-shu     mo-shou           le.

————-Feiwen   BEI         teacher   BA      comic book       confiscate        PFV

————-“Feiwen had her comic book confiscated by the teacher.”

         c.    Lianlian        bei       xiao-dao         ba       shou     ge-po                 le.

————-Lianlian        BEI     small-knife      BA      hand    cut-broken       PFV

————-“Lianlian got her hand cut by the small knife.”

This pattern is reminiscent of a construction in Japanese called the ‘suffering passive’, as in (5) (TOP = Topic, PASS = Passive, PST = Past, GEN = Genitive).

(5)     a.   Watashi-wa    imooto-ni                   tokee-o            kowas-are-ta.

————I-TOP              younger sister-by     watch-ACC     break-PASS-PST

———–“I got my watch broken by my younger sister.”

          b.  Watashi-wa     tonari-no        hito-ni         ashi-o         hum-are-ta.

————I-TOP              next-GEN       person-by   foot-ACC    step-PASS-PST

———–“I got my foot stepped on by the person standing next to me.”

Again we see the coexistence of an oblique (i.e. non-subject) agent and an accusative patient. Furthermore, in Japanese the passive voice is morphologically marked (-are-). Note that ‘I’ in (5ab) is not a subject, but the discourse topic. Intuitively this also seems to be true for the Chinese sentences in (4), whose English translations should be rendered more accurately as:

(6)    a.   “As for the little lamb, BEI the tiger bit off its head.”

——-b.  “As for Feiwen, BEI the teacher confiscated her comic book.”

——-c.  “As for Lianlian, BEI the small knife cut her hand.”

I have let BEI stay as such, because, despite the similarity, it is not an exact counterpart of English by (as used in passive constructions). Bei is historically verbal (meaning ‘to cover’), which is still reflected in some words and phrases today, as in (7).

(7)    a.  bei-zai, bei-nan “be covered by/suffer from disaster”

      b.  ze bei tian xia/wan shi “benefits cover the entire world/many generations”

Apart from the verbal meaning, bei also has a nominal meaning ‘cover’, which is well preserved today, as in bei-zi ‘quilt’, mian-bei ‘cotton quilt’, among others. Of course, in the ‘passive’ construction, bei no longer has these literal meanings. Nevertheless, it is plausible that it still retains some verbal properties. In other words, what matters here for bei is not the literal meaning, but the syntactic category. Belonging to the verbal category, bei can still join the verbal predicate quite freely and does not have to do this as a complement or adjunct (as the English by-phrase does).

In fact, the patient-introducing ba has a similar status in Chinese. It also originated as a verb (meaning ‘to hold’) and has developed a usage as a bleached verbal category. Actually the verbal use of ba today is even clearer than that of bei, as in (8).

(8)    ba-jiu “hold the alcohol”, ba-men “hold the door”, ba-quan “hold the power”, ba-chi “hold and keep”, ba-shou “hold and guard”, ba-wan “hold and play with”, etc.

Elements like bei and ba are sometimes called light verbs, i.e. they are light in meaning but still belong to the verbal category and perform some verbal functions. Bei is a light verb that introduces the agent or instrument argument, while ba is a light verb that introduces the patient argument into the predicate. They can coexist in Chinese simply because they both exist in the language, as in (9a).

(9)   a. Chinese:  Hai    bei-mei-ti           ba-ni                ma de gou-xue-lin-tou.

———————–and   AGENT-media   PATIENT-you  scold harshly

b. English: And then you[PATIENT] also get harshly criticized by the media[AGENT].

By comparison, since English does not have argument-introducing light verbs like bei and ba, when the verb is inflected for the passive (i.e. when it lacks the accusative case-assigning layer, which can be conceived of as a null light verb), the original object (‘you’) must be licensed in some other way. In this case it becomes the ‘subject’, and the original subject (‘the media’) subsequently gets licensed in the prepositional by-phrase, as in (9b). In sum, we can say that bei and by have similar semantic functions but distinct syntactic categories, and that Chinese does not really have an English-type passive construction.

Now you may wonder where the subject (or more exactly the ‘topic’) is in (9a). Well, it is simply omitted because its information is clear in the context. When you are scolded, the default topic is clearly YOU! Indeed, (9a) can be recovered as (11), which is exactly what Barbie Hsu means.

(11)  Ni[TOPIC]    hai       bei      mei-ti     ba       ni          ma de gou-xue-lin-tou.

—— you            and      BEI     media     BA     you       scold harshly

Why Chinese systematically uses topics as part of the sentence whereas English does not is a whole other story. A relevant fact is that if we omit the topic elements in the Japanese examples in (5), the sentences are also well-formed, as in (12).

(12)    a.  Imooto-ni tokee-o kowas-are-ta.

————“(My) watch was broken by (my) younger sister.”

           b. Tonari-no hito-ni ashi-o hum-are-ta.

————“(My) foot got stepped on by the next person.”

Of course, bei and ba are not the only two light verbs in Chinese; there are many more, such as yong (introducing an instrument), gei (a casual alternative to bei), jiang (a formal alternative to ba), and so on. Maybe if you join me to watch TV, you will notice Barbie Hsu and her sister Dee Hsu frequently use other light verbs as well. Language is our most faithful companion, and we can surely find out many marvelous facts about it if we pay a bit more attention and… study some syntax!

Five languages from Spain you never knew existed


Spain, known as the land of sol, siestas and sangría, is less well known for the diversity of its linguistic heritage. Though most people could probably identify Basque and Catalan as languages spoken in Spain, the Iberian Peninsula is home to a number of minority languages and dialects you’ve probably never heard of.

1. Galician
Even though it has 2.4 million native speakers and is Spain’s third most spoken language, it’s surprising how many people have not heard of Galician. That’s maybe because Galicia’s most famous residents – including current Spanish PM Mariano Rajoy, the 20th century writer Ramón del Valle-Inclán, and Spain’s late dictator Francisco Franco – are known for being Spanish, not Galician, speakers. Galician and its neighbour across the border, Portuguese, were originally one and the same language, Galician-Portuguese, a highly prestigious medieval language famous for its lyric poetry. Although Galician then became a low prestige language for many centuries, today it is co-official with Spanish in Galicia and has its own publicly-funded television channel.

2. Aragonese
Aragon, the land of Henry VIII’s first wife and the 18th century painter Francisco de Goya, is also home to the luenga aragonesa, or Aragonese language, which descends from the now extinct medieval language Navarro-Aragonese from North-East Spain. Aragonese has a core of native speakers in Aragon’s remote Pyrenean villages, but is understood by many more people in the surrounding areas, and is mutually intelligible with neighbouring languages such as Castilian Spanish and Astur-Leonese. Aragonese is protected by local laws, and has its own language academy, but, like many of Spain’s minority languages, is still considered endangered by UNESCO.

3. Judeo-Spanish
Judeo-Spanish, also known as Ladino, is a language from Spain that hasn’t been spoken in Spain since 1492, when the Jewish population was expelled from the country by the Spanish monarchs. Though since 2015 their descendants have been able to apply for dual Spanish citizenship, Judeo-Spanish is now mostly spoken in Israel, Turkey and Greece. Because the last time it was used in Spain was over 500 years ago, Judeo-Spanish is a linguistic time capsule, and sounds more similar to Medieval Spanish than modern Spanish. It’s also the only Spanish language to be written in Hebrew script.

Bilingual Spanish-Leonese roadsign (Photo: Iván Martínez Lobo)

4. Leonese
One half of the Astur-Leonese language branch, Leonese descends from the everyday Latin spoken in the geographic area that would become the medieval Kingdom of León. The kingdom’s capital, also called León, or Llión in Leonese, was founded as a military camp and settled by the Roman Seventh Legion. León/Llión, which means ‘lion’ in today’s language, actually comes from the Latin name of the capital’s founders, legio septima gemina, meaning ‘the twin seventh legion’. Despite a flourishing medieval literature, history has not been kind to the llingua llionesa, which is now a UNESCO endangered language that, unlike its other half Asturian, has no official status in Spain.

5. Aranese
Less well known than its sister dialects of Gascon and Occitan in France, Aranese is spoken in the Valley of Aran (Val d’Aran), one of only two areas of Spain on the Northern side of the Pyrenees. Even though it only has around 3,000 native speakers and is used in a small geographical area, Aranese has co-official status in its home region, Catalonia, and is taught as a compulsory subject in schools in the Val d’Aran. You can even pick up some Aranese yourself thanks to the University of Barcelona’s multilingual conversation guide.

This blog post was previously published at the Huffington Post.

Fiction favourites for die-hard linguists

With summer supposedly at its height (calling all semanticists: is the fact that it is July enough to meet the definition of ‘summer’, even if rain, hail, wind, and multiple layers of clothing are featured?) it is time to put down that book on allophony in Gujarati, the processing of possessives in Chhattisgarhi, or whatever linguistic page-turner you’re dipping into. Quite often though, a dedicated linguist cannot stop making language-related observations even when switching off from full-blown research-mode. As a consequence, linguistics literature is generously sprinkled with references to novels, the authors of which were at the time of writing gloriously unaware of their work turning into scientific data. I bring to you the top three works of fiction for linguists.

First up we have Lewis Carroll’s classic Alice’s Adventures in Wonderland – an inspiration for film makers, fantasy lovers, and keen readers of all ages, but also a pragmatician’s wet dream. As Alice plunges down the rabbit hole, she ends up in a world defying the rules of physics and also in something of a nonsensical linguistic wonderland. The Mad Hatter’s tea party is a celebration of not only unbirthdays but also of linguistic rule bending and pragmatic acrobatics. Take the exchange between the March Hare, Alice, and the Hatter:

‘Take some more tea,’ the March Hare said to Alice, very earnestly.
‘I’ve had nothing yet,’ Alice replied in an offended tone, ‘so I can’t take more.’
‘You mean you can’t take less,’ said the Hatter: ‘it’s very easy to take more than nothing.’

The reason for Alice’s confusion is that we tend to take ‘more’ to imply that something has already happened; however, as the Hatter points out, technically more is more even if it’s just more than nothing. Pragmatics brain pain anyone?

Having a very non-pragmatic tea party

If you aren’t feeling the strain of mad tea party communication, perhaps you would enjoy some even more strenuous intellectual effort in the form of James Joyce’s Finnegans Wake. The epic stream of consciousness is written in a mixture of real English words, neologisms, and portmanteau, or blended, words. As such, it’s one for morphologists and phonologists; by putting bits and pieces of the right sound combinations together and throwing in some actual endings, you can get something apparently nonsensical, yet very much English-like.

“Loud, heap miseries upon us yet entwine our arts with laughters low!”

Joyce’s tour de force really is a literary gem, but somewhat on the challenging side of summer reads. As my friend Wikipedia kindly puts it, “Finnegans Wake remains largely unread by the general public.”

Very much the opposite is the case with – *drumroll* *gasp* – The Lord of the Rings. Okay, I know many of us got into the trilogy by staring into the dreamy film-version eyes of Legolas, drooling at the hunky figure of Boromir, or (despite female characters being few and far between, my inner feminist would like to point out) admiring the airy wardrobe of the mysterious Galadriel. But as die-hard fans will know (and they will probably roll their eyes at me pointing out the obvious), Tolkien was something of a language geek, and this shows throughout the linguistic landscape of Middle-earth.

“Really enjoying the Sean Bea… sorry, linguistic aspects.” Credit: Jason Parrish.

One of the author’s passions was Finnish, as he wanted to read the Finnish national epic Kalevala in the original. I don’t know if Tolkien ever managed to conquer Kalevala without the aid of translations and dictionaries, but his Elvish language of Quenya was certainly inspired by the sounds of Finnish: Mindon Eldalieva (‘Lofty Tower of Elvish-people’) and Oron Oiolosse (‘Ever Snow-white Peak’) resemble Finnish just as Finnegans Wake resembles English. There’s also a healthy dose of Old English squeezed into proper names: Saruman derives from the root searu- (‘treachery’ or ‘cunning’), while Mordor is rather morbidly based on morthor (‘murder’).

And on that cheerful note, I wish you happy linguisticky reading!

In a manner of speaking

Way way back many blog posts ago, I wondered why some pragmaticians have been so obsessed with eating cookies. Well, not exactly, but they have spent a lot of time investigating utterances like:

Ben ate some of the cookies.

Some pupils failed the exam.


On a standard view, these utterances literally mean something like “Ben ate some and possibly all of the cookies” and “Some and possibly all pupils failed the exam”, but in the right context, the hearer infers the speaker’s intended meaning that “Ben ate some but not all of the cookies” and “Some but not all pupils failed the exam”. These implicated meanings are known as scalar implicatures (‘implicature’ being a technical term coined by Paul Grice for non-deductive implications beyond the literal meaning of what a speaker says, based on assumed principles of co-operative conversation). That’s because the key word in the utterance, here ‘some’, belongs on a scale with some alternative word that the speaker could have said but didn’t (like ‘all’).

And we can think of other examples like:

The coffee is warm
+> but not hot.

The concert was good
+> but not excellent.
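
For readers who like to see the mechanics spelled out, here is a minimal sketch of how these scale-based inferences could be computed. It is only an illustration: the `SCALES` list and the function names are made up for this example, and the scales contain just the word pairs mentioned above, ordered from weaker to stronger.

```python
# Hypothetical scales for illustration, each ordered from weakest to strongest.
# They contain only the pairs discussed in the text.
SCALES = [
    ["some", "all"],
    ["warm", "hot"],
    ["good", "excellent"],
]


def stronger_alternatives(word):
    """Return the stronger scale-mates the speaker could have used but didn't."""
    for scale in SCALES:
        if word in scale:
            return scale[scale.index(word) + 1:]
    return []  # the word is on none of our scales


def enrich(word):
    """Render the enriched reading in the '+>' style: word plus negated alternatives."""
    stronger = stronger_alternatives(word)
    if not stronger:
        return word  # nothing stronger to negate, so no implicature
    return word + " but not " + " or ".join(stronger)


for w in ["some", "warm", "good", "all"]:
    print(w, "+>", enrich(w))
```

The strongest term on a scale (‘all’) comes back unchanged, since there is no stronger alternative left to negate. Real hearers are of course far more sensitive to context than a lookup table like this.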

There are loads of reasons why pragmaticians (and especially the experimental sort) have concentrated on the ‘some but not all’ implicature: it’s easy to depict visually; it’s pretty consistent across contexts (or without much context – good for controlled but not very natural experiments); it’s easy to make nice balanced stimuli by just changing one word; and so on. However, what we’re now learning is that ‘some’ is perhaps not so representative of scalar implicatures after all.¹ And if we can’t even generalise from ‘some’ to scalar implicatures, what about quantity implicatures (of which scalars are a subtype) or other kinds of implicature, manner and relevance?

Given the apparent dearth of research on manner implicatures, I decided to do some investigating myself. Now, manner implicatures arise when speakers use some marked form to convey a marked meaning: an unconventional phrase to express that what they’re describing is not a stereotypical instance. Grice’s own example was:

‘Miss Singer produced a series of sounds corresponding closely to the score of an aria from Rigoletto.’

The idea is: why did the speaker go to such lengths, when they could have just said ‘sang’? It’s because the singing was in some way not stereotypical – probably downright awful!²

Here are some other potential examples:

Ben constructed a pile of bricks and mortar.
+> Ben built a wall, if you can call it a wall.
(Otherwise the speaker would have said “built a wall”)

Mary caused the car to stop.
+> Mary stopped the car in some unusual way (e.g., pulling the handbrake, driving into a tree…)
(Contrast with “stopped the car”)

Terry put the duvet and pillows on top of the bed.
+> Terry made the bed, but messily.
(With the alternative “made the bed”)


Now, why have these inferences received so little attention? One possibility is that they’re not really a definable category on their own, but a motley bunch of quantity implicatures, conventions and other stuff (as, for example, Horn would have it³). However, the fact that what is important here is the form of the utterance rather than the content (the lexicon and syntax, not the semantics) means that they are in some ways distinct, and at least in principle worthy of research in their own right (as Levinson, 2000, thinks) – even if, in the end, we find out they’re not so interesting after all.

Another possibility is that they’re hard to investigate. This has certainly been my own experience. It’s hard to think up examples, and it’s almost impossible to search corpora for them, except for the most conventionalised of cases. They seem to be rare and somewhat precarious in real life conversation, and when you do try to test them out, people seem to have very varying degrees of sensitivity to them.

This could give us pause for thought and suggest that maybe they are not a distinct pragmatic phenomenon after all. However, perhaps it’s not surprising that they don’t lend themselves to the normal tools of experimental pragmatics, like acceptability judgement tasks and picture matching tasks, which tend to rely on participants’ intuitions about isolated utterances. Depending, as they do, on the speaker’s choice of words and grammar – on how she communicates her meaning, not just what she communicates – they may rely on a greater degree of knowledge of language and its conventional use, or at least on a greater degree of confidence in that knowledge. They are likely to be extremely variable depending on the linguistic context: is it formal? is there jargon? is the speaker a native speaker or a second language learner? does the speaker have their own unusual style? They are likely to be cued with intonation or hedges or discourse markers (“Well, he constructed a pile of bricks and mortar”). This means that in an unnatural experimental context (like choosing a picture that matches an utterance), participants may not be confident enough about any inference they do make, or they may not make any manner inference at all without those extra cues and information about the speaker.

I’ve found some evidence that some adults are sensitive to some manner implicatures, but I’ve no show-stopping conclusions yet. So if you think you’ve made a manner inference recently, then do give me a shout!

1 Van Tiel, Bob, Emiel Van Miltenburg, Natalia Zevakhina & Bart Geurts. 2014. Scalar diversity. Journal of Semantics.

2 As my supervisor pointed out, the example actually works better as a case of manner without ‘closely’, otherwise you could just get a scalar implicature of ‘not exactly’.

3 Horn, Laurence. 2008. Implicature. In Gregory Ward & Laurence Horn (eds.), The Handbook of Pragmatics, vol. 26. John Wiley & Sons.

“One please”, and language attrition

My boyfriend, a huge fan of Japanese anime, once recommended an old anime series to me called Strawberry Marshmallow (ichigo mashimaro, 苺ましまろ), which is about the life of a group of Japanese schoolgirls. Although I watch anime now and then, its style is not my type, so I thought about giving up after finishing the first episode. But later, when I had reluctantly proceeded to the second episode, I suddenly decided to continue – not because the story was good, but because of one notable character and, well, the linguistic phenomenon behind her.

Ana Coppola, the character in question, moved to Japan with her family when she was six. Before that, she was born and raised in Cornwall, and English was the only language she was exposed to, so, by definition, she was a native speaker of British English. When her family lived in Japan, she enrolled in a local primary school and began to take courses and speak to classmates in Japanese; her parents (who were definitely native speakers of British English by definition, but I know that the anime studio could only hire Japanese voice actors) also spoke to her in Japanese at home, maybe to encourage her to use her second language. At the start of the story, she has been living in Japan for five years, and she can speak Japanese fluently – even over-fluently, since she uses words and sentences that other girls of her age never use. At the same time, her English has become a mess: she needs to take the same English course as her classmates, and some of her friends even outperform her. For example, in the anime, when she is asked to introduce herself in English, she goes as far as to say “one please”, which is a word-for-word translation of ひとつよろしく (hitotsu yoroshiku, more or less like “it’s my pleasure to meet you this time”). It seems that, due to the years she has spent using Japanese as her dominant language at both home and school, Ana has finally lost her first language, and in the story she can no longer qualify as a ‘native speaker of English’, even if she always claims that she is from Cornwall.

Well, you say it.

Ana’s case is extreme. After all, it is just an anime show and the producers always want some dramatic effect, as we can see from the language Ana uses at home – how could a British migrant family suddenly begin to speak Japanese at home? Milder examples, however, are rather common in real life. We tend to believe that, once we acquire a language and can use it fluently, especially our mother tongue, we cannot forget it. However, if we do not use the language for a long time, we may come across certain difficulties when we pick it up again. In a word, one may feel ‘clumsy’ when using a language that one can manage but rarely uses – that is exactly the word used by Aneta Pavlenko when she first systematically investigated this phenomenon in 2003. That can happen to one’s first language if one lives in another linguistic environment for years, or, sometimes, to one’s second language if one has moved back home for years. I have received questions from people complaining about their inability to speak their first language ‘naturally’: some of them forget the intended words in their first language and are forced to switch to their second language (which is called code-switching, and I have discussed it here), while some others start to use structures that are available in the second language but not the first language. Even I myself experience these symptoms now and again. When a language is used less, it is suppressed and gradually ‘worn out’. The phenomenon is called ‘language attrition’. Recent research in bilingualism focuses particularly on the attrition of the first language of immigrants, but there are also studies that investigate the mechanism and phenomenon of attrition of a second language in multilingual people.

Attrition can happen in different aspects of language, including vocabulary, sentence structure, and pronunciation. I have described the first two aspects in the previous paragraph: the difficulty of word selection and misuse of ‘false friends’ (words with different meanings that are pronounced similarly in two languages) are reflections of lexical attrition, and the blending of sentence structures in two languages can be seen as an instance of syntactic attrition. One possible reason for attrition, according to previous research, may be the higher cognitive load that multilingual people experience when they process language: compared to their monolingual peers, multilingual speakers manage more lexical items and more complicated structures in two or more languages, and they may need additional time and cognitive effort to select the lexical items or the syntactic elements that not only match their intentions but are also consistent with their current language. Therefore, they seem to be slowed down when processing the less used language.

While the first two aspects, vocabulary and sentence structure, are prominent in both L1 and L2 attrition, pronunciation appears less frequently in the research on L1 attrition. It seems that one can still preserve the pronunciation of one’s mother tongue even if one shows problems in sentence construction or lexical selection. The attrition of pronunciation in L2 is more interesting: L2 learners might be able to pronounce their second language in a way that is closer to native speakers of that language when they are in the L2 environment, but they will gradually change their accent after returning to the L1 environment, developing a kind of foreign accent. I have observed this change in the Japanese singer Keito Okamoto: he studied in the UK between ages 9 and 13 and can still speak English fluently today, but his English accent is now a blend of British and Japanese, with the Japanese features having gradually become more obvious.

(Well, I bet his accent was definitely not like this when he was 12.)

The other day I had a discussion with a good friend at Edinburgh about the possible relation between first language attrition and second language acquisition (particularly adult L2 acquisition), since her topic is related to the former and I focus mainly on the latter. If we look at the most superficial level of the two phenomena, i.e. the performance of an L2 learner and an L1 attritor, we can see that they both receive influence from another language they already know: for the L2 learner, that is their first language, while for the L1 attritor that is their second language. Therefore, from a macro perspective, we can connect L1 attrition and L2 acquisition: both of them reflect the role of cross-linguistic influence in multilingual people’s language ability, and we can even make some predictions about L1 attrition based on the established results of the research on L2 acquisition.

However, L1 attrition and L2 acquisition are essentially different because their internal mechanisms are opposite. The process of L2 acquisition is ‘from nothing to something’, and it happens at the level of both competence and performance. Even the most advanced adult L2 learners still cannot possess the intuitions that are generally available to native speakers of that language. The process of L1 attrition, on the other hand, is ‘from everything to something’; although we can observe how attritors’ performance changes, their knowledge of the language, as well as their competence in using the language, does not change significantly. Studies have shown that L1 attritors can recover their performance if they stay in the L1 environment for a period of time, which means that they have preserved their native speaker intuitions, and what is influenced is only the performance.

One particular point about language attrition I believe is worth mentioning is that it does not receive the attention it deserves – by that I mean attention from ordinary people, since more and more applied linguists have begun to look into the phenomenon. As with code-switching, people without proper linguistic knowledge often show prejudice and bias when they hear about language attrition. They can hardly believe that one can sound ‘unnatural’ when speaking one’s first language after using one’s second language for decades, and sometimes unfriendly people may call these immigrants ‘traitors’ or ‘pretenders’, which I have occasionally observed. Knowing more about language attrition can help us not only to better understand the human cognitive system and its capacity for language learning, but also to reduce prejudice. If I have convinced you that ‘we can lose our language, even if it is our mother tongue’, congratulations! Now you know a bit more about how language works in our brain.


Special thanks to Wenjia Cai and Maki Kubota @ Edinburgh!

To get a professional overview of first language attrition: (Professor Schmid is one of the leading scholars in the area of first language attrition.)

For more details about L1 and L2 attrition:

De Bot, K., & Weltens, B. (1995). Foreign language attrition. Annual Review of Applied Linguistics, 15, 151-164.

Pavlenko, A. (2003). “I feel clumsy speaking Russian”: L2 influence on L1 in narratives of Russian L2 users of English. In Cook, V. (ed.) Effects of the second language on the first (pp. 32-61). Clevedon, UK: Multilingual Matters.

Schmid, M. S., Köpke, B., Keijzer, M., & Weilemar, L. (eds.) (2004). First Language Attrition: Interdisciplinary Perspectives on Methodological Issues. Amsterdam/Philadelphia: John Benjamins.

Breaking news: Guy who learned Japanese from girlfriend speaks like a girl

The life of a Japanese learner is not an easy one: you’re faced with not one, but three, non-Roman writing systems, an array of politeness forms, and freaky word order options. To top it all off, the language learning community swarms with warning examples of how to make a fool of yourself by not only making simple grammatical mistakes but also *Psycho tune* using the language of the wrong gender. The stories of ‘Guy who learned Japanese from girlfriend now speaks like a girl’ or ‘Girl shunned for using male language’ could make Daily Mail headlines were the publication more linguistically inclined.

... and speak accordingly! Image credit: Beth Granter.

Although reality isn’t quite as much of a minefield as a cheeky Google search for ‘why is learning Japanese so difficult?’ might suggest, gendered language is a very real phenomenon in Japanese. Gendered language is nothing grammatical in this case, and is separate from grammaticalised aspects such as gender-specific or neutral pronouns: if you use a form strongly associated with the opposite gender, your utterance won’t be deemed ungrammatical, just weird or out of place. Rather, it refers to gender roles and ideologies of what female and male speech should sound like; very broadly, female language tends to be more submissive and gentle, male language being more direct. Indicators of gender are scattered throughout the language, showing up in choices of words, interjections (things like oh, uhm), directives (i.e. commands, requests, and questions), pronunciation, and so on, but most prominently in the choice of sentence endings and pronouns referring to ‘I’ and ‘you’.

Take sentence endings first. Japanese is full of particles – a bit of a dustbin category for little word-like elements that don’t always mean much on their own – and many of these appear sentence-finally, expressing things like questioning or affirmation: think ‘this is nice, isn’t it’ type of things. One of the most clearly gendered expressions here is wa: as a sentence-final particle, it indicates the femininity of the speaker. A girl would typically say takai-wa (‘tall’), but the same utterance for a boy would be ridiculed as effeminate, the socially prescribed option being plain takai. On the opposite end of the scale is zo indicating new information and used exclusively in male speech. It is considered informal and even rude, mirroring the directness ideologically associated with male speech. A gender-neutral way of expressing a similar meaning is the particle yo.

Where things get slightly more puzzling for a Western learner is the proliferation of words referring to ‘I’ and ‘you’. Some of them relate to degrees of politeness and differences in social status, but many of them encode additional aspects of gender. Gender-neutral choices are exemplified by the Japanese class favourite watashi ‘I’. Typically feminine pronouns are again perceived as softer and gentler; these include atashi, atakushi, and uchi, while typically male pronouns feature boku and ore. As for referring to ‘you’, male forms tend to be more direct – kimi, omae, anta. Feminine counterparts encode a greater degree of politeness, so that a typical form of address comes in the form of the pronoun anata followed by the addressee’s name or title and a socially appropriate marker.

Boku? Watashi? Ore? Who am I? Image credit: myrealnameispete.

Forms differ, then, but whether there is a yo or a zo at the end of a sentence doesn’t say much in itself to a non-Japanese aficionado. Where things get interesting is when these funny little word forms are considered in the broader social context (cue gender studies students!).

Slightly archaic as it may sound with its submissive feminine and direct masculine forms, gendered language as it is conceived of today is in fact a relatively recent innovation. This also goes against the popular depiction of gendered language as an ancient innovation, a case of this-is-the-way-it-has-always-been. Although differences in male and female speech have been recorded earlier as well (and this is not surprising; even in languages like English where ‘gendered language’ is not made into a big deal for learners, speakers will think, perhaps unconsciously, of certain ways of speaking as typically feminine or masculine), gendered language proper kicked off after the start of the Meiji era (from the mid-19th century). Something of a celebrity among Japanese linguists, Orie Endo compared two literary works, Ukiyoburo from 1813 and Sanshiro from 1909, to show the timescale along which the modern gender differences emerged. In the earlier text, the differences in speech patterns reflect social status, but not gender, while in the later one gendered differences have clearly emerged.

Of course, particles, pronouns and the like don’t just turn into carriers of gendered meanings in a vacuum: as always in language change, there is a human component. At the start of the Meiji era, schoolgirls, as teenagers so often do, came under criticism from societally higher-up men for speaking ‘improperly’, in a ‘vulgar’ or ‘unpleasant’ way (déjà vu? My earlier post on be like, might, like, bear like a resemblance to this). But as sometimes happens to schoolgirls, they grow up and take on positions of role models. At the time, there was an ideal of ryoosai kenbo ‘good wife, wise mother’ hanging around that was supported by the government and featured in women’s magazines, written about by the very schoolgirls in the very language they had been criticised for. The form of language became associated with the ideal middle class and was therefore something to aspire to; and voilà, the parlance of vulgar schoolgirls had become the new vogue.

Babbling away in improper Japanese. Image credit: Danny Choo.

The establishment of the new feminine language, or onna kotoba, was further propelled by reactions to the rapid modernization and westernization processes that Japan was undergoing: the nation needed traditions to hold on to, and gendered language made Japanese conveniently unique compared to the incoming western influences.

That is not to say that after its establishment gendered language has become inert to change. Quite the contrary, recent developments see female and male speech losing their distinctness. Young women have been reported to have stopped using feminine speech in favour of more neutral or even masculine language, with teenage girls taking over traditionally male pronouns such as boku and ore. Some male forms are taking on a function of female empowerment: kimi ‘you’, usually used by men to address close women friends, is now also used by women to talk down to men. The linguistic changes can, again, be tied to cultural shifts: more women than ever before are now delaying marriage and pursuing careers, and a speech form intended to convey submissiveness does not fit well with this emancipation of sorts. Interestingly, self-defining male speakers are not taking on features of female speech, and this would seem to be so engrained into the gendered mindset that it does not happen even in soliloquy, or speaking alone. So, while women happily use masculine forms even when blabbering alone, men don’t use feminine forms in the same way. This, some would argue, reflects the greater value associated with the masculine gender image in social hierarchy.

It's (linguistic) emancipation time! Image credit: DonkeyHotey.

But with that, I’m treading into non-linguistic waters. So, students of gender studies, rejoice – if you took in anything of the above, you are sorted for research topics.

Students of Japanese, on the other hand, relax – gendered language is becoming less and less of an issue for your learning process, and your gender-mismatched speech is unlikely to make a headline.


I said this was a hot topic, and the internet in particular is full of thrilling reading. I’ve drawn inspiration, examples, and information from Tofugu, Oxford Dictionaries, The Japan Times, LinguaLift, and Japanese – a linguistic introduction by Yoko Hasegawa.


Syntactic Islands

Last week’s post on movement highlighted just how useful it can be to think of elements in a sentence being able to move to different positions.

One of the really interesting things about movement is that it seems to be unbounded. In other words, there are apparently no bounds to how far an element can move (I say seems and apparently because there is a lot of evidence to suggest that the situation is far more complex. However, I’ll ignore those details here). We can see this unboundedness in so-called wh-movement (it’s called wh-movement because the element undergoing this type of movement typically begins with the letters wh– in English, e.g. who, what, where etc.). In (1b), the wh-phrase what is interpreted as the direct object of the verb see. Since direct objects in English normally follow the verb, as in (1a), what is also thought to originate in this position (I’ll indicate this original position with ~~what~~ in strikethrough, indicating that it is not pronounced).

(1) a. You saw something

b. What did you see ~~what~~?

The interesting thing is that what can appear arbitrarily far away from its original position.

(2) a. What did you see ~~what~~?

b. What did he say that you saw ~~what~~?

c. What did she think that he said that you saw ~~what~~?

d. What did they believe that she thought that he said that you saw ~~what~~?

e. …

However, the story is much more complicated and interesting. In his 1967 PhD thesis, John Robert ‘Haj’ Ross identified various syntactic ‘islands’. Syntacticians generally take ‘islands’ to be units of structure that elements cannot escape or move from.

We saw in (2) that a wh-phrase can apparently move as far away from its original position as it wants. But now consider the following sentences:

(3) a. I met the man who saw a ghost.

b. I visited the house that you saw a ghost in.

The examples in (3) contain relative clauses (surprise, surprise! See my other posts) – who saw a ghost is a relative clause modifying the noun man in (3a), and that you saw a ghost in is a relative clause modifying the noun house in (3b). In (2), we attempted to move a wh-phrase which originated as the direct object of the verb see. As we saw, the result was a well-formed English sentence. So let’s try to do the same thing with the examples in (3).

(4) a. *What did I meet the man who saw ~~what~~?

b. *What did I visit the house that you saw ~~what~~ in?

The examples in (4) are crashingly bad English sentences (hence the *)! In fact, if I’d put these sentences at the beginning of this post, you’d probably be wondering what on earth I was trying to say. But what’s wrong with them? What’s the difference between the examples in (2) and the examples in (4)?

As Ross observed, the problem with (4) is the relative clause. The relative clause seems to be an island, i.e. wh-phrases cannot escape from it.

There are other types of island besides relative clauses. Consider the examples in (5), which involve two conjoined (or co-ordinated) direct objects.

(5) a. You saw a ghost and a monster.

b. *What did you see ~~what~~ and a monster?

c. *What did you see a ghost and ~~what~~?

As (5b) and (5c) show, we cannot move out of co-ordinate structures (Ross called this the Co-ordinate Structure Constraint).

Relative clauses and co-ordinate structures seem to be very strong islands, i.e. if we attempt to move an element out of such islands, the result is very bad (given how much I’ve worked on relative clauses, I’m in two minds about whether I’m stuck on them because they are strong in the sense of an island paradise which you never want to leave, or in the sense of Alcatraz!).

Other structures seem to be weaker islands, i.e. we can move elements out of them, but the result is not quite fully acceptable (this is marked with a ? at the beginning of the example). An example of a weak island can be seen in (6b) (compare it to (6a), which does not contain an island).

(6) a. What do you think that I saw ~~what~~?

b. ?What do you wonder whether I saw ~~what~~?

The island effect seems to come from the fact that we are trying to move an element out of a subordinate clause beginning with whether. Similar effects are found with subordinate clauses beginning with how, where, who(m), what. They are thus called wh-islands because these islands are introduced by elements typically beginning with wh– in English.

(7) a. You asked how I fixed the car.

b. ?What did you ask how I fixed ~~what~~?
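For the programmers among you, the pattern in (2) and (4)–(7) can be mimicked with a toy checker. This is only a rough sketch in Python under my own assumptions, not a serious syntactic analysis: the way the sentences are encoded, the node labels (RelativeClause, Coordination, WhClause) and the function names are all invented for illustration. It finds the unpronounced original position (written GAP) and looks at every unit of structure that contains it; if one of them is a strong island the sentence gets a *, and if one is a weak island it gets a ?.

```python
# Toy encoding (an assumption for illustration): a node is either a plain
# string (a word or phrase) or a (label, children) pair. "GAP" marks the
# original, unpronounced position of the moved wh-phrase.
STRONG_ISLANDS = {"RelativeClause", "Coordination"}
WEAK_ISLANDS = {"WhClause"}


def path_to_gap(node, trail=()):
    """Return the labels of all nodes containing the gap, outermost first.

    Returns None if there is no gap below this node.
    """
    if node == "GAP":
        return trail
    if isinstance(node, str):
        return None
    label, children = node
    for child in children:
        found = path_to_gap(child, trail + (label,))
        if found is not None:
            return found
    return None


def judge(tree):
    """Return '*' (strong island), '?' (weak island) or '' (no island)."""
    path = path_to_gap(tree)
    if path is None:
        raise ValueError("no gap found in this structure")
    if STRONG_ISLANDS & set(path):
        return "*"
    if WEAK_ISLANDS & set(path):
        return "?"
    return ""


# (2b) What did he say that you saw GAP?
plain = ("Clause", ["he", "say", ("ThatClause", ["you", "saw", "GAP"])])
# (4a) What did I meet the man who saw GAP?
relative = ("Clause", ["I", "meet",
                       ("NP", ["the man",
                               ("RelativeClause", ["who", "saw", "GAP"])])])
# (5c) What did you see a ghost and GAP?
coordinate = ("Clause", ["you", "see",
                         ("Coordination", ["a ghost", "and", "GAP"])])
# (6b) What do you wonder whether I saw GAP?
whether = ("Clause", ["you", "wonder",
                      ("WhClause", ["whether", "I", "saw", "GAP"])])

for name, tree in [("(2b)", plain), ("(4a)", relative),
                   ("(5c)", coordinate), ("(6b)", whether)]:
    print(name, repr(judge(tree)))
```

Notice that the checker gets its results only because I have told it in advance which labels count as islands. It lists them and sorts them into strong and weak, but it has nothing to say about why these structures, and not others, behave this way.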

Although it has been nearly 50 years since Ross first identified his ‘islands’ (and there are many more that I have not mentioned), they continue to pose problems for syntactic theory. A major step was to identify the islands in the first place. This shows how important it is to consider not only what languages can do, but also what they can’t (there’s also the massive question about how we intuitively know that sentences such as those in (4) and (5b,c) are bad). The next step was to understand what makes an island an island (and whether all islands are in fact alike). We can list them and classify them as strong or weak, but ideally we’d want to know why these structures are islands and not others. Attempts have been made (notably by Chomsky (1973), see also the recent overview of the issues by Boeckx (2012)) but the problem still remains.


Boeckx, C. (2012). Syntactic Islands. Cambridge: Cambridge University Press.

Chomsky, N. (1973). Conditions on Transformations. In Anderson, S., & Kiparsky, P. (eds.) A Festschrift for Morris Halle (pp. 232-286). New York: Holt, Rinehart & Winston.

Ross, J.R. (1967). Constraints on Variables in Syntax. Doctoral dissertation, MIT.