What is the that-trace effect?
In English, the subordinating conjunction that is often optional.
(1) You think that John kissed Mary.
(2) You think John kissed Mary.
(1) and (2) are both acceptable sentences in English: that is present in (1) but absent in (2).
When we ask a question about an element inside the subordinate clause, that usually remains optional, as in (3) and (4). Note how who(m) appears in sentence-initial position. However, we still intuitively feel that, in this particular example, it is the direct object of kissed. Since direct objects in English follow the relevant verb (Mary follows kissed in (1) and (2)), we can capture this intuition by putting a trace of who(m), represented as t_who(m), in the position just after kissed.
(3) Who(m) do you think that John kissed t_who(m)?
(4) Who(m) do you think John kissed t_who(m)?
However, there are instances when that is not optional. When we ask a question about the subject of the subordinate clause (corresponding to John in all the examples so far), that must be absent (* means that the sentence is unacceptable).
(5) *Who do you think that t_who kissed Mary?
(6) Who do you think t_who kissed Mary?
The unacceptable configuration involves that followed immediately by a trace, hence this effect is called the that-trace effect (Perlmutter, 1968).
Why is the that-trace effect interesting?
The that-trace effect is interesting in a number of respects, but I’ll just mention two of them. The first is the question of how we, as English speakers, come to ‘know’ that there is a contrast between (5) and (6) given that that is generally optional as we saw in (1) and (2), and (3) and (4). Unless you’ve studied syntax, you’ve probably never been explicitly taught that there exists a that-trace effect in English at all. So how do we learn such an effect? Phillips (2013) looks at how frequent examples like (3-6) are in a corpus of speech directed at children. This is what he found (Phillips, 2013: 144):
(7) a. Who do you think that John met __? 2 / 11,308
    b. Who do you think John met __? 159 / 11,308
    c. *Who do you think that __ left? 0 / 11,308
    d. Who do you think __ left? 13 / 11,308
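To see just how skewed these numbers are, here is a quick back-of-the-envelope calculation (mine, not Phillips’s) turning the counts into percentages:

```python
# Phillips's (2013) counts of the four question types, out of 11,308
# wh-questions in child-directed speech.
counts = {
    "(7a) that + object gap": 2,
    "(7b) object gap": 159,
    "(7c) that + subject gap": 0,  # the unacceptable that-trace configuration
    "(7d) subject gap": 13,
}
total = 11_308
for label, n in counts.items():
    print(f"{label}: {n}/{total} = {100 * n / total:.3f}%")
```

The acceptable (7a) occurs in under 0.02% of wh-questions — barely more often than the unacceptable (7c), which never occurs at all. It is hard to see how a learner could tell these two frequencies apart.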
The corpus contains 11,308 examples of wh-questions (i.e. questions involving the wh-phrases who, what, etc.). Out of the 11,308 examples, there were no examples of the form in (7c), i.e. cases where the subject of the subordinate clause is questioned. This is the configuration that English speakers judge unacceptable. What is particularly interesting is (7a). Out of the 11,308 examples, there were only two tokens where that is present and the direct object of the subordinate clause has been questioned. Yet speakers judge such sentences as acceptable. If examples like (7a) are so rare, why don’t speakers hypothesise that (7c) just happens to be very rare as well? Alternatively, given how rare it is to find that in wh-questions, why don’t speakers hypothesise that that is generally impossible in wh-questions? Either way, it is quite difficult to see how the contrast between (5) and (6) (or (7c) and (7d)) can be acquired purely from child-directed speech. We thus hypothesise that there is something about the way the syntax (of English) works that allows us to ‘know’ about the that-trace effect. This is a classic argument based on the poverty of the stimulus.
The second point of interest comes from the fact that English has a that-trace effect as well as an anti-that-trace effect. The anti-that-trace effect can be seen in relative clauses. In English, we can form relative clauses using that. In general, that is optional in relative clauses just as it is in (1-4) above (we use traces again and the relative clause is in boldface).
(8) The woman that John kissed t_woman is called Mary.
(9) The woman John kissed t_woman is called Mary.
In (8) and (9) we have relativised a direct object; woman is interpreted as the direct object of kissed inside the relative clause.
Now, if we relativise a subject, that is no longer optional. In such cases, that is obligatory.
(10) The man that t_man kissed Mary is called John.
(11) *The man t_man kissed Mary is called John.
Once again there is something special about the relationship between that and the subject of the subordinate clause. However, the effect in (10) and (11) is the exact opposite of the that-trace effect seen in (5) and (6)! As (5) shows, that immediately followed by a trace is unacceptable; that must be absent, as in (6). In (10) and (11), the situation is reversed. As (10) shows, that immediately followed by a trace is acceptable; the absence of that results in unacceptability, as in (11). We thus call the effect in (10) and (11) the anti-that-trace effect.
The problem for us, then, is that there is something about the syntax of English that allows us to ‘know’ that the that-trace effect exists, but which also allows the existence of its opposite, the anti-that-trace effect. The challenge, which I am working on at the moment, is to find out what that something is!
Perlmutter, D. M. (1968). Deep and surface structure constraints in syntax. Doctoral dissertation, MIT.
Phillips, C. (2013). On the nature of island constraints II: Language learning and innateness. In J. Sprouse & N. Hornstein (Eds.), Experimental Syntax and Island Effects (pp. 132–157). Cambridge: Cambridge University Press.
A couple of weeks ago my college held a graduate symposium, and it was a broad-ranging and very interesting day, with presentations from computer science, psychology, physics, classics, and even a ‘three-minute thesis’ on lucid dreaming. A colleague of mine gave a talk entitled ‘Telling time: time in the world’s languages’ (the theme was time – a linguist’s bounty!), in which he gave us a whistle-stop tour of languages with grammatically encoded aspect, languages with three tenses (like English), with just two tenses (nonpast and past, or non-future and future), and with no tenses at all. Languages like the Chinese varieties don’t have any grammatical inflection to indicate tense (as we have in English, by adding -ed to regular verbs in the past tense, for example).
In the question time, various members of the audience voiced surprise at how this might be. How can speakers of languages without tenses talk about time? How on earth do they get along? The intuition here, nurtured by our own linguistic experience, is that the ambiguity of tenseless sentences would be insurmountable. If someone hears a sentence like ‘John eat cake’, which could mean John eats cake, John will eat cake, John ate cake, John had eaten cake, and so on, how are they to know which meaning is intended?
This raises the question: just how rife is ambiguity in language? And is it all that terrible anyway? Of course, in fine rhetoric, technical writing, and especially legal language, there is good reason to avoid ambiguity. In other corners of language, however, such as poetry, there is reason to embrace it:
Never seek to tell thy love
Love that never told can be
For the gentle wind does move
(William Blake, referenced by Grice, 1975)
In lexical semantics, we try to carefully distinguish ambiguity from polysemy and vagueness (although you can probably think of examples where they mesh).
And when you start looking, ambiguity is all around you. Besides lexical ambiguity (some more examples include ‘run’, ‘port’, ‘rose’, ‘like’, ‘cleave’, ‘book’, and ‘light’), there’s syntactic ambiguity, where the structure of a sentence allows for more than one interpretation, of which some infamous examples are:
Visiting relatives can be boring.
The chicken is ready to eat.
And semantic scope ambiguity, like this:
Every Cambridge student has borrowed some books.
> every Cambridge student has borrowed some (or other) books
> there are some (particular) books that every Cambridge student has borrowed.
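The difference between these two readings can be made explicit in predicate logic (a standard first-order rendering, not part of the original example), where it comes down to which quantifier takes wider scope:

```latex
% Surface scope: 'every student' outscopes 'some books'
\forall x\,\big[\mathrm{student}(x) \rightarrow \exists y\,[\mathrm{book}(y) \wedge \mathrm{borrow}(x,y)]\big]

% Inverse scope: 'some (particular) books' outscope 'every student'
\exists y\,\big[\mathrm{book}(y) \wedge \forall x\,[\mathrm{student}(x) \rightarrow \mathrm{borrow}(x,y)]\big]
```

On the first reading the books may vary from student to student; on the second there is a fixed set of books common to all of them.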
So it is beginning to look like (accidental) ambiguity may not be such a rare thing. But isn’t this a deficiency in our linguistic system? If hearers are constantly having to work out what speakers mean, isn’t that a lot of effort on their part? Well, I think that kind of worried response assumes that speakers and hearers are just like transmitters and decoders. Or, to put it another way, it follows the ‘mind as computer metaphor’ – programming languages do not contain ambiguity, after all. But that’s precisely because machines that parse code are rather different from sophisticated human minds that are able to make subtle and fast inferences about speakers’ intentions. (And, besides, this ambiguity doesn’t seem to have stopped us communicating pretty well so far).
Today I read an intriguing article by Piantadosi, Tily & Gibson (2012) which sets out two reasons why ambiguity is actually expected in a communicative system like language. Firstly, given that words and phrases occur in a context which is itself informative (the preceding discourse, the social and world context, the speaker’s and hearer’s background knowledge), disambiguating information encoded lexically could actually be redundant, and an efficient language will not convey redundant information. Secondly, they follow Zipfian principles which suggest that ambiguity may arise from a trade-off between ease of production (lazy speakers who want to say the minimum) and ease of comprehension (lazy hearers who want maximum clarity and minimum work of interpretation). Importantly, production – the articulation of utterances – seems to be ‘costly’ (whatever that means in terms of physical / psychological processes), while inference – interpreting potential ambiguity – seems to be relatively cheap. So where an ‘easy’ word form in terms of production has two distinct meanings that usually turn up in different contexts, this is an overall win for the linguistic system, compared to having one ‘easy’ and one ‘hard’ word form for the two different meanings. Crucially, though, this relies on communicators who are adept at pragmatic inference, as Grice and other pragmaticians have long proposed.
So coming back to our example of Chinese and other ‘poor’ languages without tense: besides the other strategies they have for expressing temporality, like adverbs (today, yesterday, now, in the past, etc.), their speakers can safely assume that their hearers are able to make the necessary pragmatic inferences, given the context, to work out what the speakers intend to communicate, thereby resolving ambiguity and, perish the thought, avoiding miscommunication.
Debate around the centenary of the Armenian ‘genocide’ on April 24th, 2015, has centred on the use of one specific word: ‘genocide’.
Several angles deserve consideration: history, politics / diplomacy, sociology, legal issues and indeed linguistics. Naturally, linguistic considerations are connected with others and in turn take on several facets: etymology (origin of the word), semantics (meaning), sociolinguistics (here identificational / social effects of language) and especially pragmatics (underlying implications, (intended) effects of word on the recipient). The latter, inferred / added meaning due to the context, is most relevant with the term ‘genocide’, as it extends into the political / legal (with its precise terminology).
From the origins of various “totemic” words in the semantic field (‘genocide’, ‘holocaust’ and ‘shoah’), one can identify (intended/implied) meanings and finally consider further implications.
‘Genocide’ is a hybrid formation from Greek γένος (genos, ‘race, people’) and Latin caedere (‘to kill’). It signifies the (intended) extermination of a whole people in a particular area and was coined around 1943–44 by the Polish-Jewish lawyer Raphael Lemkin. (1) Notably, Lemkin created the term with the Armenian (as well as the Jewish) case specifically in mind. On a simple reading, then, the term ‘genocide’ automatically applies to the Armenian case. (2)
Before ‘genocide’, the term ‘holocaust’ was already in use. It comes from Greek ὁλόκαυστον (holokauston, “something wholly burnt”), originally used in the Greek version of the Bible to signify “burnt offerings”. It underwent semantic change to mean “massacre / total destruction”. It was possibly first associated with burning people, as its use by the journalist Leitch Ritchie in 1833 suggests: 1,300 people were burnt in a church in Vitry-le-François in 1142. The word then underwent extension of meaning to other cases and methods of killing. Contemporaries of the Ottoman atrocities, including Churchill, used it to refer to the Armenian case. In the aftermath of World War II, the term came to be applied specifically to the killing of over six million Jews and others by the Nazis, mainly from the 1950s onwards, as a translation of Hebrew ‘shoah’. Today, ‘the Holocaust’ is generally directly associated with that particular Jewish holocaust. The term is still in the process of (semantic) restriction / specification (i.e. it is not yet exclusive to that context).
In Israel, the word used is השואה ‘ha-shoah’ (originally meaning ‘destruction’ or ‘calamity’ in Hebrew; it reflects the experience of the Jewish people, arguably the most traumatising and horrific experience imaginable). It is also preferred by many scholars, since its usage (in Hebrew) precedes the use of ‘holocaust’, and it can be seen as a sign of respect to use the term used by the victims of the crime against humanity. Additionally, however, some perceive a certain inappropriateness in the term ‘holocaust’ on a historical / theological and hence pragmatic level, given its original meaning of a “burnt offering” to God. The Jewish-American historian Laqueur reckons that “it was not the intention of the Nazis to make a sacrifice of this kind and the position of the Jews was not that of a ritual victim”. Interestingly, the Armenians prefer a similar term (Մեծ Եղեռն – Medz Yeghern: “Great Evil-Crime”, often rendered as “Great Catastrophe”) over the loan translation of ‘genocide’ (Հայոց ցեղասպանություն – hayots tseghaspanutyun). (3) The Armenian term accuses the perpetrators even more clearly than shoah does. (The danger of either relativising the experiences by comparing them, or isolating the events from a general human context by seeing them as unique, is another issue.)
On the pragmatic level, US President Obama used the Armenian word in his commemoration speech. Reactions were mixed, some suggesting that he was diplomatically trying to avoid the term ‘genocide’ by using the native yet internationally unintelligible term. Either way, avoidance of the term ‘genocide’ was clearly an elegant way to get out of a diplomatic dilemma.
Some say denying a crime like genocide further victimises / dehumanises the victims and their descendants by not paying them the respect of acknowledging their suffering. This – the perception and effect of words – is where the use of the ‘right’ language is crucial. Legal implications are another issue: were Turkey to use the word ‘genocide’ in an official capacity, it would expose itself to renewed reparation and land-return claims – similar to Greek Prime Minister Tsipras’s recent demands to German Chancellor Merkel for reparations from World War II. Freedom of speech is another sensitive topic on both sides: Orhan Pamuk (the Turkish Nobel laureate) was charged with insulting the state for publicly acknowledging the ‘genocide’, while conversely there are ongoing international legal proceedings between Doğu Perinçek and Switzerland over his conviction for maintaining that the events did not constitute ‘genocide’ (in its common definition) – emphasising the importance of precise language use, especially in a legal context.
Like Obama in terms of picking words carefully, German President Gauck in a remembrance speech all but labelled the events ‘genocide’, at the same time shrewdly abstracting from the individual case so as not to explicitly equate it with the term ‘genocide’. (4) Pragmatically (and diplomatically), these solutions leave enough room for interpretation through the use of a specific word, either in a particular language or in a proposition (statement) and implicational context.
A historically noteworthy fact concerning the relevance of language as a tool of identity: in the same context as the 1915 massacres – the Turkification of the ‘nation’ and the new Turkish state – Atatürk, the Father of the Turks, also Turkified the language soon after. In Ottoman times it had used the Arabic script, as well as many Persian and Arabic words (today either obsolete or optional in ‘Turkified’ Modern Turkish).
That this was a conscious process (at least linguistically) to create a Turkish identity suggests, firstly, that the use of the term ‘genocide’ may ultimately be appropriate, and secondly that this ideological fact (the intrinsic link between the events of 1915 and the establishment of the new, consciously Turkish state; compare the ongoing conflict with the Kurds) is part of the reason (besides legal and psychological ones) why Turkey remains reluctant to use the term ‘genocide’.
Some relevant quotations:
(1) “By ‘genocide’ we mean the destruction of a nation or of an ethnic group. This new word, coined by the author to denote an old practice in its modern development, (…) does not necessarily mean the immediate destruction of a nation, except when accomplished by mass killings of all members of a nation. It is intended rather to signify a coordinated plan of different actions aiming at the destruction of essential foundations of the life of national groups, with the aim of annihilating the groups themselves. Genocide is directed against the national group as an entity, and the actions involved are directed against individuals, not in their individual capacity, but as members of the national group”.
(2) “I became interested in genocide because it happened so many times. It happened to the Armenians, then after the Armenians, Hitler took action.”
(3) Turkish has Ermeni Soykırımı, literally ‘Armenian race-massacre’.
(4) “The fate of the Armenians is exemplary for the history of mass destruction, ethnic cleansing, expulsions and indeed genocides, which marks the 20th century in such a horrific way.”
Before I started my PhD project, I worked as a part-time translator for a reading club; my main task was to translate British detective fiction into Chinese for a small-scale, in-group “publication”. When I was asked about my work, some annoying people believed the job was rather easy; to put it in their own words, “if you give an English–Chinese dictionary a typewriter, it will do all the work for you” – which, I think, is a prejudiced and unfair comment, not only on the translator, but also on the task of translation. Once we examine the nature of translation, we find that it is not a simple “word-matching” game. In fact, translation involves a much more complicated mechanism, one that can cause great frustration, because we will always end up somewhat lost in translation no matter how hard we struggle. Today I would like to name a few issues that become notable if we analyse “translation” from the perspective of semantics and pragmatics; given my knowledge and preferences, I will focus on translation between English and Chinese.
The basic form of translation, if we follow the common belief, is to find a group of words in the target language that correspond to the words in the source language, and then to form a sentence using all these words. That is also the basic practice of students who are starting to learn how to translate. The fundamental problem, then, comes from “finding the corresponding words”: what degree of correspondence can there be? For an ideal translation, we would need to find pairs of words in the two languages whose meanings are strictly equivalent to each other: they would have the same fixed, unambiguous meaning, or, if the words are ambiguous, they would need to be ambiguous “in the same way”, with the ambiguous meanings forming a one-to-one match.
At this stage, polysemy can only make matters worse, so we had better limit our discussion to words with a “fixed, unambiguous” literal meaning. Not many words in English hold a stable and unambiguous meaning; we could certainly name a few, but the number is rather small compared with the vocabulary of an average native speaker. One simple way to select such words is to refer to David Kaplan’s (1989) concepts of character and content. Briefly, the character of a word is the “meaning” set by linguistic convention, while the content is the “meaning” determined by the character in a particular context. Following that definition, words with a stable character suit our requirement. In Kaplan’s categorisation, that class only includes indexical expressions (e.g. I, here, now) and proper names (e.g. Chris Xia, University of Cambridge). If we extend the definition of proper names, we can also include a number of abstract scientific concepts, for instance helium and the Higgs boson, but the limitation is clear: only a limited number of words can have a fully definite literal meaning across different languages, and for other words we can only expect “roughly the same” counterparts in another language.
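For readers who like things concrete, Kaplan’s distinction can be sketched as a toy program (entirely my own illustration, not Kaplan’s formalism; the context keys are invented): a character is modelled as a function from a context of utterance to a content.

```python
# Toy model: a character maps a context of utterance to a content.
def character_I(context):
    """The character of 'I': its content is whoever is speaking."""
    return context["speaker"]

def character_here(context):
    """The character of 'here': its content is the place of utterance."""
    return context["location"]

ctx = {"speaker": "Chris Xia", "location": "Cambridge"}
print(character_I(ctx))     # → Chris Xia
print(character_here(ctx))  # → Cambridge
```

A word with a stable character behaves the same way across languages; what varies with the context of utterance is only the content the function returns.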
Even if we can successfully identify a set of words with strictly the same literal meaning (namely the same character) in English and Chinese, the situation is still more complex than we might expect. Take pronouns as an example – pronouns are usually the perfect illustration of indexical terms, but the conventional use of pronouns differs between languages. Standard Chinese clearly differentiates two second person singular pronouns, ni and nin. Nin is the honorific form, generally used to address strangers and seniors, while ni is more casual and can be used between family members and friends. When a piece of Chinese text exploiting the difference between ni and nin is to be translated into English, the translator can only reduce the two forms to the single word “you”, if she does not want to bother her readers with the archaic form “thou”. Although both “you” and ni/nin consistently refer to the person addressed in the conversation, the honorific form conveys additional information that is lost in the translation. I do not intend to complain that English has lost the plain form of its second person singular pronoun, but it does create obstacles in the process of translation. Finding strictly corresponding words is far from guaranteeing an ideal, readable translation, and that is also the reason that a strictly literal translation never really exists.
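The loss can be pictured as a many-to-one mapping (a deliberately minimal sketch of the point above):

```python
# Translating Chinese second-person singular pronouns into English is
# many-to-one: the honorific/casual distinction collapses onto "you".
to_english = {"ni": "you", "nin": "you"}

def back_translate(english_word):
    """All Chinese pronouns the English word could have come from."""
    return sorted(zh for zh, en in to_english.items() if en == english_word)

print(back_translate("you"))  # → ['ni', 'nin']: the register information is gone
```

Because the mapping is not invertible, no choice of English wording can recover which form the Chinese original used.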
Let’s take one step back and select some words that are “roughly the same” across different languages; that is much easier for translators, and is the common practice of both professional and amateur translators. The notion of “roughly the same” already implies that the definition of a particular word differs between the two languages, which means that the actual concept represented by the selected words can be slightly different across the two languages. Usually such a difference will not cause any comprehension problem, but it can still lead to miscomprehension or difficulty of understanding in the translation of certain texts, and such miscomprehension is clearly language-related, or even culture-related. Imagine an excerpt of conversation in a British storybook for children: a mother says to her son, who is extremely picky about his food, “you should eat more vegetables and fewer potatoes – here, eat some sweetcorn.” That utterance is perfectly natural in British English, because British people categorise potatoes as starchy food rather than as a kind of vegetable, while sweetcorn is classified as a vegetable. If, however, a translator follows the literal correspondence and translates this piece of advice into Chinese as “you should eat more shucai, and fewer tudou – here, eat some yumi”, the Chinese readers will surely be confused. Chinese regards tudou (potatoes), but not yumi (sweetcorn), as a member of shucai (vegetables): how can a child eat more vegetables by eating more sweetcorn rather than potatoes? That does not make any sense in Chinese. The sense of the text is distorted by the mismatch between concepts, even though the translator has made a perfectly conventional literal translation. This variance in concept encoding is another problem that translation can hardly solve, and it lies well beyond the simple translation of literal meanings.
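The category mismatch can be made vivid with two small sets (their contents simplified from the example above; the extra members are my own illustration):

```python
# The translation-equivalent category words pick out different sets of things.
vegetable_en = {"sweetcorn", "carrot", "cabbage"}  # potatoes count as starchy food
shucai_zh = {"potato", "carrot", "cabbage"}        # yumi (sweetcorn) is not shucai

# The literal translation vegetable -> shucai silently changes the category:
print(vegetable_en - shucai_zh)  # → {'sweetcorn'}: a vegetable in English only
print(shucai_zh - vegetable_en)  # → {'potato'}: a shucai in Chinese only
```

The words correspond, but the sets they denote do not, and it is the sets that carry the mother’s advice.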
Even if we manage to overcome the difficulties of word selection and concept matching, some more complicated situations still await us. To give you a taste of such a situation, I would like to present a case that happened to me. When we discussed the conversational maxims proposed by Grice in my masters programme, one of my classmates, who was born and raised in Britain, told us that “I am washing my hair” is a poorly-formed excuse if you want to decline someone’s invitation on the telephone. Later I described the setting of the conversation in Chinese to my friends from mainland China: “if you phone your friend to ask her whether she would like to go shopping with you on a sunny Saturday afternoon, and her reply is ‘I am washing my hair’, what would you think about her reply?” To my surprise, 80% of them told me, “That’s fine! She is asking me to wait for her, and we will go after she finishes washing her hair.” The difference in this example comes from the derivation of implicatures in different languages, which lies wholly beyond words and sentence structure, and cannot be eliminated even if we carefully control the selection of words and the formation of individual sentences in the process of translation. The difference will persist however you change the wording, unless the translator chooses to spell out the implicature.
Following my analysis, you may feel that an ideally faithful translation is almost a mission impossible; and trust me, it is. Since the emergence of literal meanings of words as well as, potentially, implicatures of utterances is conventional within a particular language, it is difficult to balance these conventional meanings cross-linguistically. Any form of translation will lead to the loss of some information, and the only difference is the form and the amount – either explicit information or implicit information, either more or less. Not just a matter of study for translation theories, this is also a long-term question in the field of second language comprehension and cross-linguistic pragmatics, and I believe that some relevant research may finally help to rescue this kind of loss in translation.
Kaplan, David. 1989. ‘Demonstratives’, in Joseph Almog, John Perry and Howard Wettstein (eds.), Themes from Kaplan (Oxford: Oxford University Press), pp. 481–563
Around 15 months ago, a small group of Linguistics PhD students found themselves in that most idea-and-optimism-inducing place, the local watering hole, and hatched a plan to enter the world of web logging (or blogging, to you and me). And exactly a year ago, Cam Lang Sci was born. As well as creating more of a local linguist community feel by finding out about each other’s interests, we wanted to share with you, our dear readers, some of the many facets of Language Science that make us tick. There have been posts on sounds – and silent sounds; on words, verbs, and relative clauses; on languages dying and languages being born; on gender and language, nations and language, and thought and language. We’ve asked: What’s in a name? Where have all the implicatures gone? And, are you a Belieber? And in doing so, we’ve dipped our toes into many of the foundational areas of Linguistics: phonology, morphology, syntax, semantics, and pragmatics. And more: sociolinguistics, bilingualism, acquisition and historical linguistics. Even the practice of being a linguist itself. But there’s much more to Linguistics than all that! So we hope you’ll join us for another year of Cam Lang Sci, continue to be part of our growing readership (now almost a thousand visits a month) and, in the meantime, take the chance today to sit back and catch up on some of those posts you may have missed.
Oh, and you can now follow us on Facebook, too.
A number of years ago it was observed that there is more than one type of intransitive verb (i.e. of verbs describing an action, state or change in which only one noun is involved – many examples follow). This is true of many languages, including English.
Sometimes, the subjects of intransitive verbs behave like the subjects of transitive verbs (verbs with two nouns involved, as in Lucy plays football, Harry loves Ancient Egyptian etc.). For example, a typical property of the subject of a transitive verb is that it can be described with the verb plus the suffix -er, e.g. player “one who plays”, lover “one who loves”.
Some intransitive subjects can also be so described: worker, talker, swimmer, runner etc. But others can’t: we don’t say arriver “one who arrives”, dier “one who dies”, beer “one who is” and so forth …
But there is evidence that the subject of many of this latter group of intransitive verbs actually functions a bit like a transitive object. The object of a transitive verb can often be described with a form of the verb called the past participle, placed before the noun, for example the much-loved library or the half-eaten hamburger.
Note that often an extra adjective or other modification is required for the sentence to make sense: the loved library is a bit of an odd thing to say but the much-loved library is fine; likewise we probably wouldn’t say the eaten hamburger but we might say the recently eaten hamburger or the half-eaten hamburger.
We can do a similar thing with some intransitive verbs: the recently arrived guests, the fallen leaves.
This suggests the subjects of these verbs actually behave like the objects of transitives, as previously discussed. Note as well that these intransitive verbs are amongst those which can’t take the -er suffix, and furthermore that verbs which can take this suffix don’t allow the pre-noun past participle construction: we can’t say the talked man “the man who talked” or the swum woman “the woman who swam”.
This idea that some intransitive verbs at one level really have objects, instead of subjects, is for various complex reasons referred to as the unaccusative hypothesis. Of course, at another level these “objects” do behave like subjects: e.g. in a sentence they usually come before the verb, and if they take the form of pronouns they have the subject forms I, he, she rather than the object forms me, him, her. Thus we say I fell, not *fell me, and so on. There are also various complications (which I won’t go into here, because they are, well, complicated) which suggest the unaccusative hypothesis in its simplest form may be inadequate, and that some refinement is needed.
However, the hypothesis still usefully highlights a couple of things which seem to come up again and again in modern linguistics. Firstly, the way things appear “on the surface” may mask other properties. The noun Lucy looks like a subject in both Lucy talked and Lucy arrived, but in fact in the latter it seems also to have some “underlying” object properties which don’t show up so easily. Secondly, close inspection of the details of a language – including details which we might not think obviously relate to whatever it is we’re thinking about – can help reveal these underlying properties. It is this sort of close attention to detail that has allowed many of the advances in linguistics that have been made over the course of the last few decades. And even the complications I just alluded to may be useful here, in helping us come up with an even better theory.
What’s in a colour? It is often claimed that more than 80% of the information we receive comes in through vision, so the way we colour the world around us is quite important. Have you ever thought that the sky could be other than blue, and the grass other than green, in other languages? These colour categories seem universal, but in fact they are not.
Guy Deutscher described the problem beautifully in his book ‘Through the Language Glass: Why the World Looks Different in Other Languages’, but here I will try to give a very brief overview of the linguistic ‘colour issue’ and to mention some interesting points for further analysis.
In fact, it was William Gladstone who first noticed these colour differences (or at least brought them to public attention) and set the stage for the colour debate. In 1858 he published his study of Homer, in which he asked why the ancient poet described the sea as wine-dark, honey as green, and sheep and iron as violet. Why were his skies never blue, but iron or copper? These oddities cannot be blamed on Homer being blind or colour-blind, since other ancient Greek writers (along with the authors of the Indian Vedas, the Bible and early Chinese texts) shared this worldview. Gladstone conjectured that colour vision itself was anatomically underdeveloped in the ancient world and had evolved since. But evolutionary studies have shown that humans must have had the same degree of colour vision for millennia, which means that our vision is hardly different from Homer’s.
But if colour distinctions are not determined by anatomy, are they formed by language? Does language really determine, or at least influence, the way we colour the world around us, as the Sapir-Whorf hypothesis supposes?
We tend to think of colour names in terms of our basic 11-colour paradigm, but this is not typical of all languages. For example, Russian has two words for blue (‘goluboy’ and ‘siniy’), distinguishing light blue from dark blue. At the same time, pink is not considered a ‘basic’ colour in Russian, but rather a very light hue of red. Polish also has two words for blue – ‘niebieski’ and ‘granatowy’ – but their semantics differs from that of Russian ‘goluboy’ and ‘siniy’. Japanese originally didn’t distinguish between blue and green, having only one word for both hues – ‘aoi’ (roughly blue, but covering a far broader range of shades than in English). Nowadays, under the influence of the European tradition, the semantics of the word has shifted towards our ‘classical’ blue, and green is described with another word – ‘midori’. Nevertheless, grass is still ‘aoi’ in Japanese, as is a green traffic light. Some New Guinea Highland languages have terms only for ‘dark’ and ‘light’. The Hanunó’o language, spoken in the Philippines, has only four basic colour words: black, white, red and green. The Pirahã language, spoken by an Amazonian tribe, is said to have no fixed words for colours at all. According to Dan Everett, if you show its speakers a red cup, they’re likely to say “This looks like blood”.
The issue of whether the colour spectrum is carved up into categories at random, or whether there are universal constraints on where these categories form, has long been at the centre of linguistic, anthropological, psychological and philosophical debate. Some interesting discoveries have been made, the most famous being that of Berlin and Kay (1969). Treating colour cognition as an innate, physiological process, they argued that colour words emerge in all languages in a predictable order. They identified eleven possible basic colour categories (white, black, red, green, yellow, blue, brown, purple, pink, orange and grey) and found that the colours follow a specific evolutionary pattern: black and white come first, then red, then yellow and green (in either order), and finally blue. Researchers have tried to find explanations for this pattern in nature. Red is probably first because it is the colour of blood and one of the easiest dyes to make in the wild; green and yellow are the colours of plants; and blue comes last because – with the exception of the sky – few things are blue in nature.
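Berlin and Kay’s ordering can be thought of as an implicational hierarchy: a language only acquires a term for a later colour once it has terms for all the earlier ones. Here is a toy sketch of that idea – my own simplification, collapsing their stages and ignoring the later colours (brown, purple, etc.), not their actual formalism:

```python
# Simplified emergence order of basic colour terms (Berlin & Kay 1969).
BK_STAGES = [
    {"black", "white"},   # stage I
    {"red"},              # stage II
    {"yellow", "green"},  # stages III-IV (either order)
    {"blue"},             # stage V
]

def consistent_with_hierarchy(terms):
    """A set of basic colour terms fits the hierarchy if no later stage
    is fully present while an earlier stage is missing."""
    seen_gap = False
    for stage in BK_STAGES:
        if stage <= set(terms):      # this whole stage is represented
            if seen_gap:
                return False         # ...but an earlier stage was missing
        else:
            seen_gap = True
    return True

print(consistent_with_hierarchy({"black", "white", "red"}))   # True
print(consistent_with_hierarchy({"black", "white", "blue"}))  # False: blue without red
```

On this sketch, a language with terms for black, white and blue but not red would violate the predicted pattern, which is exactly the kind of system Berlin and Kay claimed not to find.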
Though the theory was much criticized later on, it revolutionized and revived colour studies, which had lain dormant for almost 100 years. In the last few decades a whole series of experiments has been carried out to find out whether speakers of ‘colour-deficient’ languages can see all the colours and distinguish between them.
The fact is that we all see more or less the same. If asked to pick out a lighter or a darker colour, most of us would do it correctly. But a number of tests have shown that people can remember and sort coloured objects more easily if their language has a name for that colour. For instance, bilingual (French-Wolof) children in Senegal distinguished between red and orange faster than monolingual Wolof children: Wolof has only one word for these two hues, whereas French has two. And in some tests Russian speakers were faster than English speakers at distinguishing certain shades of blue (since, as already mentioned, Russian has two different terms for light and dark shades of blue).
It goes without saying that linguists are interested in describing the semantics of colour terms not just in one language, but across a range of different languages. Yet comparisons via translation (like ‘niebieski’ = blue; ‘siniy’ = blue; ‘aoi’ = blue) are simply inadequate: the semantics, collocations and connotations of every word are unique.
It seems to make sense, then, to look at colour categories from a cognitive perspective. The way people of different cultures set boundaries in the spectrum and distinguish between certain hues depends on how they conceptualize colours, rather than on how they perceive them. Anna Wierzbicka argues that colour concepts are bound to certain universals of human experience, such as day and night, sun, fire, vegetation, sky and earth. The number of colour terms may depend on how important it is for a given culture to distinguish between them. For example, yellow and red hues are more relevant for Southern cultures (because of sun, sand, etc.) than blue, green or black, which may all seem equally significant there. Why invent three different words for a phenomenon conceptualized as one? Language strives for economy, hence the differences in the number of colour categories.
A curious afterthought to all this ‘colour debate’ is that colours are used in very different ways in idioms across languages. For example, in English one argues until one is blue in the face, whereas in Russian one would definitely turn red. In English hair goes grey, whereas in Russian it is rather white.
And what would you say of this short advertisement?
Green bags available in seven colours.
(Placed at Cambridge University Press bookshop)
In this context it seems to acquire far more hidden meanings. 😉
When Charles Darwin (eventually) published On the Origin of Species by Means of Natural Selection in 1859, one species on whose origins he remained deliberately quiet or, at most, vague was Homo sapiens, i.e. us. That humans had evolved from a ‘lower animal’ was profoundly controversial in Darwin’s time (I say was, but it remains controversial for many even now, as Richard Dawkins continually reminds his readers). One of the major difficulties lay in accounting for the differences between humans’ mental capacities and those of other animals, and one of the chief differences concerned the evolution of language.
Darwin addressed the evolution of Homo sapiens in The Descent of Man published in 1871. It’s quite a hefty work and dedicates all of about 10 pages to the evolution of language, but those pages are full of insights and observations and many of the ideas and conclusions were ahead of their time. I’ll run through some of them, but it is well worth reading the original!
Darwin made the crucial distinction between articulate language, which he said was peculiar to humans, and things such as cries, gestures, facial expressions, etc, which are found in many species besides humans. This distinction is often blurred, especially when people talk about the evolution of “communication” as opposed to the evolution of language.
It is also common to blur the distinction between speech and language, but Darwin was careful to separate the two, with language being primarily a mental capacity. Drawing on several observations, Darwin concluded that the defining aspect of language was not “the understanding of articulate sounds”, nor “the mere articulation”, nor even “the mere capacity of connecting definite sounds with definite ideas”, all of which are found in some species or other. Instead, “the lower animals differ from man solely in his almost infinitely larger power of associating together the most diversified sounds and ideas; and this obviously depends on the high development of his mental powers” (the aspects of language which are uniquely human and those which are not might nowadays be referred to as the Faculty of Language in the Narrow (FLN) and Broad (FLB) Senses respectively (see Hauser, Chomsky & Fitch 2002)). Whatever exactly is meant by this “almost infinitely larger power” (a vast lexicon, or the principle of compositionality, maybe?), the major point here is hard to miss – language is primarily a mental faculty and its evolution is tied in with the evolution of human cognition.
Darwin makes a very interesting analogy with birdsong. Songbirds show an instinctive tendency to sing and go through a ‘babbling’ stage, but ultimately they learn the particular song of their parents. In the same vein, particular languages have to be learned, but there is an instinct to learn a language in the first place. Over evolutionary time, speaking and singing led to modifications in the vocal organs, but Darwin pointed out that “the relation between the continued use of language and the development of the brain has no doubt been far more important.” As evidence that the brain and language are connected, Darwin observed that there are cases of “brain-disease” which specifically affect language or parts of language. In these observations, Darwin was using comparative and neurological evidence – a highly interdisciplinary undertaking!
Darwin argued that things like the human capacity for concept formation evolved from more rudimentary capacities. He showed that many animals form concepts and do so without language. Language is thus not a pre-requisite for concept formation, as was argued by some at the time. Incidentally, the claim in some of the recent Minimalist literature that the appearance of the operation Merge (the operation used to form sets) was the break-through moment in the evolution of language strikes me as implausible: concept formation presumably involves the ability to form sets and determine whether an entity is a member of a set or not, and if non-human animals can do this, then Merge presumably existed before language.
There’s a nice summary of Darwin’s arguments and the history of ideas about language evolution in Fitch’s book The Evolution of Language, including an entertaining section on the intellectual battle between Darwin and the linguist Max Müller. Darwin’s ideas on language evolution are by no means the final word on the matter (to the extent that there ever can be a final word on this subject), but they go to show that careful observation, interdisciplinary evidence, and the courage and perseverance to pursue ideas despite various (apparent) challenges and problems can be both fruitful and illuminating.
Fitch, W. T. (2010). The Evolution of Language. Cambridge: Cambridge University Press.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science 298: 1569–1579.
You may remember that back in February I wrote about that little not-so-innocent word, ‘again’. It turned out to be a tricksy linguistic nut to crack because it appears to have two meanings. Our example was the following:
Frederick opened the door again.
This can be uttered in two different contexts, giving a repetitive and restitutive meaning of ‘again’:
a. Frederick opened the door, and he had opened it before.
b. Frederick opened the door, and it had been open before.
This can either be taken as a simple case of polysemy, or as a single repetitive meaning attaching to different parts of a verb’s ‘internal’ meaning:
a. again (CAUSE Frederick (BECOME (open the door)))
b. CAUSE Frederick (BECOME (again (open the door)))
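The scope contrast in (a) and (b) can be mimicked in a toy computational model. This is purely an illustrative sketch of my own (the state/event bookkeeping is invented, not part of any formal semantics): the restitutive reading presupposes a prior open *state*, the repetitive reading a prior opening *event*, and the two can come apart.

```python
# History of the door's state at successive times: True = open, False = closed.
# Here the door was open once, then closed -- but nobody ever opened it.
door_history = [True, False, False]

# Log of past opening events (e.g. (agent, patient) pairs) -- empty here.
past_openings = []

def restitutive_ok(state_history):
    # 'again' scopes over the result state (reading b):
    # presupposes only that the door was open at some earlier time.
    return any(state_history)

def repetitive_ok(event_log):
    # 'again' scopes over the whole causing event (reading a):
    # presupposes that a prior opening event took place.
    return len(event_log) > 0

print(restitutive_ok(door_history))  # True: the door had been open before
print(repetitive_ok(past_openings))  # False: nobody had opened it before
```

The point of the sketch: in this scenario “Frederick opened the door again” is felicitous on the restitutive reading but not on the repetitive one, which is exactly what the two attachment sites predict.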
Today I’ll consider another puzzling aspect of this adverb.
In my last post I may have given the impression that ‘again’’s interesting properties only show themselves when it meets telic verbs like ‘open’ and ‘close’. Well, I’m sorry to admit that that’s not the whole story. What about these cases?
The road widened again.
The shares fell again.
On the face of it, these look just like our open/close cases. There seems to be an ambiguity between the repetitive and restitutive reading – they can comfortably occur in contexts where the event is repeated, and where there is a counterdirectional movement. This is how von Stechow (1996) treats them, at any rate, suggesting an analysis of (BECOME[MORE [low]]), similar to (CAUSE[BECOME[open]]) – low and open are the end states.
But hang on, is there (always) an inherent endpoint for these verbs – are they telic? A classic, though not unproblematic, test of telicity is the ‘in X time / for X time’ test. Usually atelic verbs sit happily with a ‘for X time’ adjunct, while telic ones do not, preferring instead ‘in X time’. For example:
Ben read for an hour / *in an hour.
Belinda opened the window *for a second / in a second.
(NB ‘for a second’ sounds okay – but on a different reading, that Belinda opened the window and closed it again after a second, not that it took one second to open the window).
Note that telicity isn’t a property of verbs per se, at least not in isolation, but of predicates, because the arguments the verb takes affect it. We can quite happily, although rather improbably, say
Ben read Crime & Punishment in an hour.
And actually, we can frame the event of Ben’s reading Crime & Punishment in two ways, that seem to make it either telic or atelic, because this is also fine:
Ben read Crime & Punishment for an hour.
The same flexibility shows up with other predicates:
Julie walked around the park for 20 minutes / in 20 minutes.
With the idea of telicity under our belt, let’s return to our degree achievement verbs (like ‘widen’ and ‘cool’) and directed motion verbs (like ‘rise’ and ‘fall’). What happens when we run them through the telicity test?
The road widened for 2 metres.
?* The road widened in 2 metres.
(NB this sounds fine on the reading of ‘in 2 metres further down the road’, but that is not the one we’re interested in here)
The shares fell for 3 days.
?The shares fell in 3 days.
These examples seem happier (at least to me – what about for you?) with ‘for X time’. So they don’t have an inherent endpoint (after all, we know nothing about the endpoint, apart from its being wider or lower than the start point), can’t be analysed as (BECOME[MORE[x]]), and, on this reading, should only have a repetitive reading when combined with ‘again’.
But wait – we just said a few paragraphs ago that these verbs also show a repetitive/restitutive ambiguity when they appear alongside ‘again’. How can we account for that? Well, I suggest that, just like ‘walk around the park’, predicates with ‘fall’, ‘widen’ and so on are themselves ambiguous, and may have telic or atelic readings. This is what Hay, Kennedy & Levin (1999) argued when they wrote that these verbs’ telicity “depends on the boundedness of the difference value” – in other words, on whether a maximum or minimum bound, or a fixed end point, is provided. We can see this by tweaking our examples slightly:
The road widened to four lanes in 2 metres.
The shares fell to rock bottom in three days.
Suddenly the ‘in X time’ adjunct is quite fine! And this is where we can get both readings for ‘again’:
The road widened to four lanes again.
a. The road widened to four lanes and it had widened to four lanes before.
b. The road widened to four lanes and it had been four lanes before.
The shares fell to rock bottom again.
a. The shares fell to rock bottom and they had fallen to rock bottom before.
b. The shares fell to rock bottom and they had been at rock bottom before.
The really interesting thing is that this bound does not have to be explicit – it does not have to be stated as part of the sentence. Rather the speaker and hearer can ‘fill it in’ based on the context and their world knowledge.
The sheep fell down the cliff again.
Assuming that this is a small cliff and the sheep survives its descent, we again get two readings, presumably because the ground below implicitly provides a lower bound.
And where there is a restitutive context – one where a reversal of direction has been made explicit – there are always upper or lower bounds that allow both readings of ‘again’ (although the context might favour the restitutive one).
Yesterday the shares rose. Today they fell again.
Here, the fact that the shares fell after rising means that the shares must have stopped rising at some inferred point, and this can be taken by the speaker and hearer as the upper bound of the rising eventuality.
So there we have it, some more fun and games with ‘again’ (and the predicates it interacts with). But the real amusement lies – for the linguist at least – in looking out for real life examples that confirm – or question – the theory. How many ‘agains’ can you spot today?
Hay, J., Kennedy, C., & Levin, B. (1999). Scalar structure underlies telicity in “degree achievements”. In Proceedings of SALT (Vol. 9, pp. 127–144).
von Stechow, A. (1996). The different readings of wieder ‘again’: A structural account. Journal of Semantics 13: 87–138.
One of the things I’ve been looking at recently is a particular grammatical pattern in various languages including Middle English (i.e. English as spoken in the period 1066 to 1470-ish). Simplifying matters a bit, in older varieties of English some verbs employed have in the “perfect” construction, whereas other verbs took be:
(1) I am come, thou art gone, he is fallen …
(2) I have worked, thou hast made, she hath said …
In present-day English we basically only use have, so we use the following forms in place of those in (1):
(3) I have come, you have gone, he has fallen …
But when exactly did Middle English use have and when did it use be? The best way to answer this (the best I’ve been able to come up with at any rate) is to trawl through a great deal of text and see what patterns emerge. A body of texts put together for the purpose of trawling through to look for answers to particular questions in this way is known as a corpus. The corpus I’ve been using is the Helsinki corpus, a collection of texts up to the year 1710 – specifically the 609,000 words of texts from the period 1150-1500.
Obviously 609,000 words is a lot of words (The Lord of the Rings is about 480,000, for comparison, and my copy is 6.3cm thick in very small font). And the frequency of what I’m looking for is pretty small: as a rough estimate, there are about 6 instances of the perfect construction in every thousand words, and only about 5% of these constructions use be rather than have.
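Those rough figures already tell us how rare the target construction is. A quick back-of-the-envelope calculation (using the estimates above – these are my extrapolations, not actual corpus counts):

```python
# Rough estimates from the text, not measured corpus figures.
words = 609_000          # size of the 1150-1500 portion of the Helsinki corpus
perfects_per_1000 = 6    # ~6 perfect constructions per thousand words
be_share = 0.05          # ~5% of perfects use 'be' rather than 'have'

perfects = words / 1000 * perfects_per_1000   # expected perfect constructions
be_perfects = perfects * be_share             # expected 'be' perfects

print(round(perfects), round(be_perfects))    # 3654 183
```

So we would expect only a couple of hundred be-perfects in the whole 609,000-word stretch – a needle-in-a-haystack problem.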
Thankfully advances in modern technology (specifically, in my case, the Microsoft Word search function) mean I don’t have to read through the entire length of the corpus hoping to spot the relevant constructions on the rare occasions when they do turn up. But even with the aid of the search facility, the process is still a rather drawn out one. There are two reasons for this: firstly, the irregularity of the verb to be, and secondly, the irregularity of English spelling in the period in question.
Regarding the first, observe that be in English has multiple different forms: be, am, are, is, were, was etc. For one thing, there are simply more forms than we find for any other verb: compare the following:
(4) I am, you are; I was, you were (different forms for different persons)
(5) I love, you love; I loved, you loved (same forms in each tense regardless of person)
For another, many of the forms of be are completely different from each other, with no shared material. Thus, whilst all the forms of love begin with the letters lov- (love, loves, loved, loving), there is no sequence of letters which is common to all the forms of be.
To make matters worse, in Middle English there were even more forms of be. art, as in thou art, was very common, and there were also forms like they weren (= they were), sindan (= they are), he/she bið (= he/she is). To get the full picture, these need searching for as well.
This is compounded still further by the second problem: spelling. Spelling in Middle English wasn’t standardised and there was a great deal of variation in how words were spelled. Even for a little word like is spellings found include is, iss, esse, ys, ysse, his, hys, hes, yes and so on and so forth. am is spelled am, eom, eam, æm, ham … All these various spellings need to be taken into account for a comprehensive survey.
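One way to cope with this variation, at least as a sketch, is to collect the attested spellings and search for them all at once with a regular expression. The variant lists below contain just the examples mentioned above (they are nowhere near an exhaustive inventory), and the test sentence is invented pseudo-Middle-English:

```python
import re

# Attested spelling variants of 'is' and 'am' (examples from the text only).
IS_VARIANTS = ["is", "iss", "esse", "ys", "ysse", "his", "hys", "hes", "yes"]
AM_VARIANTS = ["am", "eom", "eam", "æm", "ham"]

# One pattern matching any variant as a whole word, case-insensitively.
pattern = re.compile(
    r"\b(" + "|".join(IS_VARIANTS + AM_VARIANTS) + r")\b",
    re.IGNORECASE,
)

text = "Ich am ycome; he ys fallen; heo hys gon."
hits = pattern.findall(text)
print(hits)  # ['am', 'ys', 'hys']
```

Note that ‘his’ in the variant list will also match the possessive pronoun, so – just as with automatic tagging – the hits still need checking by a real person.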
Some corpora may allow you to get around this sort of problem through tagging. In a tagged corpus, each word is associated with a tag which tells you what sort of word it is. The tags used vary, but some corpora specifically mark forms of be and have with their own particular codes, which makes them a lot easier to track down. Obviously, though, the corpus has to be tagged in the first place, which is a lot of work. This can be mitigated to some extent by getting a computer to do it for you, although computers aren’t 100% accurate at this sort of thing so it still needs to be checked by a real person.
After all this, what have I discovered? I’m approaching my word limit, so I’ll have to be quick, but basically verbs in English which took be in the perfect seem to have been either “change of location” verbs like go, come, fall or “change of state” verbs like become. This is interesting because – whilst languages which have this construction vary in how many verbs take be rather than have – there’s been a prediction that if any verbs take be they will include the change of location verbs, and if the class of be verbs is any larger than that it will include the change of state verbs. So Middle English supports that prediction.
In fact, the class of verbs which took be in Middle English is much the same as in modern French (where you say je suis allé(e) “I am gone” and not *j’ai allé “I have gone”). Might this be due to contact between English and French? Probably not, because the French spoken at the time of Middle English allowed be with a much larger set of verbs. This suggests we need to seek out a deeper explanation for the similarities, rooted in the psychology of linguistic processing.
Ultimately, then, I’ve found something out, and so all this corpus-trawling has been worth it.