To boldly split what no one should split: The infinitive.

Lies your English teacher told you: “Never split an infinitive!”

To start off this series of lies in the English classroom, Rebekah told us last week about a common misconception regarding vowel length. With this week’s post, I want to show you that similar misconceptions also apply to the level of something as fundamental as word order.

The title paraphrases what is probably one of the most recognisable examples of prescriptive ungrammaticality – taken from the title sequence of the original Star Trek series, the original sentence is: To boldly go where no man has gone before. In this sentence, to is the infinitive marker which “belongs to” the verb go. But lo! Alas! The intimacy of the infinitive marker and verb is boldly hindered by an intervening adverb: boldly! This, dear readers, is thus a clear example of a split infinitive.

Or rather, “To go boldly”1

Usually an infinitive is split with an adverb, as in to boldly go. The ban on doing so is one of the more recognisable prescriptive rules we learn in the classroom, but the fact is that in natural speech, and in writing, we split our infinitives all the time! There are even chapters in syntax textbooks dedicated to explaining how this works in English (it’s not straightforward though, so we’ll stay away from it for now).

In fact, sometimes not splitting the infinitive leads to serious changes in meaning. Consider the examples below, where the infinitive marker is underlined, the verb it belongs to is in bold and the adverb is in italics:

(a) Mary told John calmly to leave the room

(b) Mary told John to leave the room(,) calmly

(c) Mary told John to calmly leave the room

Say I want to construct a sentence which expresses a meaning where Mary, in any manner, calm or aggressive, tells John to leave the room but to do so in a calm manner. My two options to do this without splitting the infinitive are (a) and (b). However, (a) expresses more strongly that Mary was doing the telling in a calm way. (b) is ambiguous in writing, even if we add a comma (although a little less ambiguous without the comma, or what do you think?). The only example which completely unambiguously gives us the meaning of Mary asking John to do the leaving in a calm manner is (c), i.e. the example with the split infinitive.

This confusion in meaning, caused by not splitting infinitives, becomes even more apparent depending on what adverbs we use; negation is notorious for altering meaning depending on where we place it. Consider this article title: How not to raise a rapist2. Does the article describe bad methods in raising rapists? If we split the infinitive we get How to not raise a rapist and the meaning is much clearer – we do not want to raise rapists at all, not even using good rapist-raising methods. Based on the contents of the article, I think a split infinitive in the title would have been more appropriate.

So you see, splitting the infinitive is not only commonly done in the English language, but also sometimes actually necessary to truly get our meaning across. Although, even when it’s not necessary for the meaning, as in to boldly go, we do it anyway. Thus, the persistence of anti-infinitive-splitting smells like prescriptivism to me. In fact, this particular classroom lie seems like it’s being slowly accepted for what it is (a lie), and current English language grammars don’t generally object to it. The biggest problem today seems to be that some people feel very strongly about it. The Economist’s style guide phrases the problem eloquently3:

“Happy the man who has never been told that it is wrong to split an infinitive: the ban is pointless. Unfortunately, to see it broken is so annoying to so many people that you should observe it.”

We will continue this little series of classroom lies in two weeks. Until then, start to slowly notice split infinitives around you until you start to actually go mad.


1 I’ve desperately searched the internet for an original source for this comic but, unfortunately, I was unsuccessful. If anyone knows it, do let me know and I will reference appropriately.

2 This very appropriate example came to my attention through the lecture slides presented by Prof. Nik Gisborne for the course LEL1A at the University of Edinburgh.

3 This quote is frequently cited in relation to the split infinitive; you can read more about their stance on the matter in this amusing post:

Lies your English teacher told you: “Long” and “short” vowels

I remember, long ago in elementary school, learning how to spell. “There are five vowels,” our teachers told us, “A, E, I, O, U. And sometimes Y.” (“That’s six!” we saucily retorted. (We were seven.))

“When a vowel is by itself,” our teachers continued, “it’s short, like in pat. When there’s a silent e at the end, the vowel is long, like in pate1.” Then there were a dozen exceptions and addenda (including the fact that A could be five different sounds), but the long and the short of it was, there are long vowels and there are short vowels.

And you know something? There are long and short vowels in English. We actually briefly discussed this before, many moons ago during our introduction to vowels, but I wanted to add a little more detail today.

The first important thing to remember is that writing is not equivalent to the language itself.2 Our spellings are generally standardized now, but they are only representations of words, and they do not dictate how a word actually sounds. Furthermore, English orthography uses five or six symbols to represent more than a dozen different vowel sounds (not exactly an efficient system). In our example above of pat and pate, these words actually contain two distinct vowels pronounced in two different places in the mouth. The same is true of the other “long” and “short” vowel pairings. It’s almost like these sounds ([æ] and [eɪ], in IPA) aren’t really related, they just timeshare a spelling.

In another sense, though, it’s not so incorrect to say that pat has a short A and pate has a long A. To illuminate this claim, we’ll need two ingredients: an understanding of vowel tenseness in English, and an important sound change from the language’s past.

For scholars of English, a more important distinction than vowel length is vowel tenseness. Like the long/short vowel spelling distinction, linguists have identified pairs of vowels that are separated by no more than a little difference in quality. The difference, though, is not a matter of length, but whether the vowel is tense or lax, i.e. whether the muscles in the mouth are more tensed or relaxed in the production of the sound. These pairings are based on the sounds’ locations in the mouth and are therefore a little different than those traditionally associated with the letters. Pate and pet demonstrate a tense-lax pairing, as do peek and pick. The sounds in these pairs are very close together in the mouth, pulled apart by the tenseness, or lack thereof, of their pronunciation.

In some dialects of English, like RP or General American, tense vowels (and diphthongs) naturally acquire a longer duration of pronunciation than lax vowels. In short, the tense vowels are long. Therefore, it wouldn’t actually be false to say that pate has a long A and pat has a short A, but the length of the vowels is an incidental feature of English’s phonology and isn’t really the important distinction between the sounds (not for linguists, anyway).

It isn’t always that way in a language, and in fact, it wasn’t always that way in English. We’ve mentioned this before, but it’s pertinent, so I’ll cover it again: in some languages, you can take a single vowel (pronounced exactly the same way, in the same place in the mouth), and whether you hold the vowel for a short length of time or for a longer length of time will give you two completely different words. This is when it becomes important and appropriate to talk about long and short vowels. Indeed, farther back in English, this was important. In Old English, the difference between god (God) and gōd (good) was that the second had a long vowel ([o:] as opposed to [o], for the IPA fluent). In all other respects, the vowel was the same, what many English speakers today would think of as the long O sound.

In a way, these Old English long/short vowel pairings are really what we’re referring to when we talk about long and short vowels in English today (even if we don’t realize it). The historic long vowels were the ones affected by the Great English Vowel Shift, and the results are today’s colloquially “long” vowels. The short vowels have largely remained the same over the years. Maybe in this sense, as well, it’s not so bad to keep on thinking of our modern vowels as long and short. So many other quirky aspects of English are historic relics; why not this, too?

In the end, maybe the modern elementary school myth of long and short vowels isn’t entirely untrue, but there’s certainly a lot more to the story.


1 This is a delightful, if somewhat archaic, word for the crown of the head. I love language.
2 I imagine some of our longtime readers are fondly shaking their heads at our stubborn insistence on getting this message across. Maybe it’s time we made tee shirts.

One Nation, Many Languages

Lies your geography teacher told you

We all know that each country has one and only one language, right?

In China they speak Chinese, in England they speak English, in Iran they speak Farsi, and each language is neatly contained within the borders of its respective state, immediately switching to another language as soon as these are crossed.

Well, if you’ve been reading our blog, you have probably become rather sceptical of categorical statements like this, and for good reason: it turns out, in fact, that a situation like the one described above is pretty much unheard of. Languages spread across borders, sometimes far into a neighbouring country, and even within the borders of a relatively small state it’s not uncommon to have four or five languages spoken, sometimes even more, and large countries can have hundreds or more.

Then there’s the island of New Guinea, which fits 1,000 languages (more than some continents) in an area slightly bigger than France.

And yet, this transparent lie is what we are all taught in school. Why? Well, you can thank those dastardly Victorians again.

Before the rise of nationalism in the late 18th century, it was common knowledge that languages varied across very short distances, and being multilingual was the rule, not the exception, for most people. Even as a peasant, you spoke the language of your own state and one or two languages from neighbouring countries (which at the time were probably a few miles away, at most). Sure, most larger political entities had lingua francas, such as Latin or a prestige language selected amongst the varieties spoken within the borders (usually the language of the capital), but this was never seen as anything more than a way to facilitate communication.

It was the Victorian obsession with national unity and conformity which slowly transformed all languages different from the arbitrarily chosen “national language” into marks of ignorance, provincialism, and, during the fever pitch reached in the 1930s, even treason; this led to policies of brutal language suppression, which resulted in the near-extinction of many of the native languages of Europe.

Why then is this kind of thing still taught in schools? Because, sad to say, things have only become slightly better since those dark times. Most modern countries still accept the “One Nation, One Language” doctrine as a fact of life without giving it a second thought. Some countries still proudly and openly enact policies of language suppression aimed at eliminating any language different from the national standard (je parle à toi, ma belle France…).

Which brings me to our case study: my own Italy.

La bella Italia

Given my tirade above, it should not come as a surprise to you now when I tell you that Italian is not the only language spoken in Italy. Not by a long shot. In fact, by some counts, there are as many as 35! The map below shows their distribution.

What is today known as Standard Italian (or simply Italian) is a rather polished version of the Tuscan language (shown as TO on the map). Why not Central Italian, the language of Rome? For rather complex reasons which have to do with the Renaissance, and which we won’t delve into here, lest this post become a hundred pages long.

Even though Italy stopped enforcing its language suppression policies after WWII, it is a sad fact that even the healthiest of Italian languages are today classified as “vulnerable” by UNESCO in its Atlas of the World’s Languages in Danger, with most of them in the “definitely endangered” category.

The Italian government only recognises a handful of these as separate languages, either because they’re so different it would be ludicrous to claim they’re varieties of Italian (such as Greek, Albanian and various Slavic and Germanic languages spoken in the North), or because of political considerations due to particularly strong separatist tendencies (such as Sardinian or Friulan, spoken in the Sardinia and Friuli-Venezia Giulia regions, respectively). All other languages have no official status, and are generally referred to as “dialects” of Italian, even though some are as different from Italian as French is![1]

Stereotypically, speaking one of these languages is a sign of poor education, sometimes even boorishness: in the popular eye, you’re not speaking a different language, you’re simply speaking Italian wrong.[2]

To see how deep the brainwashing goes, suffice it to say that it’s not uncommon, when travelling to areas where these languages are still commonly spoken, to address a local in Italian and receive an answer in the local language. When it becomes clear to them that you don’t understand a word of what they’re saying, the locals are often puzzled and surprised, because they’re sincerely convinced they’re speaking Italian!

To better highlight the differences between Italian and these languages, here’s the same short passage in Italian and in my own regional language, Emilian (Bologna dialect):


Si bisticciavano un giorno il Vento di Tramontana e il Sole, l’uno pretendendo d’esser più forte dell’altro, quando videro un viaggiatore, che veniva innanzi avvolto nel mantello. I due litiganti convennero allora che si sarebbe ritenuto più forte chi fosse riuscito a far sì che il viaggiatore si togliesse il mantello di dosso.


Un dé al Vänt ed såtta e al Såul i tacagnèven, parché ognón l avêva la pretaiśa d èser pió fôrt che cl èter. A un zêrt pónt i vdénn un òmen ch’al vgnêva inànz arvujè int una caparèla. Alåura, pr arsôlver la lît, i cunvgnénn ch’al srêv stè cunsidrè pió fôrt quall ed låur ch’al fóss arivè d åura ed fèr in môd che cl òmen al s cavéss la caparèla d’indòs.

Pretty different, aren’t they?

You can hear the Italian version read aloud here, and here is the Emilian version[3].

Here’s the English version of the same passage for reference:

The North Wind and the Sun were disputing which was the stronger, when a traveller came along wrapped in a warm cloak. They agreed that the one who first succeeded in making the traveller take his cloak off should be considered stronger than the other.

It is pretty hard to argue that these two are the same language, and yet this is what most people in Italy believe, thinking of Emilian as a distorted or corrupted form of Italian.

Compare this to the situation during the Renaissance, when Emilian was actually a very prestigious language, to the point that Dante himself once wrote an essay defending it from those who would claim the superiority of Latin, calling it the most elegant of the languages of Italy.


Italy is by no means an isolated example, as I’ve already made clear in the first section of this post: wherever you go in the world, you’ll find dozens of languages being suppressed and driven to extinction due to myopic language policies left over from an era of nationalism and intolerance.

The good news is that the situation is improving: in Italy, regional languages are not stigmatised as they once were. In fact, many people take pride in speaking their local language, and steps are being taken to teach these languages to the youngest generations and to preserve them through literature and modern media. However, the damage done in the past is enormous, and it will take an equally enormous effort to restore these languages to the level of health they enjoyed a hundred years ago. For some of them it might very well be too late.

So if you speak a minority language, or know someone who does, take pride in it. Teach it to your children. Minority languages are not “useless”, they’re not marks of poor education, they are languages, as dignified and deep as any national language.

And don’t mind the naysayers: whenever someone tells me Emilian is a language for farmers, incapable of the breadth of expression displayed by Italian, I remind them that when Mozart studied music in Bologna, he spoke Emilian, not Italian; and that when the oldest university in the western world opened its doors in 1088, and for 700 years after that, it was Emilian, not Italian, that was spoken in its halls.

  1. Lisa discussed the tricky question of what’s a language and what’s a dialect here
  2. The same thing happens to Scots or AAVE. See here
  3. The passages are taken from a short story used to compare different Italian regional languages. All currently recorded versions can be found here.


ᚺᛖᛚᛚᛟ ᛞᛖᚫᚱ ᚠᛟᛚᛚᛟᚹᛖᚱᛋ!

ᚺᛖᛚᛚᛟ ᛞᛖᚫᚱ ᚠᛟᛚᛚᛟᚹᛖᚱᛋ!
Hello dear followers!

Welcome back from summer vacation! Sabina here and, boy, do we have a treat for you today! Today, we’re going to talk runes! When people see this fascinating little writing system, they tend to think of Vikings, so I guess it makes sense that one of our Nordic contributors writes this post. For me, runes were the initial introduction to linguistics (though I didn’t realise that at the time), and they are still very dear to my heart, so if I get a bit caught up in it, please forgive me.

Though it might make quite a bit of sense to think about Vikings when seeing a runic inscription, the runic writing system actually comes in many varieties and was used in a number of Germanic languages before the Latin alphabet.

First off, let’s check out some things that differ between the Runic writing systems and the one we are using here today (i.e. the Latin alphabet). There are, of course, a number of them, but let’s check out some basic differences for now.

Let’s start with looking at the material on which most Runic inscriptions are found (it’ll be important in a sec, I promise): Rather than paper, most runic inscriptions are found on wood, stone, or even metal. This may just be due to easy access; it was certainly a lot easier to get a hold of a piece of rock than parchment in the days when runic writing was used.

Now, this is where the material becomes important: runes distinctly lack a rounded shape, most of them being angular. One could argue that this may just have been easier to carve into the hard surface, but some believe that the angular shape actually reveals something more about the origins of the runic writing system. You might be thinking, “it must be somewhere in Scandinavia” because you got hung up on Vikings. That, however, may be far from the truth (though, as in most things concerning historical linguistics, we simply can’t know for sure). Some argue that the lack of rounded shapes in the Runic alphabets may be an indication of an Old Italic origin (remember, Latin is an Italic script). Some Old Italic scripts, e.g. Etruscan or Raetic, share this angular property with the Runic alphabets, and some scholars argue that the Runic alphabets are derived from these, probably through early contact between the Germanic languages and the Old Italic ones. Some even believe that runes might actually derive from the Latin alphabet itself. So, while you might be inclined to think that there is a world of difference between the symbols used to write ‘ᚺᛖᛚᛚᛟ’ and ‘hello’, the symbols used in the former may be derived from an ancestor of the latter! (I love writing systems, have I ever said so? Well, it’s worth saying again).

Now, two more things to be noted about the Runic alphabets, before we dig into an overview of the ones that have been used: firstly, the earliest Runic inscriptions didn’t have a fixed writing direction. This means that, unlike our modern script, the earliest Runic inscriptions could be written (and read) either left-to-right or right-to-left (trust me, you want to keep this in mind if you plan to study early runic inscriptions to any great extent. It can get really confusing otherwise, since the writing direction may actually change within the same inscription). It stabilized into a left-to-right pattern later on, though.
Secondly, word division is not commonly used. Basically, itmeansthatrunesarewrittenlikethis. Kinda hard to read, huh? (Alright, I was kinda nice to you guys and put in some word division in my hello today but, really, something like this: ᚺᛖᛚᛚᛟᛞᛖᚫᚱᚠᛟᛚᛚᛟᚹᛖᚱᛋ would be more correct.) Check out the Franks Casket, an amazing little relic with an Old English poem written in runes on it, here to see an example of how this may look. Actually, check out the Casket even if you don’t want to see this specifically; it is still awesome. Sometimes word division was indicated by one or more dots, but that was somewhat unusual.
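For the programmers among you, here is a minimal Python sketch of what dropping word division does to my greeting. The little table `GREETING_RUNES` and the function `to_runes` are my own illustrative names, and the table only contains the letter-to-rune pairs you can read off the greeting in this post. It is a toy letter-by-letter substitution, not a historically accurate transliteration, since real runic spelling followed sounds rather than Latin letters.

```python
# Letter-to-rune pairs read off this post's greeting ("hello dear followers").
# Assumption: a toy, letter-by-letter substitution for illustration only;
# it is not a full or historically accurate transliteration scheme.
GREETING_RUNES = {
    "h": "ᚺ", "e": "ᛖ", "l": "ᛚ", "o": "ᛟ", "d": "ᛞ",
    "a": "ᚫ", "r": "ᚱ", "f": "ᚠ", "w": "ᚹ", "s": "ᛋ",
}


def to_runes(text, word_division=False):
    """Swap each letter for its rune; drop spaces unless word_division is True."""
    out = []
    for ch in text.lower():
        if ch == " ":
            if word_division:
                out.append(" ")
        elif ch in GREETING_RUNES:
            out.append(GREETING_RUNES[ch])
        else:
            # The toy table only covers the letters used in the greeting.
            raise ValueError(f"no rune recorded in this toy table for {ch!r}")
    return "".join(out)


# With word division (the reader-friendly version used in this post):
print(to_runes("hello dear followers", word_division=True))
# Without word division (closer to how inscriptions usually look):
print(to_runes("hello dear followers"))
```

The second call simply skips the spaces, which is all that "no word division" amounts to on the writer's side; the hard part is left to the reader.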

Now, let’s dig into the most famous Runic alphabets, shall we?

Some of you may think that there was just one kind of runic alphabet – you’re in for a treat! There were, in fact, several. We will mention three today: the Elder Futhark, the Younger Futhark and the Futhorc. Notice the names are very similar? Well, that’s because the alphabets are named after the first six letters, which just happen to spell out ‘futhark’ (or futhorc).

The Elder Futhark is the oldest recorded variety of the runic alphabets, used approximately between the 2nd and 8th centuries AD. It consisted of 24 characters, typically divided into three ættir (compare with Swedish ‘ätter’ meaning ‘family/clan’), each ætt including eight characters, as below.


As you may know, runes were also considered to have certain magical properties, and the very word ‘rune’ means ‘secret’ or ‘mystery’. Though we won’t go into detail here, the first ætt is typically considered to be the ætt of the Norse fertility deities Frey and Freya. The second is the ætt of Heimdall, the guy who watches for the start of Ragnarök (the end of the world, in case you missed the movie), while the third is considered to be the sky god Tyr’s1.

Now, the Elder Futhark eventually gave way to the Younger Futhark around the 8th century. The Younger Futhark is a reduced version of the Elder Futhark and only contains 16 letters. The Younger Futhark is the Runic alphabet most people think about when we’re talking Viking runes. However, even in the Viking countries (i.e. the Scandinavian ones), the Younger Futhark varied. In Denmark, we can recognise so-called ‘long-branch’ runes:

While in Sweden and in Norway, we see ‘short-twig’ runes:

Let’s complicate it just a liiiittle bit more because in Sweden, you have yet another set called the Bohuslän runes, used specifically in the west coast region (Bohuslän), north of (and including) the city of Gothenburg (coincidentally, my hometown). Interestingly enough, this is a set of not 16 letters but 26; 2 more than the original Elder Futhark.

Alright, now that we’ve covered the Elder and Younger Futhark, let’s step over to the Futhorc. Notice the difference in name? Based on what we’ve said previously on language change and the early Germanic dialects, do you think you could guess who used these runes?

Do you have an answer in mind? Is it perhaps the Anglo-Saxons? In that case, you are absolutely right!

The Anglo-Saxon runes, or the Futhorc, are an extended, rather than reduced, version of the Elder Futhark. Instead of the Elder Futhark’s 24 letters, the Futhorc has between 26 and 33 letters (yeah, I know, but I can’t give you a definite number!). How they wound up in the UK (where you can find them on, for example, the Franks Casket mentioned above or the Kingmoor ring, which is inscribed with a magical formula) is still much discussed, though one hypothesis is that they were developed in Frisia. The language of Frisia, Old Frisian, is closely related to Old English and, indeed, we do find that these runes were also used in Old Frisian. Another suggestion is that the Vikings brought them over and the Anglo-Saxons modified them a bit and then spread them to Frisia.

Anyway, the Futhorc was in use from approximately the 5th century, and in England it survived all the way up to the 10th or 11th century. Its use was in decline from about the 7th century, and it largely ceased after the Norman Conquest. Despite this, you can actually see a couple of the old runic symbols tagging along during the Middle English era, as well, specifically the letter wynn <ƿ> and the letter thorn <þ>. Now, while these might look similar, do not mix them up! In Modern English, the former is the letter <w> while the latter is the digraph <th>, so you may get very confused if you do. Also, if you are to read a Middle English manuscript you might come across a letter that looks suspiciously like <y>. Don’t confuse that one, either. It may be either wynn or thorn, and the scribe just missed the line that connects the rounded shape to the vertical line. In fact, this kind of confusion is exactly where we get ‘ye’, as in ‘ye olde’, from.

Right, sidetracked. Getting back to it.

Anyway, the Futhorc looks like this:

Quite a difference from what we saw in the Elder and Younger Futhark, huh? Like everything else in language, variation is the spice of life; it just adds a bit of zest, don’t you think? (Though, admittedly, making it all the more difficult to learn.)

I’ve hammered you with runes for quite a bit today, haven’t I? I did try to restrain myself, honest, but runes are just so awesome, I couldn’t help myself.

Until next time, ladies and gents. I hope you enjoyed our little runic talk! Come back to us in two weeks when our amazing Riccardo will be here to talk to you about the endangered languages of Italy!


1 Check out our reference, D. Jason Cooper’s more in-depth account of the different ættir, here

Most of our references today are from a marvellous little page called Omniglot. You’ll find our source regarding the Elder Futhark, the Younger Futhark and the Futhorc right there as well as some general info on the runic writing systems. Also, the original runic pics modified for the purposes of this post are to be found on Omniglot, in the links that have been provided. Take a look and be dazzled! Also check out the Futhark on ancientscripts.com, our second source for the different hypotheses regarding the origin of the runic writing system. Enjoy!

Standardisation of languages – life or death?

Hello and happy summer! (And happy winter to those of you in the Southern Hemisphere!)

In previous posts we’ve thrown around the term ‘standard’, as in Standard English, but we haven’t really gone into what that means. It may seem intuitive to some, but this is actually quite a technical term that is earned through a lengthy process and, as is often the case, it is not awarded easily or to just any variety of a language. Today, I will briefly describe the process of standardising a variety and give you a few thoughts for discussion1. I want to stress that though we will discuss the question, I don’t necessarily think we need to find an answer to whether standardisation is “good” or “bad” – I don’t think either conclusion would be very productive. Still, it’s always good to tug a little bit at the tight boundaries we often put around the thought space reserved for linguistic concepts.

The language bohemian, at it again.

There are four processes usually involved in the standardisation of a language: selection, elaboration, codification, and acceptance.


It sure doesn’t start easy. Selection is arguably the most controversial of the processes as this is the step that involves choosing which varieties and forms the standard will be based on. Often in history we find a standard being selected from a prestigious variety, such as the one spoken by the nobility. In modern times this is less comme il faut as nobility don’t have a monopoly on literacy and wider communication anymore (thankfully). This can make selection even trickier, though: as the choice of a standard variety becomes more open there is a higher need for sensitivity regarding who is represented by that standard and who isn’t. Selection may still favour an elite group of speakers, even if they may no longer be as clear-cut as a noble class. For example, a standard is often based on the variety spoken in the capital, or the cultural centre, of a nation. The selection of standard forms entails non-selection of others, and these forms are then easily perceived as worse, which affects the speakers of these non-standard forms negatively – this particularly becomes an issue when the standard is selected from a prestigious variety.

In my post about Scots, I briefly mentioned the problem of selection we would face in a standardisation of Scots as a variety which has great variation both within individual speakers and among different speakers (e.g. in terms of lects). Battling this same tricky problem, Standard Basque was mostly constructed from three Basque varieties, mixed with features of others. This standard was initially used mainly by the media and in formal writing with no “real” speakers. However, as more and more previously non-Basque-speaking people in the Basque Country started to learn the language, they acquired the standard variety, with the result that this group and their children now speak a variety of Basque which is very similar to the standard.


Standardisation isn’t all a prestigious minefield. A quite fun and creative process of standardisation is elaboration, which involves expanding the language to be appropriate for use in all necessary contexts. This can be done by either adapting or adopting words from other varieties (i.e. other languages or nonstandard lects), by constructing new words using tools (like morphology) from within the variety that’s becoming a standard, or by looking into archaic words from the history of the variety and putting them back into use.

When French was losing its prestige in medieval England, influenced no doubt by the Hundred Years’ War, an effort was initiated to elaborate English. Following the Norman Conquest, French had become the language used for formal purposes in England, while English survived as the language spoken by the common people. This elaboration a few hundred years later involved heavy borrowing of words from French (e.g. ‘government’ and ‘royal’) for use in legal, political, and royal contexts (and from Latin, mainly in medical contexts) – the result was that English could now be used in those situations it previously didn’t have appropriate words for (or where such words had not been in use for centuries)2.



Once selection and elaboration have (mostly) taken place, the process of codification cements the selected standard forms, through, for example, the compilation of dictionaries and grammars. This does not always involve pronunciation, although it can, as it famously does in the British Received Pronunciation (usually just called RP), a modern form of which is still encouraged for use by teachers and other public professions. Codification is the process that ultimately establishes what is correct and what isn’t within the standard – this makes codification the sword of the prescriptivist, meaning that codification is used to argue what the right way to use the language is (y’all know by know what the HLC thinks of prescriptivism).

When forms are codified they are not easily changed, which is why we still see some bizarre spellings in English today. There are of course not only limitations to codification (as with the spelling example) – there is obvious benefit for communication if we all spell certain things the same way or don’t vary our word choices too much for the same thing or concept. Another benefit, and a big one at that, is that codified varieties are perceived more as real, and this is very important for speakers’ sense of value and identity.

Codification does not a standard make – most of you will know that many varieties have dictionaries without having a standard, Scots being one example. Urban Dictionary is another very good example of codification of non-standard forms.


The final process is surely the lengthiest and perhaps the most difficult to achieve: acceptance. It is crucial that a standard variety receives recognition as such, especially by officials or other influential speakers but also by the general public. Speakers need to see that there is a use for the standard and that there is a benefit to using it (such as benefiting in social standing or in a career). Generally though, people don’t respond very well to being prescribed language norms, which we have discussed previously, so when standard forms have been selected and codified it does not necessarily lead to people using these forms in their speech (as was initially the case with Standard Basque). Further, if the selection process is done without sensitivity, some groups may feel they have no connection to the standard, sometimes for social or political reasons, and may actively choose to not use it. Again, we find that a sense of identity is significant to us when it comes to language; it is important for us to feel represented by our standard variety.

What’s the use?

Ideally, a standard language could be seen as a way to promote communication within a nation or across several nations. Despite the different varieties of Arabic, for example, Arabic speakers are able to switch to a standard when communicating with each other even if they are from different countries far apart. Likewise, a Scottish person can use Standard English when talking to someone from Australia, while if the same speakers switched back to their local English (or Scots) varieties, they wouldn’t necessarily understand each other. Standardisation certainly eases communication within a country also, and a shared standard variety can provide a sense of shared nationality and culture. There is definitely a point in having a written standard used for our laws, education, politics, and other official purposes which is accessible for everyone. On the other side of this, however, we find a counterforce with speaker communities wanting to preserve their lects and actively opposing using a standard if they can’t identify with it.

So, a thought for discussion I want to leave with you today: Do you think the process of standardisation essentially kills language, or does it keep it alive? An argument for the first point is that standardisation limits variation3 – this means that when a standard has been established and accepted, the varieties of that standard will naturally start pulling towards the standard as its prestige and use increases. However, standardising is also a way to officially recognise minority varieties, which gives speakers an incentive to keep their language alive. It is also a way to ease understanding between speakers (as explained earlier), and in some cases (like Basque), standardisation gives birth to a new variety acquired as a first language. As I said from the start, maybe we won’t find an answer to this, and maybe we shouldn’t, but it’s worth thinking about these matters in a more critical way.


1 I’ve used the contents of several courses, lectures, and literature as sources for this post. The four processes of standardisation are credited to Haugen (1966): ‘Dialect, language, nation’.

2 In fact, a large bulk of French borrowings into English comes from this elaboration, rather than from language contact during the Norman Conquest.

3 On a very HLC note, historical standardisation makes research into dialectal variation and language change quite difficult. The standard written form of Old English is based on the West Saxon variety, and there are far fewer documents to be found written in Northumbrian, which was a quite different variety and has played a huge part in the development of the English we know today.


Sherlock Nouns and the Case of Morphological Declension

Ah, nouns. Classically defined as “people, places, and things,”1 these little (and sometimes not so little) words can carry a lot of meaning, encompassing everything from cats to triskaidekaphobia2. Pair them with verbs (those things you do), and you’ve really got something.

In English, there’s a comforting solidity to nouns. Not like verbs, that throw on endings and even, le gasp, change vowels like they’re trying on hats. Nouns, now—nouns are dependable.

Or so you thought. When you change the form of a verb to reflect who’s doing what and when, that’s called conjugation. Here’s the bombshell: nouns can do that, too. It’s called declension.

In some languages, the form of the noun changes to indicate its role in a sentence. For example, a noun may have one form when it’s the subject of a sentence but have a different form when it’s the object. (As a refresher: in ‘Rebekah wants haggis’, ‘Rebekah’ is the subject, and ‘haggis’ is the object.) These noun forms are called cases. Adjectives, pronouns, participles, numerals, and demonstratives (this or that) can also decline. Declension occurs in languages like, oh, English. Or Spanish. (Just a little bit.)

In English and Spanish, the presence of cases is most evident in their pronouns:

role | English | Spanish
subject | he | él
direct object | him | lo
indirect object | him | le
possessive | his/hisn | su/suyo
reflexive | himself | se

(Hisn is a dialectal form like mine for the third person.)

For regular nouns, English only distinguishes between singular and plural and between possessive and non-possessive. Spanish distinguishes between singular and plural and declines for grammatical gender (e.g. the adjective blanco will become feminine blanca when describing la tortuga blanca ‘the white turtle’). The diversity of their pronoun forms3 is a remnant of their parent languages, Old English and Latin respectively. These older languages had full, healthy case systems that affected all their nouns. They in turn inherited their noun cases from a common ancestor, namely Indo-European (IE).

The Indo-European Noun Cases

Based on the structure of its surviving daughters, linguists have determined that Proto-Indo-European had eight noun cases:

case | role | example | in an English sentence
nominative | subject | amīcus ‘friend’/puella ‘girl’ (Lat) | The boy plays.
accusative | direct object | amīcum/puellam | He loves the girl.
dative | indirect object | amīcō/puellae | He gives the girl a flower.
ablative | movement away from | amīcō/puellā | She runs from the boy.
genitive | possessive | amīcī/puellae | The boy’s tears
vocative | addressee | amīce/puella | Boy, where art thou?
locative | physical or temporal location | domī ‘at home’ (Lat) | She stays at home.
instrumental | by means of which something is done | þȳ stāne ‘with a stone’ (OE) | He raps on her window with a stone.


This is a rather simplified representation of the situation. The actual distinctions and usages of the cases vary from language to language, particularly because very few IE languages utilize all eight cases (like Sanskrit does). It’s the nature of languages to change, and cases have a propensity to merge, a process called syncretism4. It’s like when you’re working on a group project, and half the group doesn’t show up, leaving the kids who want a good grade to pull double duty and fulfill the delinquents’ obligations as well as their own. For example, in Old English, the dative case fills some of the same uses as the ablative case in Latin because Old English doesn’t have an ablative.
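To make the idea of forms pulling double duty a bit more concrete, here is a minimal Python sketch that groups the cases of puella by their written form, using only the Latin forms from the table above. The names PUELLA and cases_by_form are made up for this illustration, and identical forms inside a single paradigm are only a loose stand-in for the historical merging of whole cases.

```python
from collections import defaultdict

# Singular forms of Latin puella 'girl', copied from the table above.
PUELLA = {
    "nominative": "puella",
    "accusative": "puellam",
    "dative": "puellae",
    "ablative": "puellā",
    "genitive": "puellae",
    "vocative": "puella",
}

def cases_by_form(paradigm):
    """Group case names by the written form that expresses them."""
    groups = defaultdict(list)
    for case, form in paradigm.items():
        groups[form].append(case)
    return dict(groups)

for form, cases in cases_by_form(PUELLA).items():
    print(form, "<-", ", ".join(cases))
```

Six cases share four written forms here, so a listener already has to lean on context to tell a dative puellae from a genitive one.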

The case of noun cases shook out a little differently across the Indo-European language family. As previously mentioned, Sanskrit has eight cases. Latin has seven. Old English has five. Icelandic and German have four (although German doesn’t show it on nouns so much as on articles and adjectives). And languages like English and Spanish don’t so much have cases anymore as much as they have pictures of their old case-infused relatives hanging on their walls.

A college classmate of mine once stated rather authoritatively that the reason the modern Romance languages have generally done away with cases is that it’s too hard to decline all those Latin nouns in your head. To be fair, Latin has five different groups of nouns (called declensions), all with their own endings for Latin’s seven cases. And it is true that many modern IE languages employ far fewer cases than their ancestors, if any at all. But the idea that cases are too hard for our brains to manage in everyday speech? Hogwash. Russian, another IE language that is very much alive and kicking, has six cases. Our friend Finnish (of Uralic descent) has fifteen. (You should also take from the example of Finnish that noun cases are not unique to the Indo-European languages.)

We’ve discussed before (repeatedly) that one language isn’t really harder than any other; they’re just different. The human brain is well equipped to utilize any of them it can get its neurons on. If our Homo sapiens supercomputers couldn’t handle a given linguistic structure, it wouldn’t develop. Easy as pie.

To Word Order or Not to Word Order?

Now, a robust system of noun cases (and verb conjugation) in a language can affect more than just the morphology. Because so much important information is embedded in the words themselves, word order is less important and more flexible than in languages like Modern English.

In Old English, ‘Se hlāford lufaþ þā frōwe’ and ‘Þā frōwe lufaþ se hlāford’ both mean ‘The lord loves the lady.’ In Modern English, ‘The lord loves the lady’ and ‘The lady loves the lord’ have very different meanings (although, for the sake of romance, one hopes that both statements are equally true). To say ‘The lady loves the lord’ in Old English, you would decline the nouns differently and say ‘Sēo frōwe lufaþ þone hlāford.’ (Maybe this wasn’t the best example as there aren’t noticeably distinct endings on the nouns, but you can see the difference in case in the demonstratives.) This is not to say that Old English doesn’t have rules about word order, but it’s less crucial than in today’s English.
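If you like to see this sort of thing spelled out, here is a small Python sketch of how a reader can recover who loves whom from the demonstratives alone. It is a toy that only knows the four demonstrative forms used in the examples above, and the names DEMONSTRATIVES and roles are invented for this illustration. It also assumes that each noun directly follows its demonstrative.

```python
# Case signalled by the Old English demonstratives that appear in the
# examples above. Only these four forms are covered.
DEMONSTRATIVES = {
    "se": "nominative",    # masculine subject form
    "þone": "accusative",  # masculine object form
    "sēo": "nominative",   # feminine subject form
    "þā": "accusative",    # feminine object form
}

def roles(sentence):
    """Return (subject, object) nouns, whatever order they come in."""
    words = sentence.lower().split()
    found = {}
    for i, word in enumerate(words[:-1]):
        case = DEMONSTRATIVES.get(word)
        if case is not None:
            # Assumption: the noun directly follows its demonstrative.
            found[case] = words[i + 1]
    return found["nominative"], found["accusative"]

for s in ("Se hlāford lufaþ þā frōwe",
          "Þā frōwe lufaþ se hlāford",
          "Sēo frōwe lufaþ þone hlāford"):
    print(s, "->", roles(s))
```

The first two sentences give the same subject and object even though the words come in a different order; only the third flips the roles.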

Languages that rely on declension and conjugation (both types of inflection) to convey meaning are called synthetic languages. Languages that rely more on word order are called analytic. These distinctions are not binary but rather are a matter of degree.

So, there you have it. (It being a brief rundown on noun cases.) As parts of speech go, nouns are pretty straightforward. But like a duck paddling on water, nature’s got a lot of beautiful stuff going on underneath the surface.


1 Thanks to Schoolhouse Rock.
2 A fear of the number 13.
3 Pronouns generally resist change (the stubborn things), hence the moderate survival of cases where they were generally lost throughout the rest of the language.
4 This phenomenon is propelled by things like sound change. If the endings for two cases start to sound identical, it becomes hard to distinguish them as separate forms.

Chaos? Nah, just a vowel shift

Dearest creature in creation,
Study English pronunciation.
I will teach you in my verse
Sounds like corpse, corps, horse, and worse.
I will keep you, Suzy, busy,
Make your head with heat grow dizzy.
Tear in eye, your dress will tear.
So shall I!  Oh hear my prayer.
Pray, console your loving poet,
Make my coat look new, dear, sew it!


Just compare heart, beard, and heard,
Dies and diet, lord and word,
Sword and sward, retain and Britain.
(Mind the latter, how it’s written.)
Now I surely will not plague you
With such words as plaque and ague.
But be careful how you speak:
Say break and steak, but bleak and streak;
Cloven, oven, how and low,
Script, receipt, show, poem, and toe.

Finally, which rhymes with enough —
Though, through, plough, or dough, or cough?
Hiccough has the sound of cup.
My advice is to give up!!!


Gosh, English pronunciation can be really tricky at times, can’t it? Interested in knowing why?

Well, of course you are! Let’s dive into it together!

As the excerpt above clearly shows, English spelling is often considered a bit ‘off’, since it doesn’t always correspond well to the spoken word. That’s true, it often doesn’t. But why is that?

Well, while it is not the only reason behind this tricky correspondence between the spoken and written word, today’s topic does explain a lot: the ‘Great’ English Vowel Shift (let’s stick to calling it the GVS from now on) came along and messed things up quite a bit.

Some of you will probably have heard about the GVS before; it was a significant sound change that began in late Middle English and, by most accounts, carried on into the 17th century. This sound change affected the long vowels of Middle English, causing them to shift like so:



Great, so… we done here? You now know everything there is to know about the GVS, right?

Nah, not really.

First, the GVS is actually considered by a lot of linguists to be a process of at least two phases3:

The first phase is considered to have lasted up until approximately the year 1500. During this phase, the long high Middle English vowels /i:/ and /u:/, pronounced similar to the vowels in Modern English meet [mi:t] and lute [lu:t], diphthongised and eventually became the modern English diphthongs /aɪ/ and /aʊ/, the pronunciations you find in mice [maɪs] and mouse [maʊs]. The vowels immediately below them, that is /e:/ and /o:/4, raised one position, falling into the slots previously held by /i:/ and /u:/.

In the second phase, often considered to have been active between the late 16th and mid-17th centuries, the remaining vowels, that is /ɔ:, a:, ɛ:/, raised one position in height.

What we eventually wind up with is a system of vowels completely changed from its predecessor.
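For those who prefer their chain shifts in code, the two phases can be sketched as two lookup tables in Python. This is a simplified illustration, not a full model of the GVS: the names PHASE_1, PHASE_2 and shift are made up here, the diphthongs are given in their eventual modern form, and the step from /ɔ:/ to /o:/ is my assumption about what “raised one position” means for the back vowel.

```python
# Phase 1 (up to roughly 1500): the high vowels diphthongise and the
# vowels just below them move up into the empty slots.
PHASE_1 = {"i:": "aɪ", "u:": "aʊ", "e:": "i:", "o:": "u:"}

# Phase 2: the remaining long vowels raise one position.
# ɔ: > o: is an assumption (the back-vowel counterpart of ɛ: > e:).
PHASE_2 = {"ɔ:": "o:", "a:": "ɛ:", "ɛ:": "e:"}

def shift(vowel):
    """Apply both phases, in order, to one Middle English long vowel."""
    # Each phase is a single lookup, so a vowel moves at most one
    # slot per phase instead of sliding all the way up the chain.
    step = PHASE_1.get(vowel, vowel)
    return PHASE_2.get(step, step)

for v in ("i:", "u:", "e:", "o:", "ɛ:", "ɔ:", "a:"):
    print(v, ">", shift(v))
```

Doing each phase as one lookup is the point of the sketch: in a chain shift every vowel takes one step at a time, so the vowels keep their distance from each other instead of all merging at the top.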

Now, why would that happen?

As with a good number of things in historical linguistics, we don’t exactly know. However, there are two leading hypotheses out there.

The first is the so-called push-chain theory, which was introduced by the great German philologer Karl Luick as early as 1896. Luick argued that the GVS must have been initiated by the movement of the lower vowels /e:/ and /o:/. The two vowels, for some mysterious reason of their own, started to move toward the high vowels /i:/ and /u:/. As they drew nearer, /i:/ and /u:/ started panicking because, it is sometimes argued, they couldn’t raise any higher and remain vowels (instead becoming yucky consonants, bläch).

Well, can’t have that, can we? In pure desperation, /i:/ and /u:/ look for a way out. And they find one—move in (or out, if you will). So, that is precisely what they do, they move in: they become diphthongs, lower and, suddenly, Middle English /i:/ and /u:/ become modern English /əɪ/ and /əʊ/, eventually becoming /aɪ/ and /aʊ/. Tadaa, we have the first steps to a modern English vowel system.

Luick’s hypothesis is actually quite elegant in a way because it successfully explains the lack of diphthongisation of /u:/ in the northern dialects of British English. In these dialects, the vowel /o:/ had previously fronted, becoming /ø:/. The northern dialects therefore didn’t have a vowel /o:/ to push /u:/ out of its place, and the diphthongisation never happened there (pretty neat, huh?).

The second of our hypotheses, the drag-chain theory, was introduced by Otto Jespersen in 1909. Now, Jespersen argued that it was equally likely that the diphthongisation of the high vowels initiated the shift. Basically, Jespersen’s reasoning was like this:

The high vowels, i.e. /i:, u:/, shifted and became diphthongs. That left a ‘gap’ in the vowel system. Horrified, the lower vowels scrambled to move up the ladder to fill the gaps. All of a sudden, Middle English /a:/ became early Modern English /ɛ:/, Middle English /ɛ:/ became early Modern English /e:/ and so on (the back vowels tagged along, too), and so, harmony was restored.

Now, the (to me, at least) flaw of this hypothesis is that it doesn’t account for the non-diphthongisation of northern /u:/, but then again, Luick’s hypothesis claiming that the high vowels couldn’t raise any higher has been noted to be somewhat limited—the high vowels could have done several other things to avoid becoming consonants5. But that’s a different discussion.

Regardless of which of these hypotheses you want to consider more likely, this is the ‘Great’ English Vowel Shift: a huuuuge chain shift that took centuries to complete and affected all long vowels of Middle English. That’s a pretty big deal.

Now, you might be wondering what this has to do with spelling, right? Well, you see, the thing is that English spelling started to become standardized during the ongoing GVS. What this means is that we have a bunch of words where the written form corresponds to a pronunciation that is centuries old. So, basically, meet and meat, both pronounced [mi:t] in British English, are spelled differently because, when those high and mighty people speaking Middle English decided that there was a correct way to spell those words, they did have distinct pronunciations!

So, next time you get annoyed by having to look up how you spell something, just stop and consider that you’re actually spelling the word the way it was pronounced about 600 years ago. Pretty cool, huh?





Oh, oh! I almost forgot! Have you been asking yourself why I keep using ‘’ around ‘Great’? No? Well, I’m going to tell you anyway!

The ‘Great’ was introduced by Jespersen and, frankly, while the GVS did indeed have a huge effect on the English language, vowel shifts happen all the time. So, take the ‘Great’ with a pinch of salt and a shot of tequila and we might get on the right track of things.





Side notes

1.   There is nothing to say that either of these hypotheses is an accurate description of the initial process of the GVS. Long before I took my first bumbling steps into academia (actually, about a year before I was even born), Donka Minkova and Robert Stockwell noted that it may just be the desire to see a systematic aspect of language and discount its random quirks. So, don’t take it too seriously.

2.     If you’d like to read more about the GVS and other hypotheses, please take a look at Gjertrud Flermoen Stenbrenden’s dissertation work The Chronology and regional spread of long-vowel changes in English from 2010. It’s a really interesting read and introduces a lot more on the subject than I could possibly cover here.


1 This is an excerpt of the excellent poem The Chaos by Dr. Gerard Nolst Trenité (Netherlands, 1870-1946). Translated by Pete Zakel.

2 This is one of the common ways to depict the GVS; a similar one can be found in most textbooks on the subject. See, for example, Historical Linguistics by Theodora Bynon (1977: 82).

3 See for example The Cambridge History of the English Language (2008) in which Roger Lass writes about this division into two phases. A similar explanation can be found in most textbooks on linguistics that deal, in some way, with historical linguistics (though I really recommend reading Lass’ explanation if you wish to know more about this).

4 Really, I would like to give you examples of these sounds, but I can’t. They’ve basically disappeared from modern English, though they can, most likely, be found in some dialects of English today. Examples can be found of /e:/ in some variants of Scottish English, for example in mate [me:t], but other than that, I can’t seem to find enough examples. If you do find them, though, please let us know! We would love to know more!

5 See, for example, the critique by Charles Jones in A History of English Phonology (1989).

Today’s post is brought to you by the letter G

It’s time for the HLC with our very special guest, Proto-Germanic! Yaaay!

Ah, English spelling. That prickly, convoluted briar patch that, like an obscure Lewis Carroll poem, often falls just a little too shy of making sense. Or does it?

It wasn’t always like this. English spelling actually used to be pretty phonetic. People would just write down what they heard or said.1 Then, the printing press was introduced. Books and pamphlets began to be mass produced, literacy levels rose, and spelling began to be standardized. At the same time, English continued to move through some fairly dramatic shifts in pronunciation. The language moved on as the spellings froze.

Throughout the years, people have occasionally called for reforms in English spelling. Like that time in the early 20th century when Andrew Carnegie, Melvil Dewey, Mark Twain, Theodore Roosevelt, et al. colluded to “improve” some of the more confusing orthographic practices of English. Personally, this linguist is glad such efforts have by and large failed.

Sure, you could look at English spellings and tear at your hair at the monumental insanity of it all. But I like to think of our spellings more as fossils preserving the dinosaur footprints of earlier pronunciations. Granted, sometimes the footprints are from five different species, all overlapping, and there’s, like, a leaf thrown in.

Where are they all going?!

Let’s take, for example, the letter <g>2 and its many possible pronunciations.

First on the menu is the classic [g], a sturdy stop found in words like grow, good gravy, and GIF. This dish originates in the Proto-Germanic (PGmc) voiced velar fricative /ɣ/3. (Refresh your memory on our phonological mumbo-jumbo here.) This velar fricative had a bit of an identity crisis during Old English (OE)4, spurred on by hanging out with sounds all over the mouth.

“But what we found out is that each one of us is a front vowel…and a back vowel…and a palatal approximant…an affricate…and a voiced velar stop…Does that answer your question?”

Around front vowels (such bad influences—triggering umlaut wasn’t enough for them?), it became [j], as in year, from OE ġēar. Between back vowels (the big bullies), it became [w], as in to draw, from OE dragan5. At the end of words, it lost its voicing and became [x] (the sound in loch), as in our own dear Edinburgh (whose pronunciation has since changed again). Ah, but before back vowels, and when backed up by sonorants like [ɹ], it held its ground a little better and became our trusty [g].
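The outcomes just described can be jotted down as a small Python function that picks a reflex of /ɣ/ from its neighbours. Treat it as a rough sketch: the sound classes, the order in which the rules are checked, and the names base and reflex are all my own assumptions for the sake of illustration, and real Old English had plenty of wrinkles this ignores.

```python
import unicodedata

FRONT_VOWELS = set("ieæy")
BACK_VOWELS = set("aou")
SONORANTS = set("rlmn")

def base(segment):
    """Strip length marks and dots, e.g. 'ē' -> 'e'; None stays None."""
    if segment is None:
        return None
    return unicodedata.normalize("NFD", segment)[0]

def reflex(prev, nxt):
    """Outcome of /ɣ/ given the sounds before and after it.

    prev / nxt are single letters, or None at the edge of a word.
    """
    prev, nxt = base(prev), base(nxt)
    if nxt is None:
        return "x"   # end of a word: devoiced
    if prev in BACK_VOWELS and nxt in BACK_VOWELS:
        return "w"   # between back vowels
    if prev in FRONT_VOWELS or nxt in FRONT_VOWELS:
        return "j"   # around front vowels
    if nxt in BACK_VOWELS or nxt in SONORANTS:
        return "g"   # before back vowels or sonorants
    return "ɣ"       # environment not covered by the description

print(reflex(None, "ē"))  # as in ġēar 'year'
print(reflex("a", "a"))   # as in dragan 'to draw'
print(reflex("r", None))  # as at the end of -burh
```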

As you may have noticed, a lot of the sounds that came from /ɣ/ are no longer spelled with <g>. Alas. We’ll come back to how Edinburgh wound up with an <h> in a minute.

But first, there was another sound that came from PGmc /ɣ/. Old English had something going on called gemination. Sometimes, it would take a consonant and double its pronunciation. Like the <kk> in bookkeeper. Bookkeeper is just fun to say, but these long consonants were actually important back in OE. The wheretos and whyfors of gemination are another story, but just like how /ɣ/ became [j], the geminate /ɣɣ/ was pulled forward and dressed in new clothes as the affricate [d͡ʒ], like in bridge and edge, from OE bryċg and eċg.

Gemination didn’t get around much. It was pretty much restricted to the middle of words. When mushy, unstressed endings began to fall off, the leftovers of gemination found themselves at the end of words, but a little nudge was needed before [d͡ʒ] found its way to the prime word-initial position. Later on in Middle English, the language ran around borrowing far more than a cup of sugar from its neighbor across the Channel. As English stuffed its pockets with French vocabulary, it found a few French sounds slipped down in among the lint. One of those was Old French’s own [d͡ʒ], which on the Continent was simplifying to [ʒ]6 (the <s> sound in measure). This [ʒ] sound didn’t exist in English yet. Our forefathers looked at it, said “nope,” and went on pronouncing it [d͡ʒ]. Thus we get words like juice, paving the way for later words like giraffe and GIF.

This is a GIF. Or is it a GIF? I mock you with my scholarly neutrality.

It was only later, after the end of Middle English, that /ʒ/ was added to the English phoneme inventory, retaining its identity in loanwords like garage and prestige. It’s worth noting, however, that these words also have accepted pronunciations with [d͡ʒ].

Alright, so what about the <gh> in Edinburgh? It turns out there’s another sound responsible for the unpaid overtime of the letter <g>. Meet the sound /h/. In Middle English, Anglo-Norman scribes from France introduced a lot of new spellings, including <gh> for /h/. The <h> part of the <gh> digraph was probably a diacritic meant to indicate a fricative sound. Remember that by this time, the old <g> didn’t really represent a fricative anymore. In words like Edinburgh, the [x] from /ɣ/ had merged with the [x] version of /h/, so it is from /h/ that we get our <gh> spellings. Over time, these [h] and [x] pronunciations weakened and disappeared completely, bequeathing us their spelling to baffle future spelling bee contestants. We have them to thank for bright starry nights, the wind blowing in the high boughs of the trees. But before these sounds went, they left us one last piece to complete our <g> puzzle: after back vowels, sometimes [x] was reanalyzed as [f]. We’ve all been there, right? Your parents say something one way, but you completely mishear them and spend the rest of your life pronouncing it a different way. I mean, did you know the line in the Christmas song is actually colly7 birds, not calling birds? Now imagine that on a language-wide scale. I’m glad for the [f]s. They make laughing more fun, although sometimes convincing your phone not to mis-autocorrect these words can be rough. Had enough? Okay, I’ll stop.

The point of all this isn’t really about the spellings. Just look at all these beautiful sound changes! And this barely scratches the surface. A lot of the big sound changes that warrant fancy names seem to be all about vowels, but as <g> can attest, consonants have fun, too.8 Speaking of big, fancy vowel changes, get your tickets now because next week, Sabina’s going to talk about one of the most famous and most dramatically named: the Great English Vowel Shift.


1 It wasn’t a perfect system, though. Sometimes, a single scribe would spell the same word several different ways in the same document. Was this reflecting variations in utterances? An inability to decide which letter represented which sound? Transmission errors through copying down someone else’s writing? Who knows.
2 As far as the letter itself goes, the Anglo-Saxons actually used a slightly different symbol known as the insular g. The letter we use today was borrowed from the French during Middle English and is known as the Carolingian g.
3 It’s the voiced version of the sound at the end of Scottish loch. It can be heard today in the Dutch pronunciation of wagon.
4 Refresh yourself on the periods of English here.
5 Actually, draw, drag, and draught/draft are cognates. Knowledge, am I right?
6 This is actually one of my favorite phones. I’m a linguist. I’m allowed to have favorite phones.
7 Because they’re black like coal. And my heart.
8 Admittedly debatable and unnecessarily anthropomorphizing, but we’re already in this thing pretty deep.

Let’s get Laut! 2

Welcome back, fearless blog readers!

If you remember last week’s post, or if you speak English at all, you’ll remember that sometimes English words can behave… bizarrely.

Last time, we explored the reason why some plurals (like mice or geese) can be totally out of control. Today, it’s time to look at their far more complicated cousins, the so-called “irregular” past tense verbs. These are really part of a wider Germanic phenomenon called strong verbs, but their roots sink much, much further in the past. If you’re a native English speaker, maybe you’ve wondered from time to time why some verbs change so drastically in their past tenses; if you are or have been an English learner, you probably remember memorising those frustrating tables in school.

But why? Why are they like this? Why can’t they just be like everyone else?

Remember the two German siblings we introduced last week?

No, not the fairy tale ones. The anthropomorphised linguistic abstraction ones.

They look pretty good for having no discernible physical form at all. Also they like Spätzle and Bratwurst. Yummy!

We already thoroughly acquainted you with umlaut, and today we’re going to introduce his big sister, ablaut.

Hold on tight, this is going to be a wild ride!

The humble e

If you thought umlaut was old, get a load of this: his older sister ablaut goes back to Proto-Indo-European!

Her name literally means “sound gradation” in German, and she was given a name by none other than our old friend Jacob Grimm.

He (and other linguists during his time) noticed that in some Germanic verbs vowels alternated according to a predictable set of patterns. You might know these patterns as the so-called “irregular” verbs of English, such as swim/swam/swum.

Such patterns exist in all Germanic languages, but our linguist friends noticed that similar phenomena could be seen in other Indo-European languages, and not only in verbs. Ancient Greek, for example, exhibits similar patterns in nouns as well as verbs, and ancient Indian grammarians such as Panini had noticed it happening in Sanskrit millennia before, giving the different vowel grades fancy names such as guna and vrddhi.

From this evidence, our fearless heroes deduced that this system of vowel changes must go much further in the past than the birth of Germanic languages.

Today’s leading hypothesis is that all these changes spark from the same little source: the humble PIE vowel /e/.

This little vowel was PIE’s most important vowel. In fact, according to some theories, it might even have been its only vowel at some very early stage! How did the other vowels come about? Well, /a/ probably originated from a neighbouring consonant’s effect on /e/, while /i/ and /u/ probably arose out of the semivowels /j/ and /w/ respectively. The vowel /o/, on the other hand, came about because of ablaut.

You see, PIE /e/ was pronounced (or not pronounced, see below) in various, different ways depending on its position and the position of the main stress in the word. We call these different ways of pronouncing this most basic of vowels grades. Unfortunately, nobody has ever been able to figure out why this happened exactly, but we’re working on it, we promise.

In total, there were three basic grades and two lengthened grades. Let’s take a look at these changes using various forms of the PIE word *ph2ter-, ‘father’, as examples.[1] In these, the acute accent (é) indicates stress.

The three basic grades were the e-grade, which occurred when the stress was on the concerned vowel, as in

*ph2térm̥ (“father”, accusative)[2]

The o-grade, which turned the /e/ into /o/, and occurred when the stress came before the vowel, as in

*n̥péh2torm̥ (“fatherless”, accusative)[3]

And the zero-grade, where the /e/ just disappeared, which occurred when the stress came after the vowel, as in

*ph2trés (“father’s”, genitive)

When the e- and o-grades were found in the last syllable of a word, they became long vowels, giving rise to the lengthened grades (a line on the vowel, called a macron, indicates length), as in

*ph2tḗr (“father”, nominative)


*n̥péh2tōr (“fatherless”, nominative)
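The rules for the five grades fit into a few lines of Python. This is only a sketch of the description above: the function name grade is invented, syllables are counted from zero, the way I have split the example words into syllables in the comments is my own assumption, and the output leaves out the stress mark.

```python
def grade(vowel_syllable, stressed_syllable, last_syllable):
    """Return the shape PIE /e/ takes, given 0-based syllable positions."""
    if stressed_syllable > vowel_syllable:
        return ""  # zero-grade: the stress comes after the vowel
    # Stress on the vowel gives e-grade; stress before it gives o-grade.
    basic = "e" if stressed_syllable == vowel_syllable else "o"
    if vowel_syllable == last_syllable:
        return {"e": "ē", "o": "ō"}[basic]  # lengthened grades
    return basic

# Assumed syllable splits for the suffix vowel in each example:
print(repr(grade(1, 1, 2)))  # ph2-tér-m̥      e-grade
print(repr(grade(2, 1, 3)))  # n̥-péh2-tor-m̥   o-grade
print(repr(grade(1, 2, 2)))  # ph2-tr-és       zero-grade
print(repr(grade(1, 1, 1)))  # ph2-tḗr         lengthened e-grade
print(repr(grade(2, 1, 2)))  # n̥-péh2-tōr      lengthened o-grade
```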

Thousands of years of sound change in English have erased the effects of ablaut in nouns, but they can be seen in Ancient Greek. Using our examples above, here’s how they evolved in the language of Socrates:

*ph2térm̥ > patéra

*n̥péh2torm̥ > apátora

*ph2trés > patrós

*ph2tḗr > patḗr

*n̥péh2tōr > apátōr

Pretty similar, aren’t they?

This system of changes also applied to verbs, and, believe it or not, in early PIE all verbs behaved like the English irregular verbs! What a nightmare, eh?

Don’t commiserate with the poor Indo-Europeans, though. At the time, these changes were perfectly predictable and regular.

Thousands of years of sound change tend to wreck even the most clockwork-like of systems, however, and by the time Proto-Germanic made its entrance on the stage, the simple e/o/nothing system of Indo-European had been scrambled into a complex mess of vowels.

Proto-Germanic strong verbs are divided into seven classes, depending on the path that humble PIE /e/ took in its evolution into all the vowels we know and love today.

The… messy evolution of vowels in English certainly didn’t help, and while today these seven classes of verbs still technically exist, they’re very hard to tell apart. The strong verbs of English have become for all intents and purposes irregular, which is what they’re called in school grammars everywhere.

What about regular verbs (also called weak verbs) then? Well, some of them were once strong verbs which became weak somewhere along their history (such as help/helped, which was once help/holp), but most of them were not originally verbs at all! Proto-Germanic weak verbs come from other words (mostly nouns) which got turned into verbs through derivation.

So here’s the plot twist: irregular verbs are not rebels at all! They’re old fogeys, shaking their heads and tutting at the young and hip regular verbs staring at their mobile phones all day.

You millennials are so lazy. Back in MY day we took the trouble of changing our vowels in our past tenses!

Life is full of surprises.

  1. That “h2” thing is one of the consonants from which /a/ arose, incidentally.
  2. That dot under the “m” shows that it’s a separate syllable. In PIE, m, n, l, and r could behave like vowels!
  3. Bonus points if you noticed the e-grade in the first syllable!

Let’s Get Laut! (Part 1)

Mouse. Goose. Man. Swim. Drive. Bite.

These are some words students of English everywhere have learned to fear. Why? Because they’re rebel words: they won’t bow to the rules which would make English grammar so much simpler.

“Mouses”? That’s what the system wants, man! Go “mice”!

“Swimmed”? Pshaw! It’s “swam” or death!

Rise, Товарищ, smash the imperialist suffixes!

But why is it like that? Why can’t these words just behave and spare English students all the grief? Why do their vowels have to jump around like rocket-powered rabbits in a carrot field?

Well, turns out they have two very good reasons to do that, and those reasons are two lovely German siblings called umlaut and ablaut.

Aren’t they cute?

Let’s talk about the first of these for a bit.


Umlaut is the younger sibling: he’s just a little over 1000 years old!

His name literally means “sound alteration” in German, and he is a kind of assimilation or vowel harmony that appeared in two out of the three main branches of the Germanic family, leaving poor East Germanic behind.

Lots of sad goths out there.
Photo by Bryan Ledgard

Vowel harmony is a process in which the vowels of a word shift their sound to become more similar to another vowel, bringing them all into roughly the same part of the mouth (and therefore making it simpler to pronounce them in sequence).

In some languages, such as Finnish or Turkish, this process happens all the time, and the vowels in suffixes must be “adapted” to the vowels of the word they attach to for the result to be grammatically sound. For example, the vowels “a” and “ö” cannot appear together in any native Finnish word: if you want to add an “a” to a word with “ö” sounds, you have to turn it into “ä” first.
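For the programmers in the audience, here is a minimal sketch of that “adapting” step, using the Finnish suffix meaning “in” (-ssa or -ssä). It assumes a deliberately simplified rule: a word containing any of the back vowels a, o, u takes the back variant, and every other word takes the front one. Real Finnish also has compounds, loanwords and stem changes, all of which this ignores. The function name `inessive` and the constant `BACK_VOWELS` are invented for this example.

```python
import unicodedata

# Back vowels that force the back-vowel variant of the suffix
# (simplified rule; an assumption of this sketch).
BACK_VOWELS = set("aou")


def inessive(word: str) -> str:
    """Attach the Finnish 'in' suffix, picking -ssa or -ssä by vowel harmony."""
    # Normalise so that "ä" is one character and not "a" plus a combining mark.
    word = unicodedata.normalize("NFC", word)
    has_back_vowel = any(ch in BACK_VOWELS for ch in word.lower())
    suffix = "ssa" if has_back_vowel else "ss\u00e4"  # "\u00e4" is "ä"
    return word + suffix


print(inessive("talo"))   # talo means "house"
print(inessive("pöllö"))  # pöllö means "owl"
```

The word supplies the vowels and the suffix falls in line, which is exactly the direction of influence described above.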

Umlaut is a rather more limited form of vowel harmony, because it usually only extends one syllable to the left in languages in which it appears.

In Germanic, it only happened in the past, and only involved the vowels /a/, /u/ and, most importantly, /i/. In this post, we’re going to concentrate on the umlaut involving the vowel /i/, because it’s the one that most influenced modern English.

If Germanic words were American high schools (or Japanese ones, depending on your tastes in entertainment), then /i/ would have been the cool kid. Everyone wants to be like /i/: he’s smart, athletic and almost sinfully handsome.

Notice me, senpai!

Whenever he’s around, the back vowels /a/, /o/ and /u/ try to look like him, hoping to attract his attention. They never succeed entirely, since no one can be like /i/, but they come as close as they can. Only /e/ remains aloof: he’s a bookish geek, and doesn’t care about these status games.[1] Also, he’s already pretty similar to /i/, because he possesses the thing that makes /i/ so cool: frontness.

In the classroom of the mouth, /i/ and /e/ always sit in the front rows, near the teeth, while /a/, /o/ and /u/ are confined to the back, near the squishy soft palate. Ew.

When /i/ appears, everyone shuffles their desks forward to be near him. However, they can’t be too conspicuous, or they’ll appear desperate. That’s why they only move forwards if they are within one syllable to his left.

Suppose one of these words looks like this:

*mūs

Here’s /u/, happily minding its own business. But when the word is plural, it looks like this:

*mūsiz

Well, look who appeared on the scene! It’s good ol’ /i/, and he’s right in the next syllable! /u/ almost panics: this is his chance to be seen with the cool kid! He shuffles his desk forward and becomes /y/.

*mȳsiz

Time passes: /i/ and /z/ graduate from the school of language change and disappear from the word. /y/ is behind on a few exams and remains where he is.

mȳs

He’s really important now: if he moved his desk back and became /uː/ again, the speakers of the school’s language would not be able to tell the plural of the word from the singular!

Eventually, through hard study and the unrounding of front vowels in the passage between Old and Middle English, /y/ finally lives the dream: he becomes /i/! Now he’s the cool kid!

mīs

He’s hardly finished celebrating when the Great Vowel Shift sweeps the language like a storm, sending vowels flying all over the place. Now the singular form sounds like /maʊs/, and the plural like /maɪs/. Our words have now become:

mouse and mice

And that’s how they’ve looked ever since! To summarise, /u/’s path when near /i/ was /u/ > /y/ > /i/ > /aɪ/.

The other back vowels had similar paths: /o/ > /ø/ > /e/ > /i/ gave rise to pairs such as goose/geese, and /a/ > /æ/ > /ɛ/ gave rise to man/men.
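The very first step on each of these paths, the fronting itself, is regular enough to sketch in a few lines of Python. This is a toy illustration under loud assumptions: words are given as lists of pre-cut syllables, vowel length is ignored, and only the first arrow of each path (/u/ > /y/, /o/ > /ø/, /a/ > /æ/) is modelled. The name `i_umlaut`, the table `FRONTED` and the simplified syllable spellings are all made up for this example.

```python
# First step of each path: a back vowel and its fronted counterpart.
FRONTED = str.maketrans({"a": "æ", "o": "ø", "u": "y"})


def i_umlaut(syllables: list[str]) -> list[str]:
    """Front the back vowels of any syllable directly left of one containing /i/."""
    result = list(syllables)
    for index in range(len(syllables) - 1):
        # Only the immediately following syllable can trigger the change.
        if "i" in syllables[index + 1]:
            result[index] = syllables[index].translate(FRONTED)
    return result


print(i_umlaut(["mu", "siz"]))   # plural, with /i/ in the next syllable
print(i_umlaut(["mus"]))         # singular, no /i/ anywhere
print(i_umlaut(["man", "niz"]))  # plural, with /i/ in the next syllable
```

Because the check looks only one syllable to the right, a back vowel two or more syllables away from /i/ is left alone, matching the "one syllable to his left" restriction. The later steps (losing the ending, unrounding, the Great Vowel Shift) are separate sound changes and are not modelled here.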

What did the words that form their plural with regular -s have that set them apart from these? Well, it’s simple: their plurals didn’t involve /i/. Instead, they had some other, boring vowel. Usually /a/.

It’s important to note that this process only took place in native Germanic words. That’s why it’s goose/geese, but not moose/meese: the word “moose” is not Germanic at all! It comes from an Algonquian language of North America, and therefore never went through the umlaut process.

Finally, many words which once formed their plural through umlaut were later regularised to form it with -s. If this hadn’t happened, the plural of cow would be kye, and the plural of book would be… beech.

A veritable library.

So there you have it: that’s why some words in English have crazy plurals. What about the verbs with the crazy past tenses? Well, you’ll have to wait for a future post, when we’ll examine umlaut’s older sister, ablaut.

In the meantime, stay tuned for next week, when Rebekah will start us on a journey into why English spelling looks so bafflingly insane.

  1. Be like /e/, guys.