To boldly split what no one should split: The infinitive.

Lies your English teacher told you: “Never split an infinitive!”

To start off this series of lies in the English classroom, Rebekah told us last week about a common misconception regarding vowel length. With this week’s post, I want to show you that similar misconceptions also apply to the level of something as fundamental as word order.

The title paraphrases what is probably one of the most recognisable examples of prescriptive ungrammaticality – taken from the title sequence of the original Star Trek series, the original sentence is: To boldly go where no man has gone before. In this sentence, to is the infinitive marker which “belongs to” the verb go. But lo! Alas! The intimacy of the infinitive marker and verb is boldly hindered by an intervening adverb: boldly! This, dear readers, is thus a clear example of a split infinitive.

Or rather, “To go boldly”1

Usually an infinitive is split with an adverb, as in to boldly go. This is one of the more recognisable prescriptive rules we learn in the classroom, but the fact is that in natural speech, and in writing, we split our infinitives all the time! There are even chapters in syntax textbooks dedicated to explaining how this works in English (it’s not straightforward though, so we’ll stay away from it for now).

In fact, sometimes not splitting the infinitive leads to serious changes in meaning. Consider the examples below, where to is the infinitive marker, leave is the verb it belongs to and calmly is the adverb:

(a) Mary told John calmly to leave the room

(b) Mary told John to leave the room(,) calmly

(c) Mary told John to calmly leave the room

Say I want to construct a sentence which expresses a meaning where Mary, in any manner, calm or aggressive, tells John to leave the room but to do so in a calm manner. My two options to do this without splitting the infinitive are (a) and (b). However, (a) expresses more strongly that Mary was doing the telling in a calm way. (b) is ambiguous in writing, even if we add a comma (although a little less ambiguous without the comma, or what do you think?). The only example which completely unambiguously gives us the meaning of Mary asking John to do the leaving in a calm manner is (c), i.e. the example with the split infinitive.

This confusion in meaning, caused by not splitting infinitives, becomes even more apparent depending on what adverbs we use; negation is notorious for altering meaning depending on where we place it. Consider this article title: How not to raise a rapist2. Does the article describe bad methods in raising rapists? If we split the infinitive we get How to not raise a rapist and the meaning is much clearer – we do not want to raise rapists at all, not even using good rapist-raising methods. Based on the contents of the article, I think a split infinitive in the title would have been more appropriate.

So you see, splitting the infinitive is not only commonly done in the English language, but also sometimes actually necessary to truly get our meaning across. Although, even when it’s not necessary for the meaning, as in to boldly go, we do it anyway. Thus, the persistence of anti-infinitive-splitting smells like prescriptivism to me. In fact, this particular classroom lie seems like it’s being slowly accepted for what it is (a lie), and current English language grammars don’t generally object to it. The biggest problem today seems to be that some people feel very strongly about it. The Economist’s style guide phrases the problem eloquently3:

“Happy the man who has never been told that it is wrong to split an infinitive: the ban is pointless. Unfortunately, to see it broken is so annoying to so many people that you should observe it.”

We will continue this little series of classroom lies in two weeks. Until then, start to slowly notice split infinitives around you until you start to actually go mad.


1 I’ve desperately searched the internet for an original source for this comic but, unfortunately, I was unsuccessful. If anyone knows it, do let me know and I will reference appropriately.

2 This very appropriate example came to my attention through the lecture slides presented by Prof. Nik Gisborne for the course LEL1A at the University of Edinburgh.

3 This quote is frequently cited in relation to the split infinitive; you can read more about their stance in the matter in this amusing post:

Lies your English teacher told you: “Long” and “short” vowels

I remember, long ago in elementary school, learning how to spell. “There are five vowels,” our teachers told us, “A, E, I, O, U. And sometimes Y.” (“That’s six!” we saucily retorted. (We were seven.))

“When a vowel is by itself,” our teachers continued, “it’s short, like in pat. When there’s a silent e at the end, the vowel is long, like in pate1.” Then there were a dozen exceptions and addenda (including the fact that A could be five different sounds), but the long and the short of it was, there are long vowels and there are short vowels.

And you know something? There are long and short vowels in English. We actually briefly discussed this before, many moons ago during our introduction to vowels, but I wanted to add a little more detail today.

The first important thing to remember is that writing is not equivalent to the language itself.2 Our spellings are generally standardized now, but they are only representations of words, and they do not dictate how a word actually sounds. Furthermore, English orthography uses five or six symbols to represent more than a dozen different vowel sounds (not exactly an efficient system). In our example above of pat and pate, these words actually contain two distinct vowels pronounced in two different places in the mouth. The same is true of the other “long” and “short” vowel pairings. It’s almost like these sounds ([æ] and [eɪ], in IPA) aren’t really related, they just timeshare a spelling.

In another sense, though, it’s not so incorrect to say that pat has a short A and pate has a long A. To illuminate this claim, we’ll need two ingredients: an understanding of vowel tenseness in English, and an important sound change from the language’s past.

For scholars of English, a more important distinction than vowel length is vowel tenseness. Like the long/short vowel spelling distinction, linguists have identified pairs of vowels that are separated by no more than a little difference in quality. The difference, though, is not a matter of length, but whether the vowel is tense or lax, i.e. whether the muscles in the mouth are more tensed or relaxed in the production of the sound. These pairings are based on the sounds’ locations in the mouth and are therefore a little different than those traditionally associated with the letters. Pate and pet demonstrate a tense-lax pairing, as do peek and pick. The sounds in these pairs are very close together in the mouth, pulled apart by the tenseness, or lack thereof, of their pronunciation.

In some dialects of English, like RP or General American, tense vowels (and diphthongs) naturally acquire a longer duration of pronunciation than lax vowels. In short, the tense vowels are long. Therefore, it wouldn’t actually be false to say that pate has a long A and pat has a short A, but the length of the vowels is an incidental feature of English’s phonology and isn’t really the important distinction between the sounds (not for linguists, anyway).

It isn’t always that way in a language, and in fact, it wasn’t always that way in English. We’ve mentioned this before, but it’s pertinent, so I’ll cover it again: in some languages, you can take a single vowel (pronounced exactly the same way, in the same place in the mouth), and whether you hold the vowel for a little length of time or for a longer length of time will give you two completely different words. This is when it becomes important and appropriate to talk about long and short vowels. Indeed, farther back in English, this was important. In Old English, the difference between god (God) and gōd (good) was that the second had a long vowel ([o:] as opposed to [o], for the IPA fluent). In all other respects, the vowel was the same, what many English speakers today would think of as the long O sound.

In a way, these Old English long/short vowel pairings are really what we’re referring to when we talk about long and short vowels in English today (even if we don’t realize it). The historic long vowels were the ones affected by the Great English Vowel Shift, and the results are today’s colloquially “long” vowels. The short vowels have largely remained the same over the years. Maybe in this sense, as well, it’s not so bad to keep on thinking of our modern vowels as long and short. So many other quirky aspects of English are historic relics; why not this, too?

In the end, maybe the modern elementary school myth of long and short vowels isn’t entirely untrue, but there’s certainly a lot more to the story.


1 This is a delightful, if somewhat archaic, word for the crown of the head. I love language.
2 I imagine some of our longtime readers are fondly shaking their heads at our stubborn insistence on getting this message across. Maybe it’s time we made tee shirts.

Hello dear followers!

Welcome back from summer vacation! Sabina here and, boy, do we have a treat for you today! Today, we’re going to talk runes! When people see this fascinating little writing system, they tend to think of Vikings, so I guess it makes sense that one of our Nordic contributors writes this post. For me, runes were the initial introduction to linguistics (though I didn’t realise that at the time), and they are still very dear to my heart, so if I get a bit caught up in it, please forgive me.

Though it might make quite a bit of sense to think about Vikings when seeing a runic inscription, the runic writing system actually comes in many varieties and was used in a number of Germanic languages before the Latin alphabet.

First off, let’s check out some things that differ between the Runic writing systems and the one we are using here today (i.e. the Latin alphabet). There are, of course, a number of them, but let’s check out some basic differences for now.

Let’s start with looking at the material on which most Runic inscriptions are found (it’ll be important in a sec, I promise): Rather than paper, most runic inscriptions are found on wood, stone, or even metal. This may just be due to easy access; it was certainly a lot easier to get a hold of a piece of rock than parchment in the days when runic writing was used.

Now, this is where the material becomes important: runes distinctly lack a rounded shape, most of them being angular. One could argue that this may just have been easier to carve into the hard surface, but some believe that the angular shape actually reveals something more about the origins of the runic writing system. You might be thinking, “it must be somewhere in Scandinavia” because you got hung up on Vikings. That, however, may be far from the truth (though, as in most things concerning historical linguistics, we simply can’t know for sure). Some argue that the lack of rounded shapes in the Runic alphabets may be an indication of an Old Italic origin (remember, Latin is an Italic script). Some Old Italic scripts, e.g. Etruscan or Raetic, share this angular property with the Runic alphabets, and some scholars argue that the Runic alphabets are derived from these, probably through early contact between the Germanic languages and the Old Italic ones. Some even believe that runes might actually derive from the Latin alphabet itself. So, while you might be inclined to think that there is a world of difference between the symbols used to write ‘ᚺᛖᛚᛚᛟ’ and ‘hello’, the symbols used in the former may be derived from an ancestor of the latter! (I love writing systems, have I ever said so? Well, it’s worth saying again).

Now, two more things to be noted about the Runic alphabets, before we dig into an overview of the ones that have been used: firstly, in the earliest Runic inscriptions, they didn’t have a fixed writing direction. This means that, unlike our modern script, the earliest Runic inscriptions could be written (and read) either left-to-right or right-to-left (trust me, you want to keep this in mind if you plan to study early runic inscriptions to any great extent. It can get really confusing otherwise, since the writing direction may actually change within the same inscription). It stabilized into a left-to-right pattern later on, though.
Secondly, word division is not commonly used. Basically, itmeansthatrunesarewrittenlikethis. Kinda hard to read, huh? (Alright, I was kinda nice to you guys and put in some word division in my hello today but, really, something like this: ᚺᛖᛚᛚᛟᛞᛖᚫᚱᚠᛟᛚᛚᛟᚹᛖᚱᛋ would be more correct.) Check out the Franks Casket, an amazing little relic with an Old English poem written in runes on it, to see an example of how this may look. Actually, check out the Casket even if you don’t want to see this specifically; it is still awesome. Sometimes word division was indicated by one or more dots, but that was somewhat unusual.
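If you'd like to play with this yourself, here is a rough Python sketch of writing without word division. Everything in it is made up for illustration: the `RUNES` table and the `to_runes` function are hypothetical names, the table only covers the letters in the greeting above, and swapping Latin letters for runes one at a time is a modern game, not a claim about how any historical language was actually spelled.

```python
# Hypothetical letter-to-rune table: only the letters needed for the greeting.
RUNES = {
    "h": "ᚺ", "e": "ᛖ", "l": "ᛚ", "o": "ᛟ", "d": "ᛞ",
    "a": "ᚫ", "r": "ᚱ", "f": "ᚠ", "w": "ᚹ", "s": "ᛋ",
}


def to_runes(text, word_division=False, right_to_left=False):
    """Swap each Latin letter for a rune, one letter at a time.

    Letters missing from RUNES raise a KeyError; the table is deliberately tiny.
    """
    words = ["".join(RUNES[ch] for ch in word) for word in text.lower().split()]
    # Without word division, the words are simply run together.
    line = (" " if word_division else "").join(words)
    # A right-to-left line is modelled here as the same runes in reverse order.
    return line[::-1] if right_to_left else line


print(to_runes("hello dear followers"))
print(to_runes("hello dear followers", word_division=True))
print(to_runes("hello", right_to_left=True))
```

The first call runs the words together, the second keeps the spaces, and the third only reverses the order of the runes, which is a simplification of what a carver would have done.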

Now, let’s dig into the most famous Runic alphabets, shall we?

Some of you may think that there was just one kind of runic alphabet – you’re in for a treat! There were, in fact, several. We will mention three today: the Elder Futhark, the Younger Futhark and the Futhorc. Notice the names are very similar? Well, that’s because the alphabets are named after the first six letters, which just happen to spell out ‘futhark’ (or futhorc).

The Elder Futhark is the oldest recorded variety of the runic alphabets, used approximately between the 2nd and 8th centuries AD. It consisted of 24 characters, typically divided into three ættir (compare with Swedish ‘ätter’ meaning ‘family/clan’), each ætt including eight characters, as below.


As you may know, runes were also considered to have certain magical properties, and the very word ‘rune’ means ‘secret’ or ‘mystery’. Though we won’t go into detail here, the first ætt is typically considered to be the ætt of the Norse fertility deities Frey and Freya. The second is the ætt of Heimdall, the guy who watches for the start of Ragnarök (the end of the world, in case you missed the movie), while the third is considered to be the sky god Tyr’s1.

Now, the Elder Futhark eventually gave way to the Younger Futhark around the 8th century. The Younger Futhark is a reduced version of the Elder Futhark and only contains 16 letters. The Younger Futhark is the Runic alphabet most people think about when we’re talking Viking runes. However, even in the Viking countries (i.e. the Scandinavian ones), the Younger Futhark varied. In Denmark, we can recognise so-called ‘long-branch’ runes:

While in Sweden and in Norway, we see ‘short-twig’ runes:

Let’s complicate it just a liiiittle bit more because in Sweden, you have yet another set called the Bohuslän runes, used specifically in the west coast region (Bohuslän), north of (and including) the city of Gothenburg (coincidentally, my hometown). Interestingly enough, this is a set of not 16 letters but 26; 2 more than the original Elder Futhark.

Alright, now that we’ve covered the Elder and Younger Futhark, let’s step over to the Futhorc. Notice the difference in name? Based on what we’ve said previously on language change and the early Germanic dialects, do you think you could guess who used these runes?

Do you have an answer in mind? Is it perhaps the Anglo-Saxons? In that case, you are absolutely right!

The Anglo-Saxon runes, or the Futhorc, are an extended, rather than reduced, version of the Elder Futhark. Instead of the Elder Futhark’s 24 letters, the Futhorc has between 26 and 33 letters (yeah, I know, but I can’t give you a definite number!). How these runes wound up in the UK (where you can find them on, for example, the Franks Casket mentioned above or the Kingmoor ring, which is inscribed with a magical formula) is still much discussed, though one hypothesis is that the Futhorc was developed in Frisia. The language of Frisia, Old Frisian, is a close relative of Old English and, indeed, we do find that these runes were also used in Old Frisian. Another suggestion is that the Vikings brought them over and the Anglo-Saxons modified them a bit and then spread them to Frisia.

Anyway, the Futhorc was used in England from approximately the 5th century all the way up to the 10th or 11th century. Its use was in decline from about the 7th century, and it largely ceased after the Norman Conquest. Despite this, you can actually see a couple of the old runic symbols tagging along during the Middle English era, as well, specifically the letter wynn <ƿ> and the letter thorn <þ>. Now, while these might look similar, do not mix them up! In Modern English, the former is the letter <w> while the latter is the digraph <th>, so you may get very confused if you do. Also, if you are to read a Middle English manuscript you might come across a letter that looks suspiciously like <y>. Don’t confuse that one, either. It may be either wynn or thorn, and the scribe just missed the line that connects the rounded shape to the vertical line. In fact, this kind of confusion is exactly where we get ‘ye’, as in ‘ye olde’, from.

Right, sidetracked. Getting back to it.

Anyway, the Futhorc looks like this:

Quite a difference from what we saw in the Elder and Younger Futhark, huh? Like everything else in language, variation is the spice of life; it just adds a bit of zest, don’t you think? (Though, admittedly, making it all the more difficult to learn.)

I’ve hammered you with runes for quite a bit today, haven’t I? I did try to restrain myself, honest, but runes are just so awesome, I couldn’t help myself.

Until next time, ladies and gents. I hope you enjoyed our little runic talk! Come back to us in two weeks when our amazing Riccardo will be here to talk to you about the endangered languages of Italy!


Chaos? Nah, just a vowel shift

Dearest creature in creation,
Study English pronunciation.
I will teach you in my verse
Sounds like corpse, corps, horse, and worse.
I will keep you, Suzy, busy,
Make your head with heat grow dizzy.
Tear in eye, your dress will tear.
So shall I!  Oh hear my prayer.
Pray, console your loving poet,
Make my coat look new, dear, sew it!


Just compare heart, beard, and heard,
Dies and diet, lord and word,
Sword and sward, retain and Britain.
(Mind the latter, how it’s written.)
Now I surely will not plague you
With such words as plaque and ague.
But be careful how you speak:
Say break and steak, but bleak and streak;
Cloven, oven, how and low,
Script, receipt, show, poem, and toe.

Finally, which rhymes with enough —
Though, through, plough, or dough, or cough?
Hiccough has the sound of cup.
My advice is to give up!!!


Gosh, English pronunciation can be really tricky at times, can’t it? Interested in knowing why?

Well, of course you are! Let’s dive into it together!

As the excerpt above clearly shows, English spelling is often considered a bit ‘off’, poorly corresponding to the spoken word. That’s true, it often doesn’t. But why is that?

Well, while it is not the only reason behind this tricky correspondence between the spoken and written word, today’s topic does explain a lot: the ‘Great’ English Vowel Shift (let’s stick to calling it the GVS from now on) came along and messed things up quite a bit.

Some of you will probably have heard about the GVS before; it was a significant sound change that occurred primarily during the Middle Ages. This sound change affected the long vowels of Middle English, causing them to shift like so:



Great, so… we done here? You now know everything there is to know about the GVS, right?

Nah, not really.

First, the GVS is actually considered by a lot of linguists to be a process of at least two phases3:

The first phase is considered to have lasted up until approximately the year 1500. During this phase, the long high Middle English vowels /i:/ and /u:/, pronounced similarly to the vowels in Modern English meet [mi:t] and lute [lu:t], diphthongised and eventually became the modern English diphthongs /aɪ/ and /aʊ/, the pronunciations you find in mice [maɪs] and mouse [maʊs]. The vowels immediately below them, that is /e:/ and /o:/4, raised one position, falling into the slots previously held by /i:/ and /u:/.

In the second phase, often considered to have been active between the late 16th and mid-17th centuries, the remaining vowels, that is /ɔ:, a:, ɛ:/, raised one position in height.

What we eventually wind up with is a system of vowels completely changed from its predecessor.
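For those who like to see things spelled out, the two phases described above can be sketched as a pair of lookup tables in Python. This is only a toy model under loud assumptions: the names `PHASE_ONE`, `PHASE_TWO` and `shift` are invented for this illustration, the intermediate diphthong stages are skipped, and each vowel is simply given the single outcome mentioned in this post.

```python
# Phase one: the high vowels diphthongise, and the vowels just below move up.
PHASE_ONE = {"i:": "aɪ", "u:": "aʊ", "e:": "i:", "o:": "u:"}

# Phase two: the remaining long vowels each raise one position.
PHASE_TWO = {"ɔ:": "o:", "a:": "ɛ:", "ɛ:": "e:"}


def shift(vowel):
    """Return the outcome of a Middle English long vowel after both phases."""
    # Each vowel is looked up once per phase, so within a phase all vowels
    # move together instead of one change feeding the next.
    after_phase_one = PHASE_ONE.get(vowel, vowel)
    return PHASE_TWO.get(after_phase_one, after_phase_one)


for vowel in ["i:", "e:", "ɛ:", "a:", "ɔ:", "o:", "u:"]:
    print(vowel, "->", shift(vowel))
```

The one-lookup-per-phase design is the point of the sketch: if the second table were applied over and over, /a:/ would be dragged up past /ɛ:/, which is not what a chain shift does.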

Now, why would that happen?

As with a good number of things in historical linguistics, we don’t exactly know. However, there are two leading hypotheses out there.

The first is the so-called push-chain theory, which was introduced by the great German philologer Karl Luick as early as 1896. Luick argued that the GVS must have been initiated by the movement of the lower vowels /e:/ and /o:/. The two vowels, for some mysterious reason of their own, started to move toward the high vowels /i:/ and /u:/. As they drew nearer, /i:/ and /u:/ started panicking because, it is sometimes argued, they couldn’t raise any higher and remain vowels (instead becoming yucky consonants, bläch).

Well, can’t have that, can we? In pure desperation, /i:/ and /u:/ look for a way out. And they find one—move in (or out, if you will). So, that is precisely what they do, they move in: they become diphthongs, lower and, suddenly, Middle English /i:/ and /u:/ become modern English /əɪ/ and /əʊ/, eventually becoming /aɪ/ and /aʊ/. Tadaa, we have the first steps to a modern English vowel system.

Luick’s hypothesis is actually quite elegant in a way because it successfully explains the lack of diphthongisation of /u:/ in the northern dialects of British English. In these dialects, the vowel /o:/ had previously fronted, becoming /ø:/. The northern dialects therefore didn’t have a vowel /o:/ to push /u:/ out of its place, and the diphthongisation never happened there (pretty neat, huh?).

The second of our hypotheses, the drag-chain theory, was introduced by Otto Jespersen in 1909. Now, Jespersen argued that it was equally likely that the diphthongisation of the high vowels initiated the shift. Basically, Jespersen’s reasoning was like this:

The high vowels, i.e. /i:, u:/, shifted and became diphthongs. That left a ‘gap’ in the vowel system. Horrified, the lower vowels scrambled to move up the ladder to fill the gaps. All of a sudden, Middle English /a:/ became early Modern English /ɛ:/, Middle English /ɛ:/ became early Modern English /e:/ and so on (the back vowels tagged along, too), and so, harmony was restored.

Now, the flaw of this hypothesis (to me, at least) is that it doesn’t account for the non-diphthongisation of northern /u:/. Then again, Luick’s claim that the high vowels couldn’t raise any higher has been noted to be somewhat limited: the high vowels could have done several other things to avoid becoming consonants5. But that’s a different discussion.

Regardless of which of these hypotheses you want to consider more likely, this is the ‘Great’ English Vowel Shift: a huuuuge chain shift that took centuries to complete and affected all long vowels of Middle English. That’s a pretty big deal.

Now, you might be wondering what this has to do with spelling, right? Well, you see, the thing is that English spelling started to become standardized during the ongoing GVS. What this means is that we have a bunch of words where the written form corresponds to a pronunciation that is centuries old. So, basically, meet and meat, both pronounced [mi:t] in British English, are spelled differently because, when those high and mighty people speaking Middle English decided that there was a correct way to spell those words, they did have distinct pronunciations!

So, next time you get annoyed by having to look up how you spell something, just stop and consider that you’re actually spelling the word the way it was pronounced about 600 years ago. Pretty cool, huh?





Oh, oh! I almost forgot! Have you been asking yourself why I keep using ‘’ around ‘Great’? No? Well, I’m going to tell you anyway!

The ‘Great’ was introduced by Jespersen and, frankly, while the GVS did indeed have a huge effect on the English language, vowel shifts happen all the time. So, take the ‘Great’ with a pinch of salt and a shot of tequila and we might get on the right track of things.





Side notes

1.   There is nothing to say that either of these hypotheses is an accurate description of the initial process of the GVS. Long before I took my first bumbling steps into academia (actually, about a year before I was even born), Donka Minkova and Robert Stockwell noted that it may just be the desire to see a systematic aspect of language and discount its random quirks. So, don’t take it too seriously.

2.     If you’d like to read more about the GVS and other hypotheses, please take a look at Gjertrud Flermoen Stenbrenden’s dissertation work The Chronology and regional spread of long-vowel changes in English from 2010. It’s a really interesting read and introduces a lot more on the subject than I could possibly cover here.


Today’s post is brought to you by the letter G

It’s time for the HLC with our very special guest, Proto-Germanic! Yaaay!

Ah, English spelling. That prickly, convoluted briar patch that, like an obscure Lewis Carroll poem, often falls just a little too shy of making sense. Or does it?

It wasn’t always like this. English spelling actually used to be pretty phonetic. People would just write down what they heard or said.1 Then, the printing press was introduced. Books and pamphlets began to be mass produced, literacy levels rose, and spelling began to be standardized. At the same time, English continued to move through some fairly dramatic shifts in pronunciation. The language moved on as the spellings froze.

Throughout the years, people have occasionally called for reforms in English spelling. Like that time in the early 20th century when Andrew Carnegie, Melvil Dewey, Mark Twain, Theodore Roosevelt, et al. colluded to “improve” some of the more confusing orthographic practices of English. Personally, this linguist is glad such efforts have by and large failed.

Sure, you could look at English spellings and tear at your hair at the monumental insanity of it all. But I like to think of our spellings more as fossils preserving the dinosaur footprints of earlier pronunciations. Granted, sometimes the footprints are from five different species, all overlapping, and there’s, like, a leaf thrown in.

Where are they all going?!

Let’s take, for example, the letter <g>2 and its many possible pronunciations.

First on the menu is the classic [g], a sturdy stop found in words like grow, good gravy, and GIF. This dish originates in the Proto-Germanic (PGmc) voiced velar fricative /ɣ/3. (Refresh your memory on our phonological mumbo-jumbo here.) This velar fricative had a bit of an identity crisis during Old English (OE)4, spurred on by hanging out with sounds all over the mouth.

“But what we found out is that each one of us is a front vowel…and a back vowel…and a palatal approximant…an affricate…and a voiced velar stop…Does that answer your question?”

Around front vowels (such bad influences—triggering umlaut wasn’t enough for them?), it became [j], as in year, from OE ġēar. Between back vowels (the big bullies), it became [w], as in to draw, from OE dragan5. At the end of words, it lost its voicing and became [x] (the sound in loch), as in our own dear Edinburgh (whose pronunciation has since changed again). Ah, but before back vowels, and when backed up by sonorants like [ɹ], it held its ground a little better and became our trusty [g].
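The identity crisis above can be sketched as a small decision function in Python. Treat it as a cartoon, not a grammar of Old English: the function name `reflex_of_gamma`, the three sound sets and the order in which the conditions are checked are all assumptions made for this illustration, and single letters stand in very crudely for the neighbouring sounds.

```python
# Crude, hypothetical sound classes; letters stand in for sounds.
FRONT_VOWELS = set("ieæy")
BACK_VOWELS = set("aou")
SONORANTS = set("rlmn")


def reflex_of_gamma(before, after):
    """Guess what /ɣ/ turned into, given the sounds on either side.

    `before` and `after` are single letters, or None at the edge of a word.
    """
    if before in FRONT_VOWELS or after in FRONT_VOWELS:
        return "j"  # around front vowels
    if before in BACK_VOWELS and after in BACK_VOWELS:
        return "w"  # between back vowels
    if after is None:
        return "x"  # at the end of a word
    if after in BACK_VOWELS or after in SONORANTS:
        return "g"  # before back vowels, or next to a sonorant
    return None  # an environment this sketch does not cover


print(reflex_of_gamma(None, "e"))  # the start of OE ġēar
print(reflex_of_gamma("a", "a"))   # the middle of OE dragan
print(reflex_of_gamma("r", None))  # word-final, as in Edinburgh
print(reflex_of_gamma(None, "r"))  # the start of grow
```

Checking the front-vowel condition first is a choice made for this sketch; the post itself only lists the environments and does not say which one wins when two of them apply.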

As you may have noticed, a lot of the sounds that came from /ɣ/ are no longer spelled with <g>. Alas. We’ll come back to how Edinburgh wound up with an <h> in a minute.

But first, there was another sound that came from PGmc /ɣ/. Old English had something going on called gemination. Sometimes, it would take a consonant and double its pronunciation. Like the <kk> in bookkeeper. Bookkeeper is just fun to say, but these long consonants were actually important back in OE. The wheretos and whyfors of gemination are another story, but just like how /ɣ/ became [j], the geminate /ɣɣ/ was pulled forward and dressed in new clothes as the affricate [d͡ʒ], like in bridge and edge, from OE bryċg and eċg.

Gemination didn’t get around much. It was pretty much restricted to the middle of words. When mushy, unstressed endings began to fall off, the leftovers of gemination found themselves at the end of words, but a little nudge was needed before [d͡ʒ] found its way to the prime word-initial position. Later on in Middle English, the language ran around borrowing far more than a cup of sugar from its neighbor across the Channel. As English stuffed its pockets with French vocabulary, it found a few French sounds slipped down in among the lint. One of those was Old French’s own [d͡ʒ], which on the Continent was simplifying to [ʒ]6 (the <s> sound in measure). This [ʒ] sound didn’t exist in English yet. Our forefathers looked at it, said “nope,” and went on pronouncing it [d͡ʒ]. Thus we get words like juice, paving the way for later words like giraffe and GIF.

This is a GIF. Or is it a GIF? I mock you with my scholarly neutrality.

It was only later, after the end of Middle English, that /ʒ/ was added to the English phoneme inventory, retaining its identity in loanwords like garage and prestige. It’s worth noting, however, that these words also have accepted pronunciations with [d͡ʒ].

Alright, so what about the <gh> in Edinburgh? It turns out there’s another sound responsible for the unpaid overtime of the letter <g>. Meet the sound /h/. In Middle English, Anglo-Norman scribes from France introduced a lot of new spellings, including <gh> for /h/. The <h> part of the <gh> digraph was probably a diacritic meant to indicate a fricative sound. Remember that by this time, the old <g> didn’t really represent a fricative anymore. In words like Edinburgh, the [x] from /ɣ/ had merged with the [x] version of /h/, so it is from /h/ that we get our <gh> spellings. Over time, these [h] and [x] pronunciations weakened and disappeared completely, bequeathing us their spelling to baffle future spelling bee contestants. We have them to thank for bright starry nights, the wind blowing in the high boughs of the trees. But before these sounds went, they left us one last piece to complete our <g> puzzle: after back vowels, sometimes [x] was reanalyzed as [f]. We’ve all been there, right? Your parents say something one way, but you completely mishear them and spend the rest of your life pronouncing it a different way. I mean, did you know the line in the Christmas song is actually colly7 birds, not calling birds? Now imagine that on a language-wide scale. I’m glad for the [f]s. They make laughing more fun, although sometimes convincing your phone not to mis-autocorrect these words can be rough. Had enough? Okay, I’ll stop.

The point of all this isn’t really about the spellings. Just look at all these beautiful sound changes! And this barely scratches the surface. A lot of the big sound changes that warrant fancy names seem to be all about vowels, but as <g> can attest, consonants have fun, too.8 Speaking of big, fancy vowel changes, get your tickets now because next week, Sabina’s going to talk about one of the most famous and most dramatically named: the Great English Vowel Shift.


Is English a creole?

Hi all!

By now, I figure most of you have noticed that when a post shows up at the HLC about the development of the English language in particular, I show up. Today is no exception to the rule (though there will be some in the future)!

Anyway, it’s safe to say that England has been invaded a lot during the last couple of… well, millennia. All this invading and being invaded by non-native people had a tremendous effect on most things English, the English language among them.

This is, of course, nothing new. I’ve previously discussed the question of whether English is a Romance language, but today, we’re going to jump into something different, namely, the question of whether English is a creole.

In order to do that, I’ll first need to say a few words about what a creole actually is, and we’re going to do the basic definition here: a creole is a pidgin with native speakers.

That… didn’t clear things up, did it?

Right, so a pidgin is a form of language that develops between two groups of people who don’t speak the same language but still need to understand each other for one reason or another.

Typically, in the formation of a pidgin, you have a substrate language and a superstrate language. The substrate is the ‘source’ language. This language is, usually for political reasons, abandoned for the more prestigious superstrate language.

But not completely. Instead, the pidgin becomes a sort of mix, taking characteristics of both the substrate and the superstrate to create a ‘new’ language. A rather distinct characteristic of this new language is that it is typically less grammatically complex than both the sub- and the superstrate language. Another distinct characteristic is that it has no native speakers since it’s in the process of being created by native speakers of two different languages.

But, it can get native speakers. When a new generation is born to pidgin-speaking parents, and the new generation acquires the pidgin as their native tongue, the pidgin ceases to be a pidgin and becomes a creole. So, a creole is a pidgin with native speakers. Typically, a creole becomes more grammatically complex, developing into a new language that is a mix of the two languages that created the pidgin.

But enough of that. Question is: is English a creole?

Well, there are reasons to assume so:

There is a distinct difference between Old English and Middle English, the primary one being a dramatic discrepancy in grammatical complexity, with Middle English being far simpler. As we now know, this is one of the primary features of a pidgin.

There were also politically stronger languages at play during the relevant time periods that just might have affected Old English so much that it was largely abandoned in favour of the other language.

First came the Vikings…


One often thinks about murder and plunder when thinking about the Vikings, but a bunch of them settled in Britain around the 9th century (see Danelaw) and likely had almost daily contact with Old English speakers. This created the perfect environment for borrowing between the two languages.

But see, Old Norse, at least in the Danelaw area, was the politically stronger language. Some people claim that this is the cause of the extreme differences we see when Old English transitions into Middle English.

One of the main arguments for Old Norse as the superstrate is a particular borrowing that stands out. Though English borrowed plenty of words from Old Norse, for example common words like egg, knife, sky, sick, wrong, etc., it also borrowed the third person plural pronouns: they, them, their (compare Swedish de, dem, deras).

This is odd. Why, you ask? Well, pronouns are typically at what we might call the ‘core’ of a language. They are rarely borrowed because they are so ingrained in the language that there is no need to take them from another.

The borrowing of the pronouns from Old Norse implies a deep influence on the English language. Combined with all other things that English borrowed from Old Norse and the grammatical simplification of Middle English, this has led some linguists to claim that English is actually an Old Norse/Old English-based creole.  

We’ll discuss that a bit more in a sec.

After the Vikings, the Brits thought they could, you know, relax, take a deep breath, enjoy a lazy Sunday speaking English…

And then came the French…


Now, here, there’s no doubt that French was the dominant language in Britain for quite some time. The enormous number of lexical items that were borrowed from French indicates a period of prolonged, intense contact between the two languages and, again, the grammatical simplification of Middle English in comparison to Old English might be reason enough to claim that Middle English is a creole of Old English and Old French.

And a good number of linguists2 have, indeed, said exactly that. This is known as the Middle English creole hypothesis and it remains a debated topic (though less so than it has been historically).

‘But, Sabina,’ you might ask, ‘I thought you were going to tell me if English is a creole?!’

Well, sorry, but the fact is that I can’t. This one is every linguist (or enthusiast) for themselves. I can’t say that English is not a creole, nor can I say that it is one. What I can say is that I, personally, don’t believe it to be a creole.

And now, I’ll try to tell you why.

It is true that Middle English, and subsequently modern English, is significantly less grammatically complex than Old English. That’s a well-evidenced fact. However, that simplification was already happening before French came into the picture, and even before Old Norse.

In fact, the simplification is often attributed to a reduction of unstressed vowels to schwa (good thing Rebekah covered all of this, isn’t it?) which led to the previously complex paradigms becoming less distinct from each other. Might not have anything to do with language contact at all. Or it might.

The borrowing of Old Norse pronouns is, indeed, unusual, but not unheard of, and studies have shown that the effect of Old Norse on English may not be as significant and widespread as was once believed.

When it comes to French, the creole hypothesis is intriguing and well worth pursuing out of interest, but extensive borrowing is not sufficient evidence to claim that a creole has been created. Extensive borrowing occurs all the time among languages in long, intense contact.


Given the evidence of grammatical simplification before either Old Norse or French came to play a significant role in English, and the trouble we run into when asking when English was ever a pidgin, I personally find both creolization hypotheses unlikely.

However, I encourage you to send us a message and tell us what you think: is English a creole?

Tune in next week when the marvellous Rebekah will dive into the Transatlantic accent!

Is English a Romance language? On language families and relationships

Today, I’m going to talk about language families! When I say this, I believe that most of you will have, on some level, an intuitive hunch about what I mean. If we were to compare a couple of common words found in, for example, Spanish and Italian, we would find that they are often very similar or, in some cases, even identical. Take a look:

Spanish    Italian    English translation
vivir      vivere     live
boca       bocca      mouth
tú         tu         you

Similarly, if we were to look at Swedish, Danish and Norwegian:

Swedish    Danish    Norwegian    English translation
leva       leve      leve         live
mun        mund      munn         mouth
du         du        du           you

You see the similarities? Now, why is that, you might wonder. Well, because they are related!

In the linguistic world, related languages are languages that have so much in common that we cannot claim that it is merely due to extensive contact and/or borrowing. These languages, we say, are so similar that there can be no other reasonable explanation than that they descend from a common source: a mother language, as it were. In the case of Spanish and Italian, the mother is Latin, while in the case of Swedish, Danish and Norwegian, the language is Old Norse.

Now, it would be convenient if it stopped there, wouldn’t it? But, of course, it doesn’t. Like any family, the mother also has a mother and other relatives, like siblings and cousins. Old Norse, for example, has its own sisters: Old High German, Old Frisian, Old English, etc., which all share the same mother: Proto-Germanic. This is the Germanic language family.

Spanish and Italian also have sisters: French, Portuguese, Romanian, etc., and their common mother is Latin. This is the Romance language family, deriving from Vulgar Latin. But, of course, Latin has its own sisters, for example Umbrian and Oscan, and together with its sisters, Latin forms the Italic language family.

Does it feel a bit confusing? Well, that’s understandable and I’m going to kick it up a notch by adding that the Italic language family, with languages like Spanish and Italian, and the Germanic language family, with languages like Swedish and Danish, actually have the same mother: Proto-Indo-European (or just Indo-European).

The mother in this case is veeeery old, and we actually don’t have any kind of evidence of how it looked! Indo-European is a reconstructed language, more commonly known as a proto-language (as you may have noticed, we call the mother of the Germanic family Proto-Germanic, meaning that it is also a reconstructed language). It has never been heard, never been recorded and no one speaks it. Then how the heck do we know anything about it, right? Well, that has to do with something called the comparative method, which we’ll explain in another post.  

Like human families, language families can be represented in the form of a family tree:*

Clear? Well, hate to tell you this, but this is an extremely simplified version using only examples from these two subfamilies. The “real” Indo-European language family tree looks somewhat more like this:1

You’re kinda hating me right now, aren’t you?

As you can see by the tree above, some languages that you might never expect are actually related. Let’s take as an example Standardised Hindi and German. Here are some common words in both languages:

German     Hindi                English translation
Mädchen    लड़की (ladakee)       girl
Hallo      नमस्ते (namaste)       hello
Hunger     भूख (bhookh)         hunger

Looking at these words, it is unlikely that you would draw the conclusion that the two languages are related. Looking at the language tree, however, you can see that linguists have concluded they are. Now, you’re probably staring at your screen going “whaaaat?” but, indeed, they are both descendants of Indo-European and are therefore related.

While Indo-European is clearly a large group of languages, it is not the only one (or even the largest). Looking a bit closer at the Indo-European language family, you will notice that languages such as Mandarin and Finnish are not included. These belong to other families, in this case the Sino-Tibetan and Finno-Ugric (or Uralic, depending on your definition) language families respectively.

All in all, there are approximately 130 language families in the world today. Some are related, some are not, just like we are. The largest family is the Niger-Congo language family, having (as recorded in 2009) 1,532 languages belonging to it. (Indo-European comes in a poor 4th place with approximately 439 languages.)2

So, looking at languages is kinda like looking at your own family tree: every mother will have a mother (or father, if you want, but traditionally, linguists call them mothers and daughters). Some branches will have siblings, cousins, second cousins and so on. Some will look nothing like their relatives (or, well, little anyway) and some will be strikingly similar. That’s just the way families work, right?

So, now, we’ve reached a point where I can answer the question in the title: Is English a Romance language?

While this is a much-debated question (do a Google search and see for yourself), the simple answer is: no, it’s not. At least, not to a linguist. Now, you might be sitting at home, getting more and more confused because a lot of English vocabulary can be traced back to Latin (the word ‘vocabulary’ being one of those words, actually).

But when linguists say that a language is a Romance language, we are referring to the relationship illustrated in the tree structure, i.e. the language has Latin as its mother. English, then, despite having borrowed a substantial part of its vocabulary from Latin (and later from French, itself a daughter of Latin), is not in itself a daughter of Latin. English is a daughter of Proto-Germanic, and thus it is a Germanic language.

However, Latin and Proto-Germanic are both daughters of Indo-European. Latin and English are therefore clearly related, but the relationship is more like that of a beloved aunt rather than a mother (if, you know, the beloved aunt refused to recognise you as a person unless you imitated her).

At the end of the day, languages are like any other family: some relationships are strong, some are weak, some are close, some are not.

Tune in next week when Riccardo will delve into another branch of language families: constructed languages.

Happy Holidays from the HLC

We here at the Historical Linguist Channel would like to wish you happy holidays. Whether you celebrate Christmas, Hanukkah, Yule, or nothing at all this time of year, whether your New Year comes with 1 January or the first new moon, we hope the rest of December treats you right.

We’re going to pause the semi-serious linguistics for a few weeks to spend time with our loved ones. We’ll be back 4 January with Phonology 101 and more, and in the meantime, Fun Etymology Tuesdays will continue uninterrupted over on our Facebook page.

As our gift to you, here’s a topical story from the history of English:

Once upon a time (let’s call it 1536), a poor guy named William Tyndale was executed for heresy after a merry chase across Europe that abruptly came to an end when he was betrayed in Belgium. His crime? Translating the Bible into English.

The charge of heresy was completely silly and unfair for several reasons:

  1. The Bible was already available in most of the other major languages of Europe.
  2. Two years later, King Henry VIII, the very same who had so adamantly insisted that Tyndale be apprehended, authorized an official English translation of the Bible; it drew heavily from Tyndale’s translation, as did the famous translation later commissioned by King James I.
  3. The Bible had been translated into English before, some of it probably translated by King Alfred himself. (That would be Alfred the Great. And he was. Great. At least, I think so (Hi, this is Rebekah).) Of course, this was before-before—before William and his Norman-French clerics and his Norman-French nobles and their beardless Norman culture.1 (I don’t actually have any beef with William the Conqueror. The dude was a beast, and honestly? England was kind of a mess when he showed up. But that’s neither here nor there. The point is that the Anglo-Saxons were having a grand old time running around translating the Bible and handing it out to everybody long before Henry VIII got all snippy and execution-y just because William Tyndale called him out on the fact that annulling his marriage to Catherine of Aragon wasn’t exactly copacetic vis-a-vis scripture.)

Old English glosses and translations of the Bible were mostly based on the Vulgate Latin Bible. Many of the translations were incomplete, but one translated passage tells a little story you may have heard before:


Soþlice2 on þam dagum wæs geworden gebod fram þam Casere Augusto
Truly3 in those days happened a command from that Caesar Augustus

þæt eall ymbehwyrft wære tomearcod.
that all the circle of the world was to be described.

Þeos tomearcodnes wæs æryst geworden fram þam deman Syrige Cirino
This census first happened by that governor of Syria Cirinus

and ealle hig eoden and syndrie ferdon on hyra ceastre.
and they all went and separately traveled into their city.

Ða ferde Iosep fram Galilea of þære ceastre Nazareth
Then traveled Joseph from Galilee out of that city Nazareth 

on Iudeisce ceastre Dauides seo is genemned Bethleem
into the Judean city of David which is named Bethlehem

forþam þe he wæs of Dauides huse and hirede.
because he was of David’s house and family.

He ferde mid Marian þe him beweddod wæs and wæs geeacnod.
He traveled with Mary who was married to him and was pregnant.


It’s Luke 2, the account of Christ’s birth, in the language of the Anglo-Saxons. A translation of a translation, from Ancient Greek to Latin to Old English. The language tells as much of a story as the words do. For example, they call the world a circle because that’s what they thought it was: a flat disk. In some ways, it’s impossible to separate our language from our culture, or our culture from our language. Our languages convey things that, like music or art, are sometimes a little bit untranslatable (which is how your friendly neighborhood linguists got into a discussion the other day about whether certain Disney songs are better in English or Swedish).

Do you have any Christmas or Hanukkah or Saturnalia (or whatever) stories you’d like to share with us? Any stories or songs that just don’t sound right if you try to translate them? We’d love to hear from you! Comment or send us an email or message in the language of your choice (even if you suspect we don’t speak it).

See you in January!


Old English ain’t Shakespeare (feat. Dinosaurs)

Yes, hello. Rebekah, 26, American. I can hardly contain myself, so let’s just get straight to it:

When I was a teenager, one of my favorite things was the part of the dictionary where it tells you the history of the word. “And Latin bos begat Old French boef, and Old French boef begat English beef.”1 (Okay, that’s not how they phrase it. Also, this area of study is called etymology.) Then, my senior year in high school, while I was applying to colleges, I learned you could actually major in that. Somehow, I had never heard of linguistics before.

Of course, there’s a whole lot more to linguistics than just where words come from. There’s how the words fit together to form sentences, and there’s the 7,000+ languages in the world and how they’re alike and how they’re not, and there’s all these crazy sounds our mouths can make to combine in a billion different ways and become human speech.

I was taking a class on the history of English when I had my eyes-meeting-across-a-crowded-room, have-we-met-before, do-you-think-this-is-destiny moment. I was doing the assigned reading on Old English, and it was all about Saxons and the Danelaw and Alfred the Great and scops, and something about it all reverberated in the marrow of my bones. It was like hearing a song I’d forgotten a long time ago. A thousand-odd years of history collapsed in on itself, and I could feel the blood of my Anglo-Saxon forebears humming through me. (Too much? Too much. Moving on.)

It was only when I went to share this indescribable feeling with everyone I met that I realized I had a problem. The conversation went like this:

Me: I love Old English! *heart eyes, preparing to gush*
Them: Oh, that’s cool. So you like Shakespeare?
Me: *wilting and dying inside*

Don’t get me wrong, I do love Shakespeare. But here’s a super cool linguistic fun fact: Shakespeare’s language, and the language of the King James Bible, and the language of all those other historic sources inspiring your friendly local Renaissance festival players, that’s a little something we linguists like to call “Early Modern English.”

The periods of English

Let’s talk about dinosaurs. Everybody loves dinosaurs, right? Between the chicken nuggets, the tee shirts, and movies like The Land Before Time and Jurassic Park, most people know the names of at least two or three, and they probably have a favorite. (Mine’s triceratops, if you’re wondering.)

Dinosaurs lived during the Mesozoic Era, a 186-million-year period of geological time further subdivided into the Triassic, Jurassic, and Cretaceous periods.2 I’m about to painfully rewrite your childhood, so sorry in advance. Littlefoot, lovable hero of The Land Before Time, was either a brontosaurus or an apatosaurus. These titanic, long-necked herbivores lived in the Late Jurassic. Cera, Littlefoot’s triceratops best friend, would have lived during the Late Cretaceous—some 77 million years later. As long-distance, time-traveling romances go, it’s arguably a little more problematic than The Lake House. Not least because dinosaurs didn’t have mailboxes.

I know what you’re thinking: “Great, Rebekah. That’s just great. Friendship over. Before I delete your number, what does this have to do with linguistics? Are you trying to tell me dinosaurs spoke English?”

As appealing as it is to imagine all our favorite dinosaurs living together as one big happy family, 186 million years is a long time for everything to stay the same. Likewise, as easy as it is to think that English is English, always has been and always will be, languages grow and evolve, too. (Sabina talked about this a little last week.) No matter how different they became, though, from the time they emerged in the Late Triassic until they disappeared at the end of the Cretaceous, dinosaurs were still dinosaurs. It’s kind of the same with languages.

A lot of the dinosaur species people are most familiar with—triceratops, hadrosaurs, velociraptors, and Tyrannosaurus rex, to name a few—lived during the last period, the Cretaceous (yep, Jurassic Park is a bit of a misnomer). This was the period of greatest dinosaur diversity. The latest period of English is called Modern English, and it’s the one you’re probably most familiar with. It started in roughly the late 1400s and runs up to the present. This, too, is a period of impressive diversity, with distinct varieties of English spoken around the world, from Australia to Canada, from India to England, and everywhere in between. As far as literature goes, a lot of the famous English-language works considered part of the Western canon were written during this time, including the works of William Shakespeare, Charles Dickens, Mark Twain, and many others. There are also contemporary works like those of Stephen King, Nicholas Sparks, and Dr. Seuss—all those books, magazines, and newspapers filling up your local library (if you happen to live in an English-speaking country).

Of course, no matter how awesome it would be to see a rap battle between Shakespeare and Dr. Seuss, even the casual reader will flag their writing as seeming like not quite the same language. As mentioned earlier, Modern English can be separated into Early and Late, with the divide being marked at about 1800. Period distinctions like this are the result of shifts in grammar, pronunciation, and word stock throughout the language, though the specific dates often coincide with historical events that had a widespread impact on culture. (Like the mass extinction events that separate the different periods of the Mesozoic Era. But somewhat less catastrophic.) In the case of Modern English, the starting point is often cited as 1476, the year William Caxton introduced the printing press to England. The ability to mass produce written materials would have a profound effect on literacy and the dissemination of linguistic features. In 1776, the American colonies declared independence from England. Some consider the American Revolution the start of Late Modern English and a period of globalization for the language, as over the following decades the British continued to spread their language, colonizing places like Australia, South Africa, New Zealand, and India.

As useful as dates like these can be for roughly marking linguistic time, languages unfortunately don’t work like that. The line between one stage of English and another isn’t as clear cut as turning over a page of your Gregorian calendar on January 1st and magically finding yourself in a new year. Linguistic shockwaves and subtle nudges take time to spread. A great example of this is Middle English.

On our timeline, Middle English is our Jurassic period. During the Jurassic, dinosaurs began to flourish. They hadn’t yet reached the height of diversity of the Cretaceous, but there are still some Jurassic species everybody recognizes, like the stegosaurus or aforementioned sauropods like the brontosaurus. There’s at least one big Middle English name you’ll recognize, too: Geoffrey Chaucer. If you’ve read just one work that predates the Modern English period, I’d bet good money it was some portion of Chaucer’s seminal Canterbury Tales. See? You knew there was English older than Shakespeare’s, even if you didn’t know you knew it. The Canterbury Tales begins:

WHAN that Aprille with his shoures soote
The droghte of Marche hath perced to the roote,
And bathed every veyne in swich licour,
Of which vertu engendred is the flour;3

It might look a little odd and incomprehensible, but with just a little elbow grease, most people can puzzle Chaucer out. (It helps to read it out loud.)

Chaucer died in 1400, and his language was that of the latter end of Middle English. Works from Early Middle English are rare, but one very important one is the Peterborough Chronicle, a historical record periodically updated with the important events of each year up to 1154. It only takes a little squinting to recognize Chaucer’s language as an earlier form of English, but the Peterborough Chronicle starts to look like it was written in a different language entirely. If Chaucer was writing in a kind of pre-Shakespeare, the Peterborough Chronicle was written in a kind of post-Anglo-Saxon, two ends of a transitionary continuum. Due to the nature of the Peterborough Chronicle itself, we can watch the language gradually change in the time between entries.

And so, we come at last to true Old English. The Triassic period, I guess? (Look, I can only push this metaphor so far.) The transition from Old to Middle English is traditionally marked by the Norman Conquest of England in 1066. William the Conqueror became William I, and he repopulated the court and the clergy with French-speaking Normans. The sovereignty of French men, French culture, and the French language had a profound effect on English, explaining the rather Romance sound of the language today. Strip that influence away, go back to England between AD 500 and AD 1000, and you’ll find the very Germanic origins of the language we call English. The most famous of all the surviving Old English works is the epic poem Beowulf. It begins like this:

Hwæt we Gar-Dene     in geardagum,
þeodcyninga     þrym gefrunon,
hu þa æþelingas     ellen fremedon.4

It reads something along the lines of:

Lo, we of the Spear-Danes in days of yore,
learned by inquiry of the kings of the people,
how those princes did valor.

This was the language of the Germanic tribes, the peoples who would become the Anglo-Saxons, who migrated to Britain and displaced the Celts. The Beowulf poem began as part of an oral tradition and was later written down. In style and content, it’s somewhat like the Norse Eddas, which perhaps isn’t surprising considering the Anglo-Saxons shared a Germanic heritage with the Vikings and continued to have contact with them after settling Britain (both friendly and not so friendly). Old English manuscripts show a people transitioning from paganism to Christianity, a warlike people with an awful lot of synonyms for “sword” and “kill,” but also a cultured people with a sophisticated poetic meter and a penchant for alliteration. Shakespeare was a long way down the road.

Back to the future

The story of English is far from over. It’s still being written all around us. As I said, language is in constant flux, and it can be hard to know when to say, “Hang on a second. I think we’ve stumbled into a new stage of English!” Linguists today are even starting to distinguish the most current English, the one we’re speaking right now (and tweeting at each other and scribbling down on post-it notes and dropping in beats in epic rap battles), with the appellation Present Day English, leaving Shakespeare and Dickens and all the rest a little farther in the past.

Don’t think this phenomenon is unique to English. Other languages have gone through some incredible changes, too. Old French boef eventually became French boeuf, and really, French is just grown-up Latin, just like Spanish, Italian, Portuguese, and all the other Romance languages. (Language families are a subject for another day.) And language, all language, is going to go right on changing as our cultures and our communication needs go right on changing. To paraphrase Jurassic Park, “Language finds a way.”

Next week with Lisa: As hard as it is to say when a language has entered a new stage of its evolution, one of the most complicated questions facing linguists is the problem of where to draw the distinction between a language and a dialect. What makes something a separate language rather than just a variety of another? When do we say a dialect has diverged enough from its parent language to be considered a language in its own right?


