Thursday, February 26, 2009

'Oldest English words' identified

Some of the oldest words in English have been identified, scientists
say.

Reading University researchers claim "I", "we", "two" and "three" are
among the most ancient, dating back tens of thousands of years.

Their computer model analyses the rate of change of words in English
and the languages that share a common heritage.

The team says it can predict which words are likely to become extinct
- citing "squeeze", "guts", "stick" and "bad" as probable first
casualties.

"We use a computer to fit a range of models that tell us how rapidly
these words evolve," said Mark Pagel, an evolutionary biologist at the
University of Reading.

"We fit a wide range, so there's a lot of computation involved; and
that range then brackets what the true answer is and we can estimate
the rates at which these things are replaced through time."

Sound and concept

Across the Indo-European languages - which include most of the
languages spoken from Europe to the Asian subcontinent - the vocal
sound made to express a given concept can be similar.

New words for a concept can arise in a given language, utilising
different sounds, in turn giving a clue to a word's relative age in
the language.

At the root of the Reading University effort is a lexicon of 200 words
that is not specific to culture or technology, and is therefore likely
to represent concepts that have not changed across nations or
millennia.

"We have lists of words that linguists have produced for us that tell
us if two words in related languages actually derive from a common
ancestral word," said Professor Pagel.

"We have descriptions of the ways we think words change and their
ability to change into other words, and those descriptions can be
turned into a mathematical language," he added.

The researchers used the university's IBM supercomputer to track the
known relations between words, in order to develop estimates of how
long ago a given ancestral word diverged in two different languages.

They have integrated that into an algorithm that will produce a list
of words relevant to a given date.

"You type in a date in the past or in the future and it will give you
a list of words that would have changed going back in time or will
change going into the future," Professor Pagel told BBC News.

"From that list you can derive a phrasebook of words you could use if
you tried to show up and talk to, for example, William the Conqueror."

That is, the model provides a list of words that are unlikely to have
changed from their common ancestral root by the time of William the
Conqueror.

Words that have not diverged since then would sound similar to their
modern descendants, so their meanings would probably be recognisable
on sound alone.
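
The idea of a date-driven word list can be illustrated with a minimal
sketch. It is not the Reading team's model: it assumes each word is
replaced at a constant rate, so the chance that a word survives
unchanged for t years is exp(-rate * t). The rates, the 0.5 threshold
and the function names are hypothetical placeholders, not figures from
the study.

```python
import math

# Hypothetical replacement rates, in expected replacements per 1,000 years.
# These values are illustrative placeholders, NOT results from the study.
HYPOTHETICAL_RATES = {
    "two": 0.02,
    "I": 0.03,
    "dirty": 0.9,
    "stick": 0.8,
}


def prob_unchanged(rate_per_millennium, years):
    """Probability of no replacement over `years`, assuming replacements
    arrive at a constant rate (a simple exponential survival model)."""
    return math.exp(-rate_per_millennium * years / 1000.0)


def phrasebook(rates, years, threshold=0.5):
    """Return the words whose survival probability is at least `threshold`."""
    return sorted(
        word for word, rate in rates.items()
        if prob_unchanged(rate, years) >= threshold
    )


# Roughly the gap between 2009 and the 11th century.
print(phrasebook(HYPOTHETICAL_RATES, 950))
```

With these made-up rates, the slowly changing words pass the threshold
and the fast-changing ones drop out, which is the behaviour the
researchers describe for their own, far more detailed, model.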

However, the model cannot offer a guess as to what the ancestral words
were. It can only estimate the likelihood that the sound from a modern
English word might make some sense if called out during the Battle of
Hastings.

Dirty business

What the researchers found was that the frequency with which a word is
used relates to how slowly it changes through time, so that the most
common words tend to be the oldest ones.

For example, the words "I" and "who" are among the oldest, along with
the words "two", "three", and "five". The word "one" is only slightly
younger.

The word "four" experienced a linguistic evolutionary leap that makes
it significantly younger in English and different from other Indo-
European languages.

Meanwhile, the fastest-changing words are projected to die out and be
replaced by other words much sooner.

For example, "dirty" is a rapidly changing word; currently there are
46 different ways of saying it in the Indo-European languages, all
words that are unrelated to each other. As a result, it is likely to
die out soon in English, along with "stick" and "guts".

Verbs also tend to change quite quickly, so "push", "turn", "wipe" and
"stab" appear to be heading for the lexicographer's chopping block.

Again, the model cannot predict what words may change to; those
linguistic changes are, according to Professor Pagel, "anybody's
guess".

High fidelity

"We think some of these words are as ancient as 40,000 years old. The
sound used to make those words would have been used by all speakers of
the Indo-European languages throughout history," Professor Pagel said.

"Here's a sound that has been connected to a meaning - and it's a
mostly arbitrary connection - yet that sound has persisted for those
tens of thousands of years."

The work casts an interesting light on the connection between concepts
and language in the human brain, and provides an insight into the
evolution of a dynamic set of words.

"If you've ever played 'Chinese whispers', what comes out the end is
usually gibberish, and more or less when we speak to each other we're
playing this massive game of Chinese whispers. Yet our language can
somehow retain its fidelity."

