Indo-European Language Family Has Roots in Anatolia

Mark · August 29, 2012, 08:57:47 AM

http://www.nytimes.com/2012/08/24/science/indo-european-languages-originated-in-anatolia-analysis-suggests.html?_r=1

Biologists using tools developed for drawing evolutionary family trees say that they have solved a longstanding problem in archaeology: the origin of the Indo-European family of languages.

The family includes English and most other European languages, as well as Persian, Hindi and many others. Despite the importance of the languages, specialists have long disagreed about their origin.

Linguists believe that the first speakers of the mother tongue, known as proto-Indo-European, were chariot-driving pastoralists who burst out of their homeland on the steppes above the Black Sea about 4,000 years ago and conquered Europe and Asia. A rival theory holds that, to the contrary, the first Indo-European speakers were peaceable farmers in Anatolia, now Turkey, about 9,000 years ago, who disseminated their language by the hoe, not the sword.

The new entrant to the debate is an evolutionary biologist, Quentin Atkinson of the University of Auckland in New Zealand. He and colleagues have taken the existing vocabulary and geographical range of 103 Indo-European languages and computationally walked them back in time and place to their statistically most likely origin.

The result, they announced in Thursday's issue of the journal Science, is that "we found decisive support for an Anatolian origin over a steppe origin." Both the timing and the root of the tree of Indo-European languages "fit with an agricultural expansion from Anatolia beginning 8,000 to 9,500 years ago," they report.

But despite its advanced statistical methods, their study may not convince everyone.

The researchers started with a menu of vocabulary items that are known to be resistant to linguistic change, like pronouns, parts of the body and family relations, and compared them with the inferred ancestral word in proto-Indo-European. Words that have a clear line of descent from the same ancestral word are known as cognates. Thus "mother," "mutter" (German), "mat' " (Russian), "madar" (Persian), "matka" (Polish) and "mater" (Latin) are all cognates derived from the proto-Indo-European word "mehter."

Dr. Atkinson and his colleagues then scored each set of words on the vocabulary menu for the 103 languages. In languages where the word was a cognate, the researchers assigned it a score of 1; in those where the cognate had been replaced with an unrelated word, it was scored 0. Each language could thus be represented by a string of 1's and 0's, and the researchers could compute the most likely family tree showing the relationships among the 103 languages.
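
The scoring step described above can be sketched in a few lines of Python. This is only an illustration, assuming one 0/1 column per cognate set. The set labels (`mother_1`, `milk_1`, `milk_2`) and the three-language sample are invented for the example, using the mother/Mutter/mère and milk/Milch/lait cases quoted later in this thread; none of it is the study's actual data or code.

```python
from typing import Dict, List, Tuple

# Toy data: for each meaning, the cognate set that each language's word belongs to.
# The labels are hypothetical names made up for this sketch.
COGNATE_SETS: Dict[str, Dict[str, str]] = {
    "mother": {"English": "mother_1", "German": "mother_1", "French": "mother_1"},
    "milk": {"English": "milk_1", "German": "milk_1", "French": "milk_2"},
}


def encode_languages(
    cognate_sets: Dict[str, Dict[str, str]]
) -> Tuple[List[str], Dict[str, str]]:
    """Turn cognate-set membership into one string of 1's and 0's per language."""
    # One column per cognate set, in a fixed (sorted) order.
    columns = sorted(
        {label for by_lang in cognate_sets.values() for label in by_lang.values()}
    )
    languages = sorted({lang for by_lang in cognate_sets.values() for lang in by_lang})
    encoded: Dict[str, str] = {}
    for lang in languages:
        # Cognate sets in which this language has a word.
        present = {
            by_lang[lang] for by_lang in cognate_sets.values() if lang in by_lang
        }
        encoded[lang] = "".join("1" if label in present else "0" for label in columns)
    return columns, encoded


def hamming(a: str, b: str) -> int:
    """Number of columns in which two encoded languages differ."""
    return sum(x != y for x, y in zip(a, b))


columns, encoded = encode_languages(COGNATE_SETS)
```

With this toy input, English and German get identical strings and French differs from both in the two "milk" columns, which is the kind of signal a tree-building method would then work from.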

A computer was then supplied with known dates of language splits. Romanian and other Romance languages, for instance, started to diverge from Latin after A.D. 270, when Roman troops pulled back from the Roman province of Dacia. Applying those dates to a few branches in its tree, the computer was able to estimate dates for all the rest.
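
As a rough picture of how one known date can fix others, here is a minimal sketch that assumes a constant rate of vocabulary change (a "strict clock"). That assumption is a simplification made for this example; the article does not say what dating model the study used. The distances are invented numbers, and the calibration age simply counts the years from A.D. 270 to the 2012 article.

```python
def split_age(distance: float, calib_distance: float, calib_age_years: float) -> float:
    """Estimate the age of a split from one calibrated split, assuming a constant rate.

    distance         -- lexical distance between the two branches to be dated
    calib_distance   -- lexical distance between two branches whose split date is known
    calib_age_years  -- age of that known split, in years
    """
    if calib_distance <= 0 or calib_age_years <= 0:
        raise ValueError("calibration distance and age must be positive")
    # Change accumulates along both branches, so distance = 2 * rate * age.
    rate = calib_distance / (2.0 * calib_age_years)
    return distance / (2.0 * rate)


# Hypothetical numbers: the calibrated pair differs by 0.10 and split about
# 1742 years ago (2012 - 270); another pair differs by 0.25.
ROMANCE_AGE_YEARS = 2012 - 270
estimated = split_age(0.25, calib_distance=0.10, calib_age_years=ROMANCE_AGE_YEARS)
```

Under these made-up inputs the second split comes out 2.5 times as old as the calibrated one. Any real analysis would need to allow rates to vary between branches.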

The computer was also given geographical information about the present range of each language and told to work out the likeliest pathways of distribution from an origin, given the probable family tree of descent. The calculation pointed to Anatolia, particularly a lozenge-shaped area in what is now southern Turkey, as the most plausible origin — a region that had also been proposed as the origin of Indo-European by the archaeologist Colin Renfrew, in 1987, because it was the source from which agriculture spread to Europe.

Dr. Atkinson's work has integrated a large amount of information with a computational method that has proved successful in evolutionary studies. But his results may not sway supporters of the rival theory, who believe the Indo-European languages were spread some 5,000 years later by warlike pastoralists who conquered Europe and India from the Black Sea steppe.

A key piece of their evidence is that proto-Indo-European had a vocabulary for chariots and wagons that included words for "wheel," "axle," "harness-pole" and "to go or convey in a vehicle." These words have numerous descendants in the Indo-European daughter languages. So Indo-European itself cannot have fragmented into those daughter languages, historical linguists argue, before the invention of chariots and wagons, the earliest known examples of which date to 3500 B.C. This would rule out any connection between Indo-European and the spread of agriculture from Anatolia, which occurred much earlier.

"I see the wheeled-vehicle evidence as a trump card over any evolutionary tree," said David Anthony, an archaeologist at Hartwick College who studies Indo-European origins.

Historical linguists see other evidence in that the first Indo-European speakers had words for "horse" and "bee," and lent many basic words to proto-Uralic, the mother tongue of Finnish and Hungarian. The best place to have found wild horses and bees and be close to speakers of proto-Uralic is the steppe region above the Black Sea and the Caspian. The Kurgan people who occupied this area from around 5000 to 3000 B.C. have long been candidates for the first Indo-European speakers.

In a recent book, "The Horse, the Wheel and Language," Dr. Anthony describes how the steppe people developed a mobile society and social system that enabled them to push out of their homeland in several directions and spread their language east, west and south.

Dr. Anthony said he found Dr. Atkinson's language tree of Indo-European implausible in several details. Tocharian, for instance, is a group of Indo-European languages spoken in northwest China. It is hard to see how Tocharians could have migrated there from southern Turkey, he said, whereas there is a well-known migration from the Kurgan region to the Altai Mountains of eastern Central Asia, which could be the precursor of the Tocharian-speakers who lived along the Silk Road.

Dr. Atkinson said that this was a "hand-wavy argument" and that such conjectures should be judged in a quantitative way.

Dr. Anthony, noting that neither he nor Dr. Atkinson is a linguist, said that cognates were only one ingredient for reconstructing language trees, and that grammar and sound changes should also be used. Dr. Atkinson's reconstruction is "a one-legged stool, so it's not surprising that the tree it produces contains language groupings that would not survive if you included morphology and sound changes," Dr. Anthony said.

Dr. Atkinson responded that he did indeed run his computer simulation on a grammar-based tree constructed by Don Ringe, an expert on Indo-European at the University of Pennsylvania, but that the resulting origin was, again, Anatolia, not the Pontic steppe.

Mark · August 29, 2012, 08:58:51 AM

http://www.chicagotribune.com/news/plus/chi-nsc-20120827--the-tree-of-knowledge-linguistic-archeology,0,3935656.story

The tree of knowledge: Linguistic archeology

Trees are a gift to students of the past. An entire discipline, known as dendrochronology, is devoted to using tree rings to date ancient wooden objects and buildings. Linguistic archaeologists, it seems, share these arboreal inclinations, though the trees they examine are of an altogether different species.

In 2003 a team led by Quentin Atkinson, of the University of Auckland, in New Zealand, employed a computer to generate a genealogical tree of Indo-European languages. Their model put the birth of the family, which includes languages as seemingly diverse as Icelandic and Iranian, between 9,800 and 7,800 years ago. This was consistent with the idea that it stemmed from Anatolia, in modern-day Turkey, whence it spread with the expansion of farming. A rival proposal, that they originated amid the semi-nomadic, pastoralist tribes in the steppes north of the Caspian Sea, supposes their progenitor to be several thousand years younger.

Some proponents of the steppe hypothesis remained unconvinced. They pointed out that the computer-generated phylogeny, to give the tree its technical name, showed only how Indo-European tongues evolved over time. It said nothing about how they spread across space. As Dr Atkinson and his colleagues report in Science, this issue has now been addressed. The results lend further credence to the Anatolian theory.

Linguistic archaeologists have even less to go on than their peers in other past-oriented disciplines, who can at least pore over the odd trinket for clues to mankind's prehistoric ways. The earliest written records date back less than 6,000 years, long after "proto-Indo-European" is believed to have emerged. Researchers do, however, enjoy an abundance of data about contemporary languages. Because tongues change less chaotically than other aspects of culture, this is more useful to someone studying linguistic prehistory than it might appear.

Dr Atkinson began by collecting basic vocabulary terms (words for body parts, kinship, simple verbs and the like) for 83 modern languages as well as 20 ancient ones for which records are available. For each family, Dr Atkinson and his team identified sets of cognates. These are etymologically related words that pop up in different languages. One set, for example, contains words like "mother", "Mutter" and "mère". Another includes "milk" and "Milch", but not "lait". (Known borrowings, such as "mountain" and "montagne", were excluded, as they do not stem from a common ancestor.) Then, for each language in their sample, they added information about where it is spoken (or is thought to have been, based on where ancient texts were discovered) and in what period. The result is a multidimensional Venn diagram that records the overlaps between languages.

Each of the 103 languages, with its cognate sets, temporal and geographical range, constituted one leaf of the Indo-European family tree. The tricky part was filling in the branches. Here, Dr Atkinson resorted to rolling the dice, using a method called Markov-chain Monte Carlo. This generates a random set of boughs (each assigned its own randomly generated cognate sets, time and place) that fits the known foliage. Next, an algorithm calculates how likely it is that this tree would sprout the modern leaves given the way languages evolve and travel. For instance, it is assumed that a cognate can only be gained once, by an ancestral language, but lost many times, whenever it disappears from any of the descendants. And languages, or at least their speakers, might migrate in any direction, though less readily across water or mountain ranges, say, than through plains and valleys.
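
The "gained once, lost many times" rule can be made concrete with a small sketch. Assuming the single gain sits at the most recent common ancestor of every language that still has the cognate, the code below counts how many separate losses a given tree then needs. The four-language tree (a Germanic pair and a Romance pair) and the function names are chosen for illustration only; this is not the study's model or data.

```python
from typing import Set, Tuple, Union

# A tree is either a language name (leaf) or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]

# Hypothetical toy tree, used only for this example.
TOY_TREE: Tree = (("English", "German"), ("French", "Romanian"))


def leaves(tree: Tree) -> Set[str]:
    """All language names under a node."""
    if isinstance(tree, str):
        return {tree}
    result: Set[str] = set()
    for child in tree:
        result |= leaves(child)
    return result


def losses_needed(tree: Tree, present: Set[str]) -> int:
    """Count losses if the cognate was gained exactly once, at the common ancestor."""
    if not present:
        raise ValueError("at least one language must have the cognate")
    # Walk down to the smallest subtree that contains every language with the cognate.
    node = tree
    while not isinstance(node, str):
        holders = [child for child in node if leaves(child) & present]
        if len(holders) != 1:
            break
        node = holders[0]

    def count(n: Tree) -> int:
        # A subtree with no holders at all is explained by a single loss.
        if not (leaves(n) & present):
            return 1
        if isinstance(n, str):
            return 0
        return sum(count(child) for child in n)

    return count(node)
```

For instance, a cognate found only in English and German needs no losses on this tree, while one found in English and French would need two (one in German, one in Romanian).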

The first rolls of the dice are unlikely to offer a good fit. They might, for example, have Icelandic and Iranian as siblings, as opposed to distant cousins. So the algorithm tweaks the tree, again at random, and decides whether the new branches are any better. If so, they are kept; if not, the algorithm reverts to the previous tree in the series. Repeat this process long enough, typically millions of times, and a point is reached where no further improvement is possible. Let a forest of such equally likely trees grow, then look at the number of those with roots in Anatolia and the steppes. The proportions reflect the relative likelihood that either of the hypotheses is correct.
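
The propose, compare and count loop described above can be illustrated with a toy sampler. Two caveats: the states here are just two labels rather than whole trees, and the acceptance step is the standard Metropolis rule, which always keeps a better state but also occasionally keeps a worse one, so it is slightly different from the "keep if better, otherwise revert" summary in the article. The weights are invented for the example and say nothing about the real evidence.

```python
import math
import random
from typing import Dict

# Hypothetical log-weights: "anatolia" is made 100 times more probable than
# "steppe" purely for illustration.
LOG_WEIGHT: Dict[str, float] = {"anatolia": math.log(100.0), "steppe": math.log(1.0)}


def sample_roots(log_weight: Dict[str, float], n_steps: int, seed: int = 0) -> Dict[str, int]:
    """Run a toy Metropolis chain and count how often each state is visited."""
    rng = random.Random(seed)
    states = list(log_weight)
    current = rng.choice(states)
    counts = {state: 0 for state in states}
    for _ in range(n_steps):
        # Propose a state uniformly at random (a symmetric proposal).
        proposal = rng.choice(states)
        log_ratio = log_weight[proposal] - log_weight[current]
        # Always accept a better state; accept a worse one with probability
        # equal to the ratio of the two weights.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        counts[current] += 1
    return counts


counts = sample_roots(LOG_WEIGHT, n_steps=20000, seed=1)
share_anatolia = counts["anatolia"] / sum(counts.values())
```

After enough steps the share of visits to each state approaches its share of the total weight, which is the sense in which counting trees by their root gives the relative support for each hypothesis.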

Dr Atkinson's findings leave much less room for doubt. The Anatolia-rooted trees are orders of magnitude more numerous than those growing out of the steppes. The researchers verified the method's validity by getting it to retrace the evolution of modern Romance languages from their Roman roots. The model returned an accurate reconstruction, closely in keeping with historical records. In linguistics, then, cultivating trees pays. So does a bit of gambling.

Duncan Head · August 29, 2012, 10:37:15 AM

The BBC version is at http://www.bbc.co.uk/news/science-environment-19368988 and the Science paper, if anyone has access, at http://www.sciencemag.org/content/337/6097/957.abstract?sid=192102e8-a5bc-4744-ac5a-5500338ab381.

I've never been convinced by the Renfrewist Anatolian origin previously, but this gives food for thought.

I don't see the force of the Tocharian objection. If IE originated in 9000 BC in Anatolia, but had spread to the Pontic steppe by 5000 BC or so - rather than originating there - then the Tocharians still have plenty of time for the migration that Anthony requires.

aligern · August 29, 2012, 10:16:29 PM

It seems odd that the archaeologists would see the Tokharians as a problem, because 5000 years is plenty of time for them to get to Southern Russia from Anatolia, to be in place for a migration to North China.
Are there Indo-European languages that get their chariot and wheel terminology from another root? It might well be that the IEs that developed the chariot spread it to all the other IE units rather than the language as a whole spreading at that time.
My sceptical side would worry that there is a political agenda here. Indo-Europeans were represented as a chariot-riding, conquering aristocracy. Restructuring them as peaceful farmers would be a considerable change to that image.

Roy

tadamson · August 30, 2012, 09:38:37 AM

It's a very interesting article. The stats and methodology seem sound, though many of the fundamentals are 'best guess' as there is no evidence to guide the researchers.

One side issue that might be of interest to our membership is that chariots and the associated linguistic terms pre-date the Indo-Europeans; this study requires that chariot-using steppe nomad groups must have existed much earlier than the archaeological records can currently show.

Tom..

Duncan Head · August 30, 2012, 10:23:26 AM

Quote from: tadamson on August 30, 2012, 09:38:37 AM
One side issue that might be of interest to our membership is that chariots and the associated linguistic terms pre-date the Indo-Europeans; this study requires that chariot-using steppe nomad groups must have existed much earlier than the archaeological records can currently show.

I don't follow. This study puts Indo-Europeans back to 9000 BC: chariots aren't quite that old, surely?

tadamson · August 30, 2012, 01:37:27 PM

Quote from: Duncan Head on August 30, 2012, 10:23:26 AM
Quote from: tadamson on August 30, 2012, 09:38:37 AM
One side issue that might be of interest to our membership is that chariots and the associated linguistic terms pre-date the Indo-Europeans; this study requires that chariot-using steppe nomad groups must have existed much earlier than the archaeological records can currently show.
I don't follow. This study puts Indo-Europeans back to 9000 BC: chariots aren't quite that old, surely?

True, archaeology isn't nearly that far back BUT amongst the core words in the study are those for wheel, chariot, axle etc giving the premise that they were adopted before the Indo-Europeans break up and head out from Anatolia.
An alternative is that the chariot riders made such a massive impact that their terminology and technology were adopted wholesale by ALL the Indo-European groups. This has close parallels with Chinese adoption of the technology/terminology but rather dents the validity of the exercise itself.
- sometimes this stuff gets too cyclic and just gives me headaches :-(

Andreas Johansson · August 30, 2012, 02:00:03 PM

Quote from: tadamson on August 30, 2012, 01:37:27 PM
True, archaeology isn't nearly that far back BUT amongst the core words in the study are those for wheel, chariot, axle etc giving the premise that they were adopted before the Indo-Europeans break up and head out from Anatolia.
An alternative is that the chariot riders made such a massive impact that their terminology and technology were adopted wholesale by ALL the Indo-European groups. This has close parallels with Chinese adoption of the technology/terminology but rather dents the validity of the exercise itself.

I can't imagine that alternative working. The charioteering words show the expected sound-changes within the various branches (unlike the many inter-IE loans that have been (and still are being) exchanged between the branches after they split), so either you'd have to assume the various branches remained so extremely similar that loans can't be distinguished from cognates for thousands and thousands of years, only to suddenly start rapidly diverging after the chariot revolution, or the charioteers were consummate linguists who retrofitted the words phonologically just to mess with our heads.

ETA: In other words, IE isn't 11,000 years old, or chariots are. (Or at least wheeled vehicles are.)
