Indo-European Origins in Southeast Europe
by Dienekes Pontikos

Last Update: 2 May 2008
Igor M. D’iakonov

One of the many rival theories of Indo-European origins proposes that the homeland of the speakers of the Proto-Indo-European language is to be found in the Balkan peninsula (Southeast Europe). This theory was set out most comprehensively by the eminent Russian linguist and historian Igor M. D’iakonov in his seminal paper [(1985). “On the Original Home of the Speakers of Indo-European.” Journal of Indo-European Studies. Volume 13, p. 92].

D’iakonov argues quite convincingly against the two main rival theories: that of feminist Lithuanian-born author Marija Gimbutas [(1973). “The Beginning of the Bronze Age in Europe and the Indo-Europeans: 3500-2500 B.C.” Journal of Indo-European Studies, Volume 1, p. 163], who believed that the Indo-Europeans originated in the Russian steppes, and that of the linguists T. V. Gamkrelidze and V. V. Ivanov, who proposed an origin in the vicinity of the Armenian plateau [(1985). “The Migrations of Tribes Speaking Indo-European Dialects from their Original Homeland in the Near East to their Historical Habitations in Eurasia.” Journal of Indo-European Studies, Volume 13, p. 49].

D’iakonov makes an extensive survey of the linguistic and archaeological evidence and determines that the Proto-Indo-Europeans had a mixed economy based on farming and animal husbandry. He criticizes Gimbutas' theory, which rests on little archaeological evidence and on the completely arbitrary assumption that prehistoric populations used the horse as a military weapon. He is also critical of the Gamkrelidze/Ivanov work, both on linguistic grounds and because they postulate improbable migration routes to account for the historically attested IE languages.

D’iakonov demonstrates that the Balkan-Carpathian region has all the features known for Proto-Indo-European culture. Additionally, in a tour de force he demonstrates that the settlements of all known Indo-European languages can be accommodated easily if such a homeland is accepted, without postulating any long-range population movements except in the case of the Indo-Iranians, to whom IE languages came later.


Fig. 1: I.M. D’iakonov's Theory of Indo-European Origins [(1985). “On the Original Home of the Speakers of Indo-European.” Journal of Indo-European Studies. Volume 13, p. 92]

D’iakonov [“The Paths of History,” Cambridge University Press, 1999] explained that the Indo-Europeans managed to expand because of their comparative advantage over the more primitive societies that surrounded them:

However, I would like to note at once -against the opinions of Maria Gimbutas and other authorities of the nineteenth and twentieth centuries, but in accordance with the later findings of C. Renfrew and J.P. Mallory- that the most ancient Indo-Europeans living in the fifth to third millennia BC, i.e. long before the Iron Age, although already acquainted with horse-drawn chariots, never were nomads. Their movement across Eurasia (presumably via the Balkans) was not a military invasion, but a slow spread, caused by a fall in the child mortality rate and, consequently, by an increase in population growth. The reason was that the population speaking the Indo-European proto-language changed to a diet of milk and meat, and had a sufficiently developed agriculture (growing barley, wheat, grapes and vegetables). The surrounding population which lived in the Early Primitive Phase, and thus was by far not so numerous (the population numbers after the change from Primitive to Primitive Communal Phase tend to multiply by two orders of magnitude), adopted the agricultural achievements of the Indo-Europeans, and at the same time also adopted their language; thus the further movements involved not only the original Indo-Europeans but also tribes who had adopted the language and the mores, the latter including the Primitive Communal stage customs which the Indo-Europeans had evolved.

Colin Renfrew

One of the most respected archaeologists of our time, Colin Renfrew [“Archaeology and Language: The Puzzle of Indo-European Origins.” ISBN: 0521386756], has argued convincingly that Indo-European languages were spread by farmers who, in search of new land, gradually expanded outwards from the Fertile Crescent. He arrived at this conclusion by noting that almost all major language families were spread with farmers: didn't the farmers who colonized Europe also bring their language with them? Farmers, gradually expanding in small groups from the Fertile Crescent, and in the case of Indo-European languages from Anatolia, would profoundly alter the linguistic landscape of the lands they settled and cultivated. Renfrew's closely argued case is valuable both for providing a reasonable mechanism for the spread of the Indo-European languages and for his thorough analysis of why other theories are wrong, or at least are supported by far flimsier evidence than their proponents suppose.

Lord Renfrew has recently modified his previous scheme slightly. He now thinks that Proto-Indo-European unity is to be found in the Balkans, in agreement with the opinion of D’iakonov. Proto-Indo-European was, however, an offshoot of Pre-Proto-Indo-European, the language of the early farmers who crossed the Aegean from Anatolia to settle in Thessaly. There, and in the course of their subsequent northern expansion, the Proto-Indo-European community was formed which gave birth to all the historical Indo-European languages, while those of Anatolia (Hittite, Luwian and Palaic) are actually an offshoot of the Pre-Proto-Indo-European group that stayed behind.

According to Renfrew [“The Tarim basin, Tocharian, and Indo-European origins: a view from the west,” in V.Mair (ed.), The Bronze Age & Early Iron Age Peoples of Eastern Central Asia (Journal of Indo-European Studies Monograph #26, vol.1)]:

In harmony with the view of Dolgopolsky, and of Gamkrelidze and Ivanov, and following Sturtevant (1962), I suggest that the basic division in the early Indo-European languages is between the Anatolian languages on one hand and all the other members of the Indo-European family in the other. Such a view arises directly from the “farming dispersal” hypothesis, since farming came to Europe from Anatolia. It is suggested that all the other branches of the Indo-European languages (except possibly Armenian) were derived from the western branch of the divide (ancestral to the Indo-European languages of Europe, including those of the steppes, and thus also of the Iranian plateau, central Asia, and south Asia) [...] The secondary center, as Diakonoff realized, is the Balkans (around 5000 BCE), and from there one must envisage a division with the bulk of the early Proto-Indo-European languages of central and Western Europe (the languages of “Old Europe” in some terminologies, although emphatically not that of Gimbutas) on the one hand, and those of the steppe lands to the north of the Black Sea on the other (4th millennium BCE).

To illustrate the scheme of Colin Renfrew, I reproduce his tree of relationships of Indo-European languages:


Fig. 2: Colin Renfrew's tree of Indo-European origins [“The Tarim basin, Tocharian, and Indo-European origins: a view from the west,” in V.Mair (ed.), The Bronze Age & Early Iron Age Peoples of Eastern Central Asia (Journal of Indo-European Studies Monograph #26, vol.1)]

Kalevi Wiik

More recently, Finnish scholar Kalevi Wiik has also proposed Indo-European origins in Southeast Europe. He has expounded his theory on the origins of European peoples in several journal articles and, more recently, in his book “Eurooppalaisten juuret”, which will be translated into English in the near future. There is also an article written by him on the Web [Europe's Oldest Language] from which the following figures are reproduced.

Wiik uses linguistic, genetic, archaeological and anthropological data to support his theory. He believes that from 23,000 to 8,000 BC Europe was divided into three main regions. Regions Ba and U were inhabited by hunters of large animals, which were abundant during that period; they spoke languages related respectively to modern Basque and Finno-Ugric. Region X was inhabited by hunters of smaller animals and was fragmented into many smaller unknown languages that do not survive in modern times.



Map 1: European language distribution at the climax of the Ice Age and the following period, 23,000 to 8,000 BC (Ba = Basque, U = Uralic, X's = unknown languages)

By 5,500 BC the situation had changed dramatically. The extinction of many large species of animals meant that the economic success of the inhabitants of regions Ba and U declined, and they were now reduced to hunting small game. On the other hand, the inhabitants of region X had adopted the Neolithic way of life of mixed farming and animal husbandry and were becoming economically more successful and growing in numbers. It is here, Wiik argues, among the early farmers diffusing from Greece and the Balkans, that Indo-European was born, serving as a lingua franca for the inhabitants of the former region X, displacing their older languages and gradually converting linguistically the less successful hunters of regions Ba and U.



Map 2: By 5,500 BC speakers of the small languages of central and southern Europe have adopted animal husbandry and the Indo-European language (Ba = Basque, IE = Indo-European, U = Uralic)

After 5,500 BC this process continued. The languages of the Balkans each assumed a character of their own, because they had absorbed earlier elements from the many small languages of region X, which persisted for some time. At the periphery of the Indo-European language expansion, the Germanic, Baltic, Slavic, Celtic and Iberian languages were formed; these were Indo-European flavored with many elements from the languages of the hunters: Basque and Finno-Ugric.


Map 3: European language distribution, 5,500-3,000 BC: the Indo-European languages have begun to spread among the hunter-fisher-gatherers of northern Europe (B = Baltic, C = Celtic, FU = Finno-Ugrian, G = Germanic, I = Iberian, IE = Indo-European, S = Slavic)

Eventually, most of Europe was Indo-Europeanized as the Basque and Finno-Ugric speaking hunters adopted IE languages. Only on the periphery of the European continent, in the Iberian peninsula and in Northeast Europe, were there strong nuclei of hunters who apparently adopted farming without being linguistically converted. Thus, modern Basque and Finnish speakers are mostly descendants of these early hunters of the Ice Age. Everywhere else, the Indo-European languages which originated in Southeast Europe have won the upper hand.


Map 4: European language distribution, present day (Ba = Basque, C = Celtic, FU = Finno-Ugrian, G = Germanic, R = Romance, S = Slavic)

In a more recent English-language article [Kalevi Wiik (2008) “Where Did European Men Come From” Journal of Genetic Genealogy 34(1)], Wiik surveyed Y chromosome variation in Europeans and, in accordance with his earlier position, stated (p. 82) that “The men of the Balkan refuge were more likely than those of any other to have spoken an early form of the Indo-European language.”

Gray and Atkinson

The theory of Indo-European origins in Southeast Europe from an earlier Anatolian source has received additional confirmation recently. Using a methodology similar to that used in evolutionary biology, Gray and Atkinson [“Language-tree divergence times support the Anatolian theory of Indo-European origin,” Nature 426, 435-439] compared 87 present and past languages of the Indo-European family based on a list of 200 basic terms for each.

The main idea of this innovative work is that languages that diverge from a common source initially tend to have similar vocabularies, but as time progresses, new terms replace older ones, and thus the intersection between the vocabularies of the languages is reduced. This principle can be used to determine the “branching pattern” of the language family, as well as to time the various splits in the tree. The authors were able to vary many parameters of the input automatically, thus taking into account the many uncertainties of this difficult problem in a systematic manner.
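The principle described above can be sketched in a few lines of Python. This is only an illustration of the underlying idea, using the classical glottochronology formula rather than the Bayesian tree-building method that Gray and Atkinson actually applied. The word lists, the cognate-class labels and the retention rate of 0.8 per millennium are all made-up assumptions chosen to keep the arithmetic transparent.

```python
import math


def shared_fraction(lang_a, lang_b):
    """Fraction of meanings for which two languages use cognate words.

    Each language is a dict mapping a meaning to a cognate-class label;
    two words are cognate when their labels are equal.
    """
    meanings = lang_a.keys() & lang_b.keys()
    if not meanings:
        raise ValueError("the two word lists share no meanings")
    same = sum(1 for m in meanings if lang_a[m] == lang_b[m])
    return same / len(meanings)


def divergence_time(shared, retention=0.8):
    """Classical glottochronology: t = ln(c) / (2 ln(r)), in millennia.

    `shared` is the fraction of cognates still in common (c) and
    `retention` is the assumed fraction of basic vocabulary that a single
    language keeps per millennium (r). The default is illustrative only.
    """
    if not 0.0 < shared <= 1.0:
        raise ValueError("shared fraction must lie in (0, 1]")
    if not 0.0 < retention < 1.0:
        raise ValueError("retention rate must lie in (0, 1)")
    return math.log(shared) / (2.0 * math.log(retention))


# Hypothetical five-item word lists (labels are arbitrary cognate classes).
lang_a = {"water": 1, "fire": 1, "hand": 1, "two": 1, "sun": 1}
lang_b = {"water": 1, "fire": 2, "hand": 1, "two": 1, "sun": 1}

c = shared_fraction(lang_a, lang_b)
print(f"shared cognates: {c:.2f}")
print(f"estimated split: {divergence_time(c):.2f} millennia ago")
```

The factor of 2 in the denominator reflects that both daughter languages replace words independently after the split. A single fixed retention rate is exactly the kind of rigid assumption that the authors' approach relaxes by varying its inputs systematically.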

The results of all analyses, irrespective of the initial assumptions, were very robust:

We test two theories of Indo-European origin: the 'Kurgan expansion' and the 'Anatolian farming' hypotheses. The Kurgan theory centres on possible archaeological evidence for an expansion into Europe and the Near East by Kurgan horsemen beginning in the sixth millennium BP. In contrast, the Anatolian theory claims that Indo-European languages expanded with the spread of agriculture from Anatolia around 8,000–9,500 years BP. In striking agreement with the Anatolian hypothesis, our analysis of a matrix of 87 languages with 2,449 lexical items produced an estimated age range for the initial Indo-European divergence of between 7,800 and 9,800 years BP. These results were robust to changes in coding procedures, calibration points, rooting of the trees and priors in the bayesian analysis.

The branching pattern is also in agreement with an independent linguistic analysis of Indo-European languages [Rexova, K., Frynta, D. & Zrzavy, J. “Cladistic analysis of languages: Indo-European classification based on lexicostatistical data.” Cladistics 19, 120–127 (2003)].

The estimated times strikingly confirm the Neolithic dispersal theory, showing a divergence of Indo-European languages from Anatolian ones, with an independent branching of the mysterious Tocharian language which spread eastwards, and the descent of all other languages from what is almost certain to be a Balkan homeland:


Consensus tree and divergence-time estimates. a, Majority-rule consensus tree based on the MCMC sample of 1,000 trees; b, initial assumption set using all cognate information and most stringent constraints; c, conservative cognate coding with doubtful cognates excluded; d, all cognate sets with minimum topological constraints; e, missing data coding with minimum topological constraints and all cognate sets. Shaded bars represent the implied age ranges under the two competing theories of Indo-European origin: blue, Kurgan hypothesis; green, Anatolian farming hypothesis. The relationship between the main language groups in the consensus tree for each analysis is also shown, along with posterior probability values.

Cruciani et al. (2007)

While caution should be exercised when attributing the genetic features of modern populations to prehistoric events, it is nonetheless worthwhile to try to locate a signature of expansion into the interior of Europe from the Balkans consistent with the dates given by D’iakonov (5th to 3rd millennium BC). In a recent article [Cruciani et al. (2007) “Tracing past human male movements in northern/eastern Africa and western Eurasia: new clues from Y-chromosomal haplogroups E-M78 and J-M12.” Molecular Biology and Evolution 24(6): 1300-1311], such a signature was found for two Y chromosome haplogroups, E-V13 and J-M12. The authors determined that the frequencies of these two haplogroups are correlated, i.e., a high frequency of one statistically predicts a high frequency of the other, suggesting a common history:
In order to evaluate whether the present distribution of these two haplogroups can be the consequence of the same expansion/dispersal microevolutionary event, we first compared the two frequency distributions in Europe (J-M12 frequencies obtained from both published and new data; supplementary table 2). We observed a high and statistically significant correspondence between the frequencies of the two haplogroups (r=0.84, 95% C.I. 0.70-0.92).
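For readers unfamiliar with the statistic quoted above, the following Python sketch shows how a correlation coefficient and an approximate 95% confidence interval are typically obtained. It uses the standard Fisher z-transformation, which may or may not be the procedure the authors followed. The frequency values and the sample size of 30 populations are invented for illustration; the article does not state how many populations were compared.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for r via the Fisher z-transformation.

    z = atanh(r) is roughly normal with standard error 1 / sqrt(n - 3);
    z_crit = 1.96 corresponds to a 95% interval.
    """
    if n <= 3:
        raise ValueError("need more than 3 paired observations")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Made-up haplogroup frequencies in five hypothetical populations.
freq_e_v13 = [0.02, 0.05, 0.10, 0.20, 0.08]
freq_j_m12 = [0.01, 0.02, 0.04, 0.09, 0.03]
print(f"toy r = {pearson_r(freq_e_v13, freq_j_m12):.2f}")

# Interval around the published r = 0.84 for an ASSUMED n of 30 populations.
low, high = fisher_ci(0.84, 30)
print(f"95% C.I. for r = 0.84, n = 30: {low:.2f} to {high:.2f}")
```

The interval is asymmetric around r because the transformation stretches values close to 1, which is also why the published interval (0.70-0.92) extends further below 0.84 than above it.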
The authors further studied the features of Y chromosomes within these two haplogroups in Europe, detecting a star-like pattern.

We then constructed a microsatellite network of 43 European J-M12 chromosomes (supplementary table 3) and found a clear star-like structure (fig. 4C), a further feature shared with E-V13.

Such a pattern emerges when individual Y chromosomes diverge over time from a single founder. This means that, by and large, E-V13 and J-M12 in Europe can each be traced to one individual man of the past. The next step would be to determine when the expansion of their descendants took place:

By taking into consideration two different demographic expansion models (see methods), we obtained TMRCA estimates very close to those of E-V13, i.e. 4.1 ky (95% C.I. 2.8-5.4 ky) and 4.7 ky (95% C.I. 3.3-6.4 ky), respectively. Thus, the congruence between frequency distributions, shape of the networks, pair-wise haplotypic differences and coalescent estimates point to a single evolutionary event at the basis of the distribution of haplogroups E-V13 and J-M12 within Europe, a finding never appreciated before [...] Our estimated coalescence age of about 4.5 ky for haplogroups E-V13 and J-M12 in Europe (and their C.I.s) would also exclude a demographic expansion associated with the introduction of agriculture from Anatolia and would place this event at the beginning of the Balkan Bronze Age, a period that saw strong demographic changes as clearly testified from archeological records (Childe, 1957; Piggott, 1965; Kristiansen, 1998). The arrangement of E-V13 (fig. 2D) and J-M12 (not shown) frequency surfaces appears to fit the expectations for a range expansion in an already populated territory (Klopfstein, Currat and Excoffier 2006).

Thus, Y chromosome haplogroups E-V13 and J-M12 in Europe seem to have expanded from the south Balkans at the beginning of the Balkan Bronze Age. Moreover, similarly to the results reported by Pericic et al. (2005) for E-M78 network α, the dispersion of E-V13 and J-M12 haplogroups seems to have mainly followed the river waterways connecting the southern Balkans to north-central Europe, a route that had already hastened by a factor 4-6 the spread of the Neolithic to the rest of the continent (Tringham, 2000; Davison et al. 2006).
The authors do not attempt to relate this expansion to a linguistic spread, but its timing is consistent with the model of the Indo-Europeanization of Europe from a proximate Balkan source in the Early Bronze Age.
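
To give a feel for how a star-like microsatellite network can be turned into a date, here is a minimal Python sketch of one simple textbook estimator: the average squared distance (ASD) of each chromosome from the presumed founder haplotype, divided by the mutation rate. This is not the coalescent procedure Cruciani et al. used. The haplotypes, the mutation rate of 0.002 per locus per generation and the generation time of 25 years are all hypothetical values chosen for illustration.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common allele at each locus, taken as the presumed founder."""
    n_loci = len(haplotypes[0])
    return tuple(
        Counter(h[i] for h in haplotypes).most_common(1)[0][0]
        for i in range(n_loci)
    )


def asd_to_founder(haplotypes, founder):
    """Average squared repeat-count difference from the founder haplotype,
    averaged over all chromosomes and all loci."""
    total = sum(
        (allele - f) ** 2
        for h in haplotypes
        for allele, f in zip(h, founder)
    )
    return total / (len(haplotypes) * len(founder))


def tmrca_years(haplotypes, mutation_rate, generation_years):
    """Under a single-step mutation model the expected ASD grows as
    mutation_rate * generations, so generations = ASD / mutation_rate."""
    founder = modal_haplotype(haplotypes)
    generations = asd_to_founder(haplotypes, founder) / mutation_rate
    return generations * generation_years


# Hypothetical sample: repeat counts at three microsatellite loci.
# Most chromosomes match the centre of the "star"; a few are one step away.
sample = [
    (13, 24, 10),
    (13, 24, 10),
    (13, 24, 10),
    (14, 24, 10),
    (13, 23, 10),
    (13, 24, 11),
]

# ASSUMED parameters, for illustration only.
age = tmrca_years(sample, mutation_rate=0.002, generation_years=25)
print(f"founder haplotype: {modal_haplotype(sample)}")
print(f"estimated age: {age:.0f} years")
```

The estimate scales inversely with the assumed mutation rate, so the choice of rate matters as much as the data; this is one reason published dates of this kind come with wide confidence intervals.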