
The Indo-European homeland – ancient DNA (part 2)

[Edit 29 October 2021: Due to accessibility requirements I’ve had to remove the colours; sorry about that. – Thomas]

Mikkel Nørtoft

This blog post is the second and final part of our look at the question of the Indo-European homeland from the perspective of ancient DNA. For the first part, see here, and for an introduction to the traditional methods (before aDNA) of locating the homeland, see here.

The large share of Caucasian ancestry in the steppes has led some geneticists to suggest that early Proto-Indo-European (PIE, before Anatolian split off) was perhaps spoken in the South Caucasus until around 5000-4000 BCE. In this model, one branch then moved west into Anatolia (the Anatolian split), while the rest (“core PIE”) went north into the steppes and from there spread east and west with Yamnaya populations around 3000 BCE.


The South Caucasian Homeland model (drawn on a map from Anthony 2007)

While this is not impossible, this South Caucasian PIE homeland model would ignore the proposed close connection between PIE and Uralic (see the blog post about linguistic/archaeological homeland arguments), and the fact that the whole Caucasus region has been occupied by non-Indo-European language isolates and language families (North Caucasian, Kartvelian, Hurro-Urartian, and Sumerian to the south) for millennia. This leaves quite limited room for a successful language family such as Indo-European to spread and dominate.

The lack of securely reconstructed agricultural terminology is also a problem, since the Caucasus region had agriculture quite early compared to the steppe. Furthermore, a slow 2000-year increase of Caucasian DNA in the steppes is not an overwhelming population turnover and does not speak strongly for a language shift. Instead, a hypothesis of Caucasian substrate (“under layer”) influence on the “sound” of PIE could perhaps fit better with a long history of Caucasian women learning to speak PIE in their steppe husbands’ families and influencing their children’s PIE with their own “PIE with a Caucasian accent”. This scenario could be further supported by Maykop-like weaving tools (loom weights) and pots appearing in the presumably steppe-related Repin culture sites on the Lower Don river (granting the assumption that weaving and pottery were primarily done by women). We do, however, also see other Maykop- and Novosvobodnaya-like tools here, such as copper chisels and projectile points, which could be more male-oriented.

Cultural and material continuity in the steppes from earlier hunter-gatherer to herder periods has also been argued for, although with different periods of trade links and influence from both the Neolithic Balkan groups and the Neolithic Caucasian groups from 5000-3000 BCE. Furthermore, the steppe groups still retain a good part of their local Eastern Hunter-Gatherer (EHG) ancestry to match this scenario, along with the EHG-related Y-haplogroup R1a, a subgroup of which spread with steppe-related groups.

So far it seems that when combining linguistics, archaeology and ancient DNA, we can date “core PIE” or “Indo-Tocharian”, depending on which term you prefer (after the Anatolian split but before the Tocharian split), to around 3400-3000 BCE. When we include Anatolian in PIE (a.k.a. “early PIE” or “Indo-Hittite”), we might go back to around 4500-4000 BCE. In between, we have a “hiatus” where we do not see any clear expansion from the steppe with the data we have now.

Two very important discoveries can also be added from the many genomic papers that have come out since 2015: 

  1. It seems these Yamnaya people brought an early form of the pneumonic plague (Yersinia pestis) with them, as it is found in individuals in the target areas of the migrations after 3000 BCE (Central Europe in the west and the Altai in the east), and also in the North Caucasus around 2800 BCE. We could therefore speculate that one of the important factors in the success of Yamnaya DNA, and presumably of the Indo-European language spread, was that some Yamnaya were perhaps more resistant to this plague than the locals were. This scenario would resemble the Europeans coming to America and wiping out the natives with their diseases (and “guns and steel”), to which they themselves were far more resistant.
  2. The genetic mutation for lactose tolerance (lactase persistence) that allows adult humans to digest raw milk, especially widespread in modern North Europeans, was not present in Neolithic Europeans before 3000 BCE. It seems to have appeared first in a few individuals in the Pontic-Caspian steppes (first around 4000 BCE in Ukraine). After steppe groups migrated to Central Europe c. 3000 BCE, lactose tolerance very slowly became more widespread in Europe over the next c. 1500 years.


Allentoft, Morten E., et al. 2015. Population genomics of Bronze Age Eurasia. Nature 522, pp. 167–172.
Anthony, David W. 2007. The horse, the wheel and language: how Bronze-Age riders from the Eurasian Steppes shaped the modern world. Princeton, NJ: Princeton University Press.
Bomhard, Allan 2017. The Origins of Proto-Indo-European: The Caucasian Substrate Hypothesis (revised October 2017) (publication unknown). available here
Diamond, Jared 1997. Guns, Germs, and Steel: The Fates of Human Societies. US: W.W. Norton & Company
Fu, Qiaomei, et al. 2016. The genetic history of Ice Age Europe. Nature 534, pp. 200–205.
Haak, Wolfgang, et al. 2015. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, pp. 207–211.
Jones, Eppie R. et al. 2015. Upper Palaeolithic genomes reveal deep roots of modern Eurasians. Nature Communications 6(8912).
Lazaridis, Iosif, et al. 2016. Genomic insights into the origin of farming in the ancient Near East. Nature 536, pp. 419–424.
Mathieson, Iain, et al. 2015. Genome-wide patterns of selection in 230 ancient Eurasians. Nature 528, pp. 499–503.
Mittnik, Alissa, et al. 2018. The genetic prehistory of the Baltic Sea region. Nature Communications 9 (1).
Olalde, Iñigo, et al. 2018. The Beaker phenomenon and the genomic transformation of northwest Europe. Nature 555, pp. 190–196.
Rasmussen, Simon, et al. 2015. Early Divergent Strains of Yersinia pestis in Eurasia 5,000 Years Ago. Cell 163, pp. 571–582.
Reich, David 2018. Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past. Pantheon Books and Oxford University Press.
Valtueña, Aida Andrades, et al. 2017. The Stone Age Plague and Its Persistence in Eurasia. Current Biology 27(23).
Wang, Chuan-Chao, et al. 2018. The genetic prehistory of the Greater Caucasus (preprint). BioRXiv 16 May.

The Indo-European homeland – ancient DNA (part 1)

[Edit 29 October 2021: Due to accessibility requirements I’ve had to remove the colours; sorry about that. – Thomas]

Mikkel Nørtoft

In this post, we will look into some new findings from the field of ancient DNA which are relevant to the question of the Indo-European homeland. In this first part, we will mostly focus on the genetic aspect, and in the second part (soon to come), we will look at how this can best be correlated with linguistic arguments.

The field of ancient genomics has contributed greatly to the discussion of prehistoric migrations and with that also the discussion of the Indo-European homeland.

Presently, it seems that western Eurasia displays a small number of basal ancestral groups from the earliest sampled individuals going back to the Palaeolithic (colours match the colours on the Homeland Timeline Map and will be used throughout the blog post for readers to better follow which group I am referring to):

– Eastern Hunter-Gatherers (EHG) found in most of Russia and Eastern Europe (including the Pontic-Caspian steppes).
– Western Hunter-Gatherers (WHG) found all over the European peninsula going back to the Palaeolithic.
– Caucasian Hunter-Gatherers (CHG) and Mesolithic and Neolithic Iranians which seem to be closely related.
– Anatolian Farmers responsible for the spread of agriculture in most of Europe.
– Levant Farmers going back to the Natufians who are known as the earliest farmers in the Levant.
– The colour RED will be used here for the Pontic-Caspian “Steppe profile” which is a mix of CHG and EHG ancestry. Archaeological cultures related to this Steppe profile will also be shown with the colour RED:


Approximate distribution of basal ancestry groups in Western Eurasia about 10,000 years ago and the later Steppe profile added in red (by Mikkel Nørtoft).

Migrations from the homeland

In 2015, two large (and competing) studies[1] appeared with improved methods looking at the whole genome instead of the earlier methods only using the Y-chromosome (inherited from the father, only in males) or mitochondrial genome (inherited from the mother). Together, they had sampled more than a hundred ancient humans from various periods and regions in western Eurasia.

They reached the same conclusion using two different methods of analysing ancient DNA: the individuals of the archaeological culture termed Yamnaya (or “Pit Grave”) (c. 3300-2800 BCE), and the preceding Khvalynsk culture (5th millennium BCE), both living in the Pontic-Caspian steppes, were genetically closely related to the widespread European Corded Ware culture complex (c. 2900-2200 BCE). They went as far as suggesting a “mass migration” of Yamnaya herders into Europe around 3000-2500 BCE. The Yamnaya genomes were also very close to those of the Afanasievo culture appearing in the Altai region of Siberia around 3300-3100 BCE. The Y-chromosome haplogroups R1a and R1b spread together with these steppe populations into Europe and Asia and are still very frequent in most of Europe today. It thus seemed that an exodus of herders moving both east and west from the Pontic-Caspian steppes had now been found in the DNA.

The genetic make-up of Yamnaya

Yamnaya individuals in the Pontic-Caspian steppes derived about half of their ancestry from earlier local Eastern Hunter-Gatherers (EHG) and about half from the Caucasus region (Caucasian Hunter-Gatherers, CHG). In earlier steppe individuals of the Khvalynsk culture (5th millennium BCE), this Caucasian ancestry is only about 25%.[2]

This seems to fit archaeological studies that show movement of material culture from the North Caucasian Maykop and Novosvobodnaya cultures into the steppes during the 4th millennium BCE.[3] The cultural-material evidence of influence from the Caucasus before the 4th millennium BCE is more subtle. It has now been shown that the two ancestry groups do not seem to share deep genetic ancestry. Only from the Yamnaya period (3rd millennium BCE) do we see significant steppe ancestry moving south into the Caucasus region.[4]

However, there are a few so-called “outlier” individuals on both sides of this frontier who belong to the other genetic group.[5] This scenario fits the model of cultural contacts between the two groups through small-scale movement. Y-chromosome haplogroup J2 (carried by males) is very common in the Caucasian group. Since this haplogroup only rarely shows up in steppe-related individuals, even after the spread of steppe groups, it could indicate that mostly females “switched sides” in a system of patrilocal intermarriage (females move to the husband’s family), perhaps through marriage alliances, or at least that Caucasian male lineages were not very successful in the steppes. We also see these moving “brides” in Europe with the arrival of (mostly male) Yamnaya populations forming the Corded Ware phenomenon. This model is further supported by the Caucasian ancestry in the steppes only increasing quite slowly, first to about 25% in the Khvalynsk period (5th millennium BCE), and then to about 50% in the Yamnaya period (late 4th millennium BCE).
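To get a feel for how slow such an increase is, here is a minimal back-of-the-envelope sketch. It is not taken from any of the cited studies: it assumes a deliberately simple model in which a constant fraction of each steppe generation's gene pool comes from incoming Caucasian ancestry, and the time span (roughly 1200 years between Khvalynsk and Yamnaya) and the generation length (25 years) are illustrative assumptions, as are all the function and variable names.

```python
# Toy model (an assumption for illustration, not the method of the cited papers):
# each generation, a constant fraction `inflow` of the steppe gene pool is
# replaced by 100% Caucasian ancestry, so
#     share_next = share * (1 - inflow) + inflow

def per_generation_inflow(start_share, end_share, generations):
    """Constant inflow needed to move from start_share to end_share."""
    # The non-Caucasian share shrinks by a factor (1 - inflow) per generation.
    return 1.0 - ((1.0 - end_share) / (1.0 - start_share)) ** (1.0 / generations)

def simulate(start_share, inflow, generations):
    """Apply the toy model forward, generation by generation."""
    share = start_share
    for _ in range(generations):
        share = share * (1.0 - inflow) + inflow
    return share

YEARS = 1200            # assumed span from Khvalynsk to Yamnaya
GENERATION_YEARS = 25   # assumed generation length
generations = YEARS // GENERATION_YEARS

inflow = per_generation_inflow(0.25, 0.50, generations)
final_share = simulate(0.25, inflow, generations)

print(f"generations: {generations}")
print(f"inflow per generation: {inflow:.2%}")
print(f"Caucasian share after simulation: {final_share:.2%}")
```

Under these assumptions, an inflow of a little under 1% per generation is enough to raise the Caucasian share from 25% to 50%. Real admixture was surely less regular than this, but the order of magnitude illustrates why the pattern is easier to reconcile with steady small-scale movement than with a sudden population turnover.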

Additionally, it should be noted that some minor ancestry in Yamnaya individuals from European Farmers and Western Hunter-Gatherers has also been found recently. This supports archaeological contacts between steppe societies and European farming societies like the Cucuteni-Tripolye town dwellers in Ukraine and Bulgaria and/or the East European Globular Amphora culture, where this type of ancestry has been found mixed with European Western Hunter-Gatherer ancestry.[4]

In the next post, we will look into the implications of these genetic findings on the question of the Indo-European homeland.


[1] Haak et al. 2015; Allentoft et al. 2015
[2] Allentoft et al. 2015
[3] Anthony 2007
[4] Wang et al. 2018 (preprint)
[5] Lazaridis et al. 2016; Jones et al. 2015; Fu et al. 2016; Wang et al. 2018


Allentoft, M. E. et al. 2015. Population genomics of Bronze Age Eurasia. Nature 522, 167–172.
Anthony, David W. 2007. The horse, the wheel and language. Princeton, NJ: Princeton University Press.
Fu, Q. et al. 2016. The genetic history of Ice Age Europe. Nature 534, 200–205.
Haak, W. et al. 2015. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211.
Jones, E. R. et al. 2015. Upper Palaeolithic genomes reveal deep roots of modern Eurasians. Nature Communications 6, 8912.
Lazaridis, I. et al. 2016. Genomic insights into the origin of farming in the ancient Near East. Nature 536, 419–424.
Wang, C.-C. et al. 2018. The genetic prehistory of the Greater Caucasus (preprint). BioRXiv 16 May.

Where do the Indo-European languages come from?

Thomas Olander

Around half the world’s population today speaks a language belonging to the Indo-European language family. The Indo-European language family includes most languages spoken in an area covering Europe (important exceptions here being Basque, Finnish and Hungarian), Iran, Afghanistan, Pakistan and northern India (see fig. 1). Some of the most widely spoken Indo-European languages are English, German, Spanish, Portuguese, French, Russian, Hindi, Bengali and Punjabi.


Fig. 1. Present-day distribution of Indo-European languages. Orange: countries with a majority of speakers of IE languages. Yellow: countries with an IE minority language. (Brianski [Public domain], from Wikimedia Commons)

The Indo-European languages seem to be newcomers in most of Europe and Asia. But where do they come from? That is a good question that may be answered in several different ways. As an introduction to this blog, I will present a short answer in this post.

A simple answer to the question of the origin of the Indo-European languages is “Africa”. It is likely that the first anatomically modern humans, who lived in Africa more than 200,000 years ago, spoke with each other in the same manner as humans now speak with each other, and quite possibly the languages we now speak descend from this speech of the first humans. Thus, in this sense, all languages, including the Indo-European languages, originate in Africa. However, apart from assuming that it was probably functionally similar to modern language, we do not know much about the language of the first humans. Too much time has passed since then for the methods of historical linguistics to be able to posit any specific hypotheses about it.

An alternative and, in my opinion, more interesting way to answer the question of the origin of the Indo-European languages is to investigate where, and when, the ancestor of the Indo-European languages was spoken. To illustrate this approach we may take a modern Indo-European language – English, for instance – and trace its development back in time through history, first Middle English and then Old English. By comparing the oldest documented stages of English with those of the other Germanic languages – such as German, Dutch, the Nordic languages and the extinct Gothic language – we arrive at Proto-Germanic, the ancestor of all the Germanic languages. Proto-Germanic is usually estimated to have been spoken around the beginning of our era.

We don’t have to stop there, though. By comparing Germanic with the other subgroups of the Indo-European language family, historical linguists are able to reconstruct large parts of the sound system, grammar and vocabulary of the ancestor of all Indo-European languages: Proto-Indo-European.

So all Indo-European languages descend from a hypothetical proto-language, Proto-Indo-European. But where was Proto-Indo-European spoken?

Put this way, the answer must be found in a collaboration between linguistics and archaeology. Ever since it was discovered, two centuries ago, that the Indo-European languages are related and descend from a common ancestor, scholars and lay people have discussed where on earth the “Indo-European homeland” was located. The guesses are numerous and of varying quality. Fig. 2 is a heatmap showing some of the proposals of the location of the Indo-European homeland from 1813 to 2018.


Fig. 2. Some of the proposed locations of the Indo-European homeland. (Thomas Olander)

Today most historical linguists and most archaeologists interested in the problem are inclined to think that the most likely location of the Indo-European homeland is in the steppe north of the Black Sea and the Caspian Sea, in present-day Ukraine and south Russia – the Pontic–Caspian steppe.

There are several reasons why the “steppe hypothesis” is the most attractive one. Archaeological cultures from the steppe have spread westwards into Europe and eastwards towards India and western China in a period that fits our knowledge of the chronology of the Indo-European languages. Certain words that are reconstructible for early stages of Indo-European languages – primarily two words for ‘wheel’ and a word for ‘axle’ – indicate that the spread of the Indo-European languages cannot have taken place much earlier than the invention of the wheel around 4000–3500 BCE.

The structure of the relationship between the subgroups of the Indo-European language family – the Indo-European family tree – fits a spread from the Pontic–Caspian steppe much better than the alternatives, most famously the Anatolian hypothesis, which locates the Indo-European homeland in central Turkey around 6500 BCE.

Until recently an important argument against the steppe hypothesis was that it was difficult to imagine how a language spoken by people on the Pontic–Caspian steppe could have spread as dramatically as early Indo-European speech must have done. The most likely vector for the spread would have been movement of people; but archaeology doesn’t show unambiguously that such migrations took place.

In recent years, however, the steppe hypothesis has received support from a somewhat unexpected side: prehistoric genetics. Analyses of ancient DNA from skeletons found in Europe and Asia show that there were large-scale migrations of people – especially male individuals – from the steppe into Europe and towards Asia during the third millennium BCE. With the studies of ancient DNA, published by population geneticists from different research environments but pointing in the same direction, the main argument against the steppe hypothesis was dismantled.

The evidence thus seems to support an Indo-European homeland in the Ukrainian and south Russian steppe region north of the Black Sea and the Caspian Sea – but the question has not been definitively settled yet.