Mapping Admixtures



safinator
02-14-2014, 05:44 PM
Quite interesting; unfortunately, not many populations are covered.


http://dienekes.blogspot.com.au/2014/02/human-admixture-common-in-human-history.html

A string of recent papers has argued for admixture in human populations at time scales from the Middle Pleistocene to recent centuries. A new paper in Science makes the point convincingly for extensive admixture in humans over the last few thousand years. The authors include the creators of the Chromopainter/fineStructure software; the new "Globetrotter" method appears to be a natural extension of that method, which seemed to work wonderfully well except for the limitation of producing only a tree of the studied populations.

The paper has a companion website in which you can look up the admixture history of individual populations.

While reading this study, it is important to remember its limitations. Two are immediately obvious: (i) admixture events can only be detected for the last few thousand years, as this method depends on patterns of linkage disequilibrium, which decays exponentially with time due to recombination, and (ii) detection of admixture seems to depend on the presence of maximally differentiated populations from the edges of the human geographical range; for example, the Japanese appear unadmixed even though they are clearly of dual Jomon/Yayoi ancestry. On the other hand, the method does detect the admixture present in the San at a similar time scale.
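Limitation (i) can be illustrated with a minimal sketch. It assumes a simple single-pulse model in which the admixture signal between two sites a genetic distance d (in Morgans) apart decays as exp(-g * d) after g generations. The function names and parameter values below are illustrative assumptions, not taken from the paper or the Globetrotter software.

```python
import math

def admixture_signal(distance_morgans, generations, amplitude=1.0):
    """Assumed single-pulse model: signal decays as exp(-g * d)."""
    return amplitude * math.exp(-generations * distance_morgans)

def estimate_generations(distances, values):
    """Recover g by least squares on ln(y) = ln(A) - g * d."""
    n = len(distances)
    logs = [math.log(v) for v in values]
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    sxx = sum((d - mean_d) ** 2 for d in distances)
    return -sxy / sxx  # the slope of the log-linear fit is -g

# Genetic distances from 1 cM to 50 cM, expressed in Morgans.
distances = [0.01 * k for k in range(1, 51)]

# Hypothetical events 30 and 300 generations ago.
recent = [admixture_signal(d, 30) for d in distances]
old = [admixture_signal(d, 300) for d in distances]

g_recent = estimate_generations(distances, recent)
g_old = estimate_generations(distances, old)

# Signal remaining at 5 cM (index 4) for each event.
signal_recent_5cm = recent[4]
signal_old_5cm = old[4]
```

With noise-free curves both dates are recovered, but the older event's signal at 5 cM is already orders of magnitude smaller than the recent one's. In real data such a signal would be swamped by noise, which is why older events fall outside the method's temporal scope.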

The case of Northwestern Europe appears especially striking as none of the populations from the region show evidence of admixture. This may be because the mixtures taking place there (e.g., between "Celts" and "Anglo-Saxons" in Great Britain) involved populations that were not strongly differentiated. Alternatively, population admixture history may have preceded the last few thousand years and is thus beyond the temporal scope of this method.

An exception to the rule that populations at the edges of the human range appear to be unadmixed is the Armenians, who appear to be the only * between the Atlantic and Pacific in Figure 2D (shown at the beginning of this post). The companion site lists their status as "uncertain".

Other results are more questionable. For example, the authors assert that Sardinians are an admixed population with one side being "Egyptian-like" and the other "French-like", whereas the ancient DNA evidence as it stands would rather indicate that Sardinians are the best approximation of Neolithic Europeans currently in existence, and so are more likely to (mostly) possess a gene pool that traces back ~8-9 thousand years in Europe. It will be quite the surprise if so many Europeans from 5kya or earlier look like modern Sardinians and ancient Sardinians don't!


The analysis of Eastern Europe is particularly interesting as it documents three-way admixture (Northern/Southern/NE Asian) in most populations but two-way admixture (Northern/Southern) in Greeks, estimated at ~37%. The authors claim that this is related to the Slavs, which seems reasonable given the 1054 AD age estimate. On the other hand, according to the companion website, the southern element in Greeks is inferred to be Cypriot-like, and it's far from clear that the pre-Slavic population of Greece was Cypriot-like or indeed represented by any of the populations in the authors' dataset.
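Calendar dates such as the 1054 AD figure come from converting an inferred number of generations since admixture into years. The sketch below shows that arithmetic under assumed values: a 28-year generation time, a 1950 reference year, and one extra generation added to the count. These parameters and the function name are assumptions for illustration and are not stated in the post.

```python
def generations_to_year(generations, years_per_generation=28, reference_year=1950):
    """Convert generations since admixture to a calendar year.

    Assumed convention: year = reference_year - years_per_generation * (g + 1).
    Positive results are CE; zero or negative results would fall before the CE era.
    """
    return reference_year - years_per_generation * (generations + 1)

# Under these assumed values, 31 generations corresponds to 1054.
example_year = generations_to_year(31)
```

A different generation time shifts every date proportionally, so point estimates like this are best read together with their confidence intervals.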

The three-way admixture in much of eastern Europe is not particularly surprising as history furnishes ample evidence for groups of steppe origin in the region during historical times. Some bequeathed both their language and name (e.g., Magyars), others only their name (e.g., Bulgarians) to the local Europeans, but records indicate a widespread presence of "eastern" groups in Europe from the time of the Huns to that of the Ottomans. A study of late Antique eastern Europeans from the Baltic to the Aegean may help better document how the twin phenomena of the eastern invasions and the spread of the Slavs shaped the present-day genetic diversity of the region.

I suspect that a few ancient samples will be far more informative for understanding the recent history of our species than the most sophisticated modeling of modern populations. Nonetheless, it's great to have a new method that maximizes what can be learned about the past from the messy palimpsest of the present.

Science 14 February 2014: Vol. 343 no. 6172 pp. 747-751 DOI: 10.1126/science.1243518

A Genetic Atlas of Human Admixture History

Garrett Hellenthal et al.

Modern genetic data combined with appropriate statistical methods have the potential to contribute substantially to our understanding of human history. We have developed an approach that exploits the genomic structure of admixed populations to date and characterize historical mixture events at fine scales. We used this to produce an atlas of worldwide human admixture history, constructed by using genetic data alone and encompassing over 100 events occurring over the past 4000 years. We identified events whose dates and participants suggest they describe genetic impacts of the Mongol empire, Arab slave trade, Bantu expansion, first millennium CE migrations in Eastern Europe, and European colonialism, as well as unrecorded events, revealing admixture to be an almost universal force shaping human populations.


http://admixturemap.paintmychromosomes.com/

Graham
02-14-2014, 05:54 PM
You are all Mutts. :P

http://4.bp.blogspot.com/-70VKY4lpEgU/Uv0YObtxyoI/AAAAAAAAJfc/peOY5nhRzek/s1600/globetrotter.png

Argang
02-16-2014, 11:28 PM
Daniel Falush, one of the authors of this study, posted this on Polako's blog in response to some criticism. Namely, he explains the oddball results (for Orcadians, Finns and Lithuanians).



Firstly, we do not say that there is no admixture in the Orcadians, only that it does not (though there is some evidence) give a significant signal based on the current dataset and on the criteria for significance that we set based on simulations. The interactive map can be used to see the best guess, which makes it a mix of Welsh vs Norwegian. There will be more on this population when the Peopling of the British Isles paper comes out.

Secondly, you highlight the Lithuanian results in the "full analysis" and the Hadza contribution. It is worth going through these in a bit of detail as it will help in interpreting our results, for this and other groups.

As well as the full analysis, we have also done other analyses where the other East European populations are not included as donors (EastEuropeI analysis; East EuropeII analysis). We did this because we found that fineSTRUCTURE - which we ran initially to understand overall structure patterns - had particular trouble splitting populations from Eastern Europe (Figure S17 in the supplement), and because we found - based on our simulation results, and really common sense too - that including very similar groups in the analysis could mask events they share (a caveat we included in the main text). Thus the "EuropeI" analyses ought to have more power to find historical events in the region because they do not mask admixture that is common to all East European populations. (In any case, we believe it is therefore worth checking whether groups seen in the Full analysis are verified in these analyses too.)

The total admixture signal in the "full analysis" for the first Lithuanian event is 1% of the genome, with admixing source Daur + Oroquen + Columbian + Shi. Lithuanians are generally very similar to the other East European populations and these are assigned 99% of the ancestry. This is by far the strongest signal in this analysis, and we think it reflects a real event, but because the admixture fraction is potentially tiny, source inference might still be quite inexact, and "masking" may be an issue.


However, despite the very small proportion of incoming DNA, the analysis also suggests that the admixture is multi-way. The p-value for this is 0.02 - which is a marginal signal, though in the EuropeI analysis the p-value is more convincing. In the paper we write that our source inference is less good for complex events (like this one) - in fact as explained below for this reason we do not provide direct source inference for secondary events - and so this second signal will be one of the toughest in the dataset for our approach (because it is weaker than a quite weak signal!).

It is worth mentioning what the squares relating to the event mean here, because they're different to the circles shown for the stronger events we find! They're called "contrasts" on the webpage and if you mouse over it says "Square size cannot therefore be directly interpreted as similarity to (unsampled) admixing sources, but rather highlights differences between sources." To be a bit clearer, the second component shows the different directions of the minority sources of the genome; technically, differences in populations' inferred contributions of haplotypes in the mixture representation. This means that they do NOT correspond to admixture proportions (which we do not try to estimate) but rather to populations whose ancestry is most strongly differentiated in the different components of admixture in Lithuanians - because our method is currently not able to infer sources fully for these second events. This is discussed briefly in the main text under things we find challenging. In this case squares show broadly speaking South versus North. North is associated with the first admixture component. Hadza is one of the South components, so what we're saying here is that a haplotype shared with Hadza is very strongly indicative of coming from the "South" group - not that Hadza is necessarily a dominant member of this group. Nevertheless, Hadza admittedly is a pretty odd choice - and not seen in the EastEurope I version where the signal is much stronger - and this may have something to do with the weakness of the signal. There are only 3 Hadza in the dataset and they may have ended up on the extreme end of the PCA decomposition by chance. In any second event where squares are shown, they only relate to these differences between sources.

Although the three-way admixture here is on the border of statistical significance and quite hard to interpret, the fact that there was in fact a three-way admixture event shared by the populations in the region is strongly supported by the EastAsiaI analysis. These are the results we present in the figure in the paper.


Another highlighted case is the Finnish - we only have 2 Finns, from which it is difficult to say much (and we could have removed cases like this - but chose not to for completeness). Probably as a consequence, their signal is one of the weakest in the whole dataset, in terms of the "fit" of the best curve to the data (0.415). Again, the Finnish second signal seems slightly odd (and includes an African group) - I think most of the same reasons might apply there (though perhaps more strongly) as for the Lithuanians. We're working on the methods for these "complex" events but likely they will remain tricky until we can obtain access to more local samples.

(with contributions from Simon Myers).