
Analysis of Eurogenes global Mclust run with Mesolithic Iberian La Braña and ancient Siberian MA-1



Argang
03-15-2014, 07:35 PM
This is Davidski's first big clustering run of all world populations that also includes these ancient genomes. Clusters are based on similar affinities in terms of global genetic variation. What follows is an explanation of the clusters; some results are quite surprising.

https://docs.google.com/spreadsheet/ccc?key=0Ato3EYTdM8lQdHJNV2FZRWh2c1V4bWxGQlhMX0Nmbmc&richtext=true#gid=0


1 & 2 = Caucasus and Iranian clusters. Georgians, Abkhazians, Adygei, Chechens, Ossetians, Iranians, Turks and Kurds show membership, and there is a lot of overlap.

3 = North Africans, Algerians, Tunisians, Moroccans, Mozabites, Egyptians.

4 = Some kind of Middle Eastern cluster

5 = Pan med/Levantine cluster with Armenians/Assyrians, Cypriots, Ashkenazi and Sephardic Jews, some South Italians and Sicilians, even Murcians and Canarians

6 = Interesting Basque-centered cluster that also has Iberians from some Spanish regions, French, North Italians and even English, Swiss and Bulgarian individuals.

7 = Pan-European cluster with members from British Isles, France, Germany, Austria, Scandinavia, Croatia, Hungary, Ukraine, even Belorussia

8 = East Balkan cluster with Bulgarians and Romanians. The Serbs are an even split of this and cluster 6.

9 = South-Central Asian cluster. Brahui, Kalash, Pashtun, Makrani, Punjabi, Balochi.

10 = SSA Bantu/Yoruba cluster

11 = East African Hadza and Gumuz cluster

12 = Sardinian, South Italian, Central Greek, Tuscan, Iberian (including Portuguese) and Abruzzo cluster with a few North Italians.

13 = Arabic Bedouin cluster

14 = Estonian viking facade :cool: British Islanders, Lithuanians, Estonians, Southwest Finns, Scandinavians and East Slavs (Belorussians and Western Russians, some Ukrainians). Poles are split between this and 7.

15 = Pygmies

16 = Indian cluster (Dharkar, Velamas, Hakkipikki)

17 = Another South Asian cluster (Brahmins, Sindhi, interestingly Burushos too)

18 = Siberian cluster

19 = Some kind of East Asian cluster with Malays, Daur, Hezhen, Oroqen, Cambodian.

20 = Han Chinese/Korean cluster. Japanese overlap with 19.

21 = Volga-Uralic cluster with Chuvash, Tatar, Mari, Mal'ta boy (ANE). La Braña is more western than these (but more eastern than North Russians, Finns etc.) but is pulled fully here anyway, showing clear eastern influence in Mesolithic European hunter-gatherers. All individuals from cluster 23 and many from cluster 14 are as close to the Mesolithic hunter-gatherer as most in this cluster. East Asian and Siberian clusters, on the other hand, are not close to Mal'ta or La Braña; even Amerindians are closer.

22 = Amerindian cluster

23 = Northeast European cluster with a few Estonians, Finns, North Russians, Mordovians

24 = Northeast African Horners

25 = Another East African cluster with Maasai

26 = Oceanian cluster with Papuans, Melanesians

Kale
03-16-2014, 02:06 AM
Weird calculator... why would Ashkenazi (a known admixed population) unanimously show one cluster? Then why would La Braña be in cluster 21 with Chuvash and Mari, while the modern Spanish literally have 0 of that?

Argang
03-16-2014, 08:41 AM
Weird calculator... why would Ashkenazi (a known admixed population) unanimously show one cluster? Then why would La Braña be in cluster 21 with Chuvash and Mari, while the modern Spanish literally have 0 of that?

All modern European populations are more or less admixed compared to Mesolithic or Paleolithic ones. These clusters are based on total genome-wide similarity, which means Ashkenazi cluster with the populations that are genetically most similar to what they are at present. In PCAs they tend to overlap with Sicilians and Greek islanders too, so this makes sense.

The Mesolithic hunter-gatherers sampled so far were extremely Northeast European-like even in Western Europe, while mixing with more southern Neolithic and later populations has pulled modern Europeans away from them to a degree. That's the reason for their clustering, and for why La Braña does not cluster with modern Iberians.

Here are genome-wide similarity distances to La Braña for individuals from some populations:


Smaller distance means closer.

Komi Komi1 0.008346637

Finnish HG00273 0.009488728
Finnish HG00274 0.009442226
Finnish HG00276 0.009164268

Lithuanian lithuania1 0.009928347
Lithuanian lithuania2 0.010546408
Lithuanian lithuania3 0.01027119

Norwegian NO1 0.010738471
Norwegian NO2 0.009897513
Norwegian NO5 0.010528696

Irish IE8 0.010605525
Irish IE12 0.010991494
Irish IE13 0.010899646

German DE1 0.010526041
German DE6 0.0111073
German DE7 0.011131318

Kalash HGDP00267 0.011155041
Kalash HGDP00274 0.011189527
Kalash HGDP00279 0.010659987


Spanish_Pais_Vasco HG01515 0.011513906
Spanish_Pais_Vasco HG01516 0.011798699
Spanish_Pais_Vasco HG01518 0.011486488
(Basques)

Bulgarian Bulgaria1 0.011642246
Bulgarian Bulgaria2 0.011798981
Bulgarian Bulgaria3 0.011803541

Spanish_Castilla_Y_Leon HG01612 0.012500811
Spanish_Castilla_Y_Leon HG01783 0.012283846
Spanish_Castilla_Y_Leon HG01784 0.012039366

Armenian arm3 0.013235926
Armenian arm4 0.012771953
Armenian arm5 0.013386527

Algerian NAFR1 0.017293782
Algerian NAFR2 0.016926357
Algerian NAFR4 0.01614949

Karitiana HGDP01003 0.021329083
Karitiana HGDP01006 0.02155314
Karitiana HGDP01009 0.021666435

Papuan HGDP00540 0.022977503
Papuan HGDP00541 0.02299784
Papuan HGDP00542 0.023303734

Mongola HGDP01223 0.024771874
Mongola HGDP01224 0.025730492
Mongola HGDP01225 0.026241254

North_Han_Chinese HGDP00774 0.027533946
North_Han_Chinese HGDP00775 0.027625297
North_Han_Chinese HGDP00776 0.027082176

Bantu_N.E. HGDP01405 0.039545916
Bantu_N.E. HGDP01406 0.039325226
Bantu_N.E. HGDP01408 0.038941807
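
For anyone who wants to compare populations rather than individuals, here is a minimal sketch that averages the per-sample distances by population and ranks them. It uses only a subset of the rows listed above, copied as-is; the function name `mean_distance_by_population` and the row layout are my own assumptions for illustration, not part of the Eurogenes run.

```python
from collections import defaultdict

# Subset of the rows listed above: (population, sample ID, distance to La Braña).
rows = [
    ("Komi", "Komi1", 0.008346637),
    ("Finnish", "HG00273", 0.009488728),
    ("Finnish", "HG00274", 0.009442226),
    ("Finnish", "HG00276", 0.009164268),
    ("Spanish_Pais_Vasco", "HG01515", 0.011513906),
    ("Spanish_Pais_Vasco", "HG01516", 0.011798699),
    ("Spanish_Pais_Vasco", "HG01518", 0.011486488),
    ("Spanish_Castilla_Y_Leon", "HG01612", 0.012500811),
    ("Spanish_Castilla_Y_Leon", "HG01783", 0.012283846),
    ("Spanish_Castilla_Y_Leon", "HG01784", 0.012039366),
    ("Karitiana", "HGDP01003", 0.021329083),
    ("Karitiana", "HGDP01006", 0.02155314),
    ("Karitiana", "HGDP01009", 0.021666435),
    ("Bantu_N.E.", "HGDP01405", 0.039545916),
    ("Bantu_N.E.", "HGDP01406", 0.039325226),
    ("Bantu_N.E.", "HGDP01408", 0.038941807),
]


def mean_distance_by_population(rows):
    """Return {population: mean distance} for (population, id, distance) rows."""
    groups = defaultdict(list)
    for population, _sample_id, distance in rows:
        groups[population].append(distance)
    return {pop: sum(vals) / len(vals) for pop, vals in groups.items()}


means = mean_distance_by_population(rows)
# Smaller distance means closer, so sort ascending.
ranking = sorted(means, key=means.get)

for pop in ranking:
    print(f"{pop:<26}{means[pop]:.6f}")
```

With three samples per population (one for Komi) the means are only a rough guide, but they make the ordering easier to read than the raw per-individual values.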

Atlantic Islander
03-16-2014, 09:01 AM
I'm in 12 with a little 6 & 8 noise.

Argang
03-16-2014, 09:18 AM
I'm in 12 with a little 6 & 8 noise.

It looks like all the noise in >99% of the members of cluster 12 is from 5, 6 or 8.

Harkonnen
03-16-2014, 10:31 AM



Rather peculiar that Papuans seem so similar genome-wide to Mal'ta, almost equal to Karitiana, considering that according to this

http://img811.imageshack.us/img811/4151/zubo.png

they seemed rather distant from it.

----

Eurogenes update


https://docs.google.com/spreadsheet/ccc?key=0Ato3EYTdM8lQdEYxTnFCZTJ5UWhabHhHT3c0UkxCZVE#gid=0
https://docs.google.com/spreadsheet/ccc?key=0Ato3EYTdM8lQdC1qR09nNHU2V2hWNjZLR2FyOUt2bVE#gid=0

So here Karitiana score more on the ENA component than Kets.

I think one thing here could be that there really should be 2 ENA components.

Damião de Góis
03-16-2014, 02:15 PM
I think you mistook cluster 7 for cluster 8 for some North Italians.

Argang
03-16-2014, 02:19 PM
Rather peculiar that Papuans seem so similar genome-wide to Mal'ta, almost equal to Karitiana, considering that according to this



IBS, ADMIXTURE, and shared drift all measure slightly different things. IBS measures genome-wide similarity, ADMIXTURE represents pre-selected parts of the genome as a combination of pre-selected reference groups, and shared drift measures drift (duh). That's why the results differ: even though MA-1 is overall more similar to Europeans than to Amerindians, the drift it has undergone is shared most with Amerindians.

The ENA reference is not "ancient pure ENA", but a modern population which, as I've mentioned previously, contains some ANE, which looks to be marked in the sheet too. This elevates ENA scores in populations with ANE in a test that uses Han as the ENA reference. Komi had the most genome-wide similarity to both of those ancient samples, but in an ADMIXTURE test (a more limited comparison which has the aforementioned reference issue) this doesn't show. Kalash (who were as similar to both ancient genomes as Central Europeans) turn out less "Mammoth steppe" than Sardinians in that ADMIXTURE test, regardless of whether La Braña or MA-1 is used as the reference.

The more limited nature of ADMIXTURE tests and the reference issue also cause La Braña to turn out as 95.3% "Mammoth" (Mal'ta as reference) + 4.7% ENA, and Mal'ta as 89% "Mammoth" (La Braña as reference) + 11% ENA.
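
To make the "IBS measures genome-wide similarity" point concrete, here is a toy sketch of one common allele-sharing distance. It assumes genotypes coded as 0/1/2 copies of an allele, with `None` for sites that are missing in one of the samples (which matters a lot for ancient genomes). The function name `ibs_distance` and the coding are my assumptions for illustration; this is not necessarily the exact metric or scale behind the distances posted earlier in the thread.

```python
def ibs_distance(g1, g2):
    """Allele-sharing (1 - IBS) distance between two genotype vectors.

    Each genotype is 0, 1 or 2 (allele copies) or None if missing.
    Per site, the two samples differ by 0, 1 or 2 alleles; the distance is
    the average difference divided by 2, so it ranges from 0.0 to 1.0.
    Sites missing in either sample are skipped.
    """
    diffs = [abs(a - b) for a, b in zip(g1, g2)
             if a is not None and b is not None]
    if not diffs:
        raise ValueError("no overlapping genotyped sites")
    return sum(diffs) / (2 * len(diffs))


# Made-up genotypes, purely for illustration.
sample_a = [0, 1, 2, 2, None, 1]
sample_b = [0, 2, 2, 0, 1, 1]

print(ibs_distance(sample_a, sample_b))
```

Note that this compares every overlapping site directly, with no reference populations involved, which is why it can rank samples differently from an ADMIXTURE run that depends on the chosen references.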

Argang
03-16-2014, 02:23 PM
I think you mistook cluster 7 for cluster 8 for some North Italians.

Yeah you're right. Fixed it.

Harkonnen
03-16-2014, 05:04 PM
It has to start somewhere. So, to put it simply (mostly ignoring the role of farmers in all of this): at the starting point we have the proto-Eurasians. Early on, ancient ENA gets anxious and shoots east. Here we have the first split, and the first two gene bases start to develop: A-ENA and the western posse. At some point ANE gets unhappy with WHG, high five.. see ya later, and goes after A-ENA. It then starts to drift away from WHG, while still sharing the same younger gene base with WHG, in contrast to A-ENA, whose drifting ways started much earlier. The reason Amerindians share the most drift with Mal'ta is that they have the most actual ancestry from the boy*; they are the descendants of A-ENA & ANE. Mal'ta then dissolves into a minority component in many populations, whereas WHG remains a rather significant component in many parts of Europe.

*and this likely has to be so whatever the admixture tests might say.

Sikeliot
03-16-2014, 05:30 PM
Sicilians and southern Italians are split between a Levantine cluster and a Euro Med one, interesting. I wonder if that depends on where in southern Italy the person is from.

Where do island Greeks fall on this?

Argang
03-16-2014, 05:44 PM
Sicilians and southern Italians are split between a Levantine cluster and a Euro Med one, interesting. I wonder if that depends on where in southern Italy the person is from.

Where do island Greeks fall on this?

I'd guess the Levantine med cluster 5. In this run there's one Greek and one Central Greek in that cluster. It's a 100% fit for the single Maltese sample too.

Thessaly Greeks are clearly different and don't even have noise levels of it.