Why is the amount of East Eurasian ancestry of Saamis and other Uralics underestimated by some here?



Zanzibar
03-09-2021, 06:05 AM
I have seen some users here underestimate or downplay the East Eurasian ancestry of some Finno-Ugrics such as the Saami, saying that they have only a minor 5-10% Mongoloid, or maybe 15%, and acting as if they are not that different from the average Euros. That's not true at all.

They, along with VURers (Volga-Ural groups) such as Udmurts and Mari, have around 25-35% Mongoloid on average, and some groups approach 45-50%, especially when counting Khanty, Mansi and some Turkics like Bashkirs, who are literally "balanced" Eurasians.

In G25, Saami are around 27% Mongoloid and Saami_Kola close to 20%, while Mari are closer to 32% East Eurasian and Chuvash and Udmurts are both around 25%. The Mongoloid component here is Krasnoyarsk_BA/kra001, an ancient Siberian population most closely related to the Nganasan, Yukaghir and Evenk. In their amount of East Eurasian ancestry, the Saami and these VURers are the mirror image of Altaians, Kyrgyz, Khakass and some Kazakhs, who have around 25-32% West Eurasian.


Target: Saami
Distance: 2.8661% / 0.02866055
34.4 Baltic_EST_BA
26.4 RUS_Krasnoyarsk_BA
24.6 FIN_Levanluhta_IA_o
14.6 Yamnaya_KAZ_Mereke

Target: Saami_Kola
Distance: 1.8163% / 0.01816303
44.2 Baltic_EST_BA
21.4 FIN_Levanluhta_IA_o
19.4 RUS_Krasnoyarsk_BA
6.6 Yamnaya_KAZ_Mereke
4.8 SWE_BA
3.6 UKR_Sredny_Stog_II_En

Target: Mari
Distance: 8.7117% / 0.08711679
46.0 UKR_Sredny_Stog_II_En
31.8 RUS_Krasnoyarsk_BA
22.2 Baltic_EST_BA

Target: Chuvash
Distance: 5.7643% / 0.05764286
50.8 UKR_Sredny_Stog_II_En
24.6 Baltic_EST_BA
24.6 RUS_Krasnoyarsk_BA

Target: Udmurt
Distance: 2.2666% / 0.02266610
47.8 UKR_Sredny_Stog_II_En
24.6 RUS_Krasnoyarsk_BA
13.8 Baltic_EST_BA
13.8 Yamnaya_KAZ_Mereke

The fits for Mari and Chuvash are poor (distances of 8.7% and 5.8%), but both are genetically very drifted in G25.
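The East Eurasian shares quoted above can be read straight off the model output. Here is a minimal Python sketch of that bookkeeping, assuming RUS_Krasnoyarsk_BA is the only source counted as East Eurasian in these models. The parser and the `EAST_EURASIAN` set are illustrative helpers, not part of any G25 tool, and only two of the models are pasted in to keep it short.

```python
# Two of the models above, pasted verbatim.
MODELS = """\
Target: Saami
Distance: 2.8661% / 0.02866055
34.4 Baltic_EST_BA
26.4 RUS_Krasnoyarsk_BA
24.6 FIN_Levanluhta_IA_o
14.6 Yamnaya_KAZ_Mereke

Target: Mari
Distance: 8.7117% / 0.08711679
46.0 UKR_Sredny_Stog_II_En
31.8 RUS_Krasnoyarsk_BA
22.2 Baltic_EST_BA
"""

# Assumption: which sources count as East Eurasian is the reader's choice.
EAST_EURASIAN = {"RUS_Krasnoyarsk_BA"}


def parse_models(text):
    """Turn 'Target:' blocks into {target: {source: percent}}."""
    models = {}
    target = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Target:"):
            target = line.split(":", 1)[1].strip()
            models[target] = {}
        elif line and target and not line.startswith("Distance:"):
            pct, source = line.split(None, 1)
            models[target][source] = float(pct)
    return models


def east_share(model):
    """Sum the percentages of the sources listed in EAST_EURASIAN."""
    return sum(pct for src, pct in model.items() if src in EAST_EURASIAN)


shares = {t: east_share(m) for t, m in parse_models(MODELS).items()}
for target, share in shares.items():
    print(f"{target}: {share:.1f}% East Eurasian")
```

Adding another source name to `EAST_EURASIAN` (for example RUS_Shamanka_N for the Bashkir model) changes the totals accordingly, which is why the choice of set should be stated alongside any percentage.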

Going by their amount of Mongoloid ancestry, Saami, Chuvash, Mari and Udmurt are the reverse of Altaians, Kyrgyz and Khakass, who are around 25-32% Caucasoid.

Target: Altaian
Distance: 2.1939% / 0.02193893
49.2 MNG_North_N
16.2 RUS_Shamanka_N
10.0 RUS_Afanasievo
8.2 Oroqen
7.6 RUS_Sintashta_MLBA
4.6 TUR_Barcin_N
4.2 TJK_Sarazm_En

Target: Althai_Kizhi
Distance: 2.4724% / 0.02472399
44.6 MNG_North_N
24.4 RUS_Shamanka_N
10.6 RUS_Afanasievo
8.8 RUS_Sintashta_MLBA
5.8 Oroqen
3.2 TUR_Barcin_N
2.6 TJK_Sarazm_En

Target: Kirghiz
Distance: 2.0017% / 0.02001725
23.0 Oroqen
22.4 MNG_North_N
21.4 RUS_Devils_Gate_Cave_N
15.4 RUS_Sintashta_MLBA
8.8 TJK_Sarazm_En
4.6 TUR_Barcin_N
4.4 RUS_Afanasievo

Target: Kirghiz_China
Distance: 2.3150% / 0.02314983
29.2 Oroqen
20.2 RUS_Devils_Gate_Cave_N
19.8 RUS_Sintashta_MLBA
17.4 MNG_North_N
11.2 TJK_Sarazm_En
2.2 TUR_Barcin_N


Target: Khakass
Distance: 2.8290% / 0.02828997
67.6 RUS_Shamanka_N
15.2 RUS_Afanasievo
15.2 RUS_Sintashta_MLBA
1.2 TJK_Sarazm_En
0.8 TUR_Barcin_N


Target: Khakass_Kachins
Distance: 2.3621% / 0.02362103
56.2 RUS_Shamanka_N
16.8 MNG_North_N
12.6 RUS_Afanasievo
8.6 RUS_Sintashta_MLBA
3.6 TUR_Barcin_N
1.8 Oroqen
0.4 TJK_Sarazm_En

Target: Kazakh_China
Distance: 2.3730% / 0.02372980
32.0 Oroqen
20.8 RUS_Devils_Gate_Cave_N
19.2 RUS_Sintashta_MLBA
18.2 MNG_North_N
8.8 TJK_Sarazm_En
1.0 TUR_Barcin_N

Only the Kazakhs are more Caucasoid (about 40%) than the Mari, Udmurt, Saami and Chuvash are Mongoloid:

Target: Kazakh
Distance: 1.9419% / 0.01941909
24.2 Oroqen
21.2 MNG_North_N
14.8 RUS_Sintashta_MLBA
14.4 RUS_Devils_Gate_Cave_N
9.0 RUS_Afanasievo
8.6 TJK_Sarazm_En
7.8 TUR_Barcin_N

Therefore, Mari, Chuvash, Udmurt and Saami are the mirror image of Altaians, Kyrgyz and Khakass. In my opinion, these Uralics have enough Mongoloid to be seen as more of a Hapa or transitional race between Europeans and Asians than as only European.

If we include the Bashkir, Mansi and Khanty, those are literally Eurasians/Hapas: Mansi and Khanty come out around 48-50% Mongoloid, and Bashkir around 41% (Shamanka_N plus Krasnoyarsk_BA).

Target: Bashkir
Distance: 2.2061% / 0.02206102
52.2 UKR_Sredny_Stog_II_En
30.0 RUS_Shamanka_N
10.6 RUS_Krasnoyarsk_BA
4.0 Baltic_EST_BA
3.2 Yamnaya_KAZ_Mereke

Target: Mansi
Distance: 4.7446% / 0.04744556
48.4 RUS_Krasnoyarsk_BA
32.0 UKR_Sredny_Stog_II_En
16.8 RUS_AfontovaGora3
2.8 Baltic_EST_BA

Target: Khanty
Distance: 4.7254% / 0.04725353
50.0 RUS_Krasnoyarsk_BA
30.6 UKR_Sredny_Stog_II_En
19.4 RUS_AfontovaGora3

Target: Khants
Distance: 4.6029% / 0.04602942
49.6 RUS_Krasnoyarsk_BA
31.0 UKR_Sredny_Stog_II_En
16.8 RUS_AfontovaGora3
2.6 Baltic_EST_BA

P.S. The Bashkir need Shamanka_N to improve their fit, as they have significant Turkic ancestry. Surprisingly, the Chuvash don't need any Shamanka_N, but maybe that's because they are genetically drifted.

Lemminkäinen
03-09-2021, 08:10 AM
Because the Admixture program usually calculates non-European admixture using European references (assumed to be the zero level), and Eurasian admixture is present almost everywhere in Europe. But I don't trust G25 either, because (being Davidski's test?) it is based on PCA components, and PCA results depend on the sample set used. If some population or group is underrepresented, it gets too little weight, and conversely. If it is extremely inbred, or even overrepresented, it gets too much weight.
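The dependence of PCA on the sample set can be shown with a toy example. The Python sketch below uses four made-up populations at fixed 2-D positions (hypothetical numbers, not real G25 coordinates) and computes the first principal axis from the 2x2 covariance matrix. Only the number of samples per population changes between the two runs.

```python
import math

# Hypothetical 2-D positions of four populations (toy numbers, not real data).
POSITIONS = {
    "West": (-1.0, 0.0),
    "East": (1.0, 0.0),
    "North": (0.0, 0.6),
    "South": (0.0, -0.6),
}


def sample_set(counts):
    """Build a list of points with counts[name] copies of each population."""
    return [POSITIONS[name] for name, k in counts.items() for _ in range(k)]


def leading_axis(points):
    """Unit vector of the first principal component of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    var_x = sum((x - mx) ** 2 for x, _ in points) / n
    var_y = sum((y - my) ** 2 for _, y in points) / n
    cov = sum((x - mx) * (y - my) for x, y in points) / n
    # Orientation of the major axis of a symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cov, var_x - var_y)
    return math.cos(theta), math.sin(theta)


balanced = sample_set({"West": 5, "East": 5, "North": 5, "South": 5})
skewed = sample_set({"West": 5, "East": 1, "North": 20, "South": 20})

pc1_balanced = leading_axis(balanced)
pc1_skewed = leading_axis(skewed)
print("balanced PC1:", [round(abs(v), 3) for v in pc1_balanced])
print("skewed PC1:  ", [round(abs(v), 3) for v in pc1_skewed])
```

With equal sample sizes the first axis runs West to East, the largest real separation. With East reduced to one sample and North/South inflated, the first axis flips to North-South even though no population has moved. Real genotype PCA has many more dimensions, so this only illustrates the direction of the effect.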

Komintasavalta
03-09-2021, 08:39 AM
I tried doing qpAdm models of the population named Saami.DG in the v44.3_HO dataset. I excluded models with one or more negative weights (where feasible is false) and sorted the models by their p-value.

I'm probably doing something wrong, and I still don't know how to pick the outgroups. I mostly just picked outgroups that caused little decrease in the number of SNPs remaining after filtering. I also tried to pick left populations that caused little decrease in the SNP count.

I got 374794 out of 597573 SNPs after filtering, out of which 349558 were polymorphic.

https://i.ibb.co/vcTCKNz/b.png

In the image above, the models whose p-value is above .05 consistently have about 30-35% Nganasan ancestry. However, EHG, CHG and SHG are also part Mongoloid. So if we consider Nganasan to be fully Mongoloid, Saami might be closer to 40% than 30% Mongoloid.
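The adjustment in that last step is a weighted sum: each source contributes its qpAdm weight times whatever fraction of that source is treated as East Eurasian. A small Python sketch of the arithmetic follows. All numbers in it are hypothetical placeholders chosen for illustration; they are not read off the chart, and the East Eurasian fraction assigned to EHG is an assumption, not an estimate.

```python
def total_east_share(weights, east_fraction):
    """Sum of weight * East Eurasian fraction over all sources.

    Sources missing from east_fraction are treated as 0% East Eurasian.
    """
    return sum(w * east_fraction.get(src, 0.0) for src, w in weights.items())


# Hypothetical qpAdm weights for one model (must sum to 1).
weights = {"Nganasan": 0.32, "EHG": 0.28, "Other": 0.40}

# Hypothetical fractions: Nganasan counted as fully East Eurasian,
# EHG as one quarter (placeholder value).
east_fraction = {"Nganasan": 1.0, "EHG": 0.25}

direct = weights["Nganasan"]
adjusted = total_east_share(weights, east_fraction)
print(f"Nganasan weight alone: {direct:.2f}")
print(f"adjusted total:        {adjusted:.2f}")
```

The size of the adjustment depends entirely on the fractions plugged in, so the 40% figure should be read as one possible outcome under such assumptions rather than a measured value.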

Both individuals in the population Saami.DG were from Utsjoki, which is part of the Northern Saami region within Finland:


$ awk 'NR==1||/Saami...DG/' g/v44.3_HO_public/v44.3_HO_public.anno|cut -f2,4,9,10|tr \\t \;
Version ID;Publication (or OK to use in a paper);Locality;Country
S_Saami-1.DG;MallickNature2016;Utsjoki;Finland
S_Saami-2.DG;MallickNature2016;Utsjoki;Finland

Among Finnish Saami, there are an estimated 2,000 speakers of Northern Saami, 300 speakers of Inari Saami, and 300 speakers of Skolt Saami (https://fi.wikipedia.org/wiki/Saamelaiskielet). Out of four groups of Saami measured by Karin Mark, Skolt Saami had the lightest pigmentation, followed by Inari Saami, Finnish Northern Saami, and Kola Saami (https://www.etis.ee/Portal/Publications/Display/1fd319c0-7408-4e31-9f18-b9b3010eabad).

Scandinavian Northern Saami might be even more Mongoloid than Finnish Northern Saami, or at least Coon wrote that the Saami of the Scandinavian inland were the darkest and most brachycephalic (https://www.theapricity.com/snpa/chapter-IX2.htm):


The selected "pure" groups, Bryn's Reindeer Lapps, and some of Geyer's mountain and forest Lapps from Sweden, have seventy per cent or over of this dark hair, while the fairest Lapps, with a majority of brown and blond shades, are found in Finland and in the Kola Peninsula.

Pure dark eyes are found among one-third of Reindeer Lapps, and among as few as eight per cent in the total of Lapps from Norway.[14] Pure light and light-mixed eyes are commonest among the Lapps of Finland, where they total between thirty and forty per cent, and least common among the Reindeer Lapps of interior Norway and Sweden. Even among the purest selected sub-groups, such as that of Geyer, who isolated from a larger Swedish Lapp sample a few individuals of most pronounced Lappish type, at least a third are light or light-mixed in iris color. [...]

There are, however, regional differences; the center of extreme round headedness lies among the inland groups in northern Norway, while the Swedish, Finnish, and Kola Peninsula Lapps become progressively narrower headed. The mean for the purest Reindeer Lapps of Norway is 87; for the easternmost Lapps, 80 to 83.

Code for ADMIXTOOLS 2:


library(admixtools)
library(tidyverse)

target="Saami.DG"
left=c("Turkey_Boncuklu_N.SG","Armenia_Caucasus_KuraAraxes","Latvia_HG","Sweden_Motala_HG","Russia_HG_Karelia","Russia_HG_Tyumen","Nganasan")
right=c("Mbuti.DG","Mixe.DG","Ami.DG","Papuan.DG","Chimp.REF","Ju_hoan_North","Biaka.DG","Yoruba.DG","Altai_Neanderthal.DG")

pops=c(left,right,target)

# recompute the f2 statistics from scratch for the selected populations
unlink("/tmp/f2",recursive=T)
extract_f2(pref="g/v44.3_HO_public/v44.3_HO_public",pops=pops,outdir="/tmp/f2")
f2=f2_from_precomp("/tmp/f2")
qp=qpadm(f2,left=left,right=right,target=target)

# keep feasible models, sort by p-value, keep only pat, p and the weight columns
qp2=qp$popdrop%>%dplyr::filter(feasible==T&f4rank!=0)%>%arrange(desc(p))%>%dplyr::select(!c(wt,dof,chisq,f4rank,dofdiff,chisqdiff,p_nested,feasible,best))
write_csv(qp2,"/tmp/qp")

Code to generate the bar chart:


library(tidyverse)
library(reshape2)
library(colorspace)

t=read_csv("/tmp/qp")

# t=t[t$p>.05,]

pvalue=sub("^0","",sprintf("%.3f",t$p))
t=t[-2]
t2=melt(t,id.var="pat")

ggplot(t2,aes(x=fct_rev(factor(pat,levels=t$pat)),y=value,fill=variable))+
geom_bar(stat="identity",width=1,position=position_fill(reverse=T))+
geom_text(aes(label=round(100*value)),position=position_stack(vjust=.5,reverse=T),size=3.5)+
coord_flip()+
theme(
axis.text.x=element_blank(),
axis.text=element_text(color="black"),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
legend.box.just="center",
legend.box.margin=margin(0),
legend.box.spacing=unit(.05,"in"),
legend.direction="vertical",
legend.justification="center",
legend.margin=margin(0),
legend.text=element_text(size=12),
legend.title=element_blank(),
panel.border=element_blank(),
text=element_text(size=16)
)+
xlab("")+
scale_x_discrete(labels=rev(pvalue),expand=c(0,0)) +
scale_y_discrete(expand=c(0,0))+
scale_fill_manual("legend",values=hex(HSV(c(45,45,210,210,120,120,300),c(.6, .6,.6,.6,.6,.6,.6),c(1,.6,1,.6,1,.6,1))))
ggsave("/tmp/a.png",width=7,height=7)

Zoro
03-09-2021, 10:32 AM
I like how you used R to visualize the results and sort by p-value, and that you posted the details of how you ran it (although I can't see your SE). Looking at the details, here's how you can improve the accuracy of your models:

1- Pright pops are used as references to distinguish between the various sources in the model, so they should be pretty diverse in their ancestry. I noticed that Africans are overrepresented in pright. You really only need one group of Africans unless you are trying to model a target using multiple African pops. I would just keep Mbuti and drop the rest of the Africans from pright.

2- Neanderthal and Chimp are pretty useless for differentiating between Eurasians, because all Eurasians are pretty much equally related to them (even with Neanderthal, the difference between various Eurasians is just a couple of percent). Drop them.

3- I would add the Mesolithic Ancient Siberian Kolyma (diploid) to pright, because some of your sources are quite differentially related to it due to its mix of ancient E Asian and Siberian (Yana-type) ancestry.

4- I would also for sure add CHG (the sample labeled Kotias KK1 has the highest number of SNPs in the dataset you are using), because some of your sources are quite differentially related to it.

5- I would also for sure add WHG (the sample labeled Bichon Bichon has the highest number of SNPs in the dataset you are using), because some of your sources are quite differentially related to it.

6- I would also add Tyumen to pright and use Devils-Gate in sources for Neolithic E Asian.

7- I would add Iran-N to pright.

8- I would add Iberomaurusians. These have decent SNPs:
Morocco_Iberomaurusian TAF010
Morocco_Iberomaurusian TAF011
Morocco_Iberomaurusian TAF013
Morocco_Iberomaurusian TAF014

9- I would add Kostenki14 I0876 to pright. It has many SNPs

10- I would add GoyetQ116-1 Q116-1 to pright

11- Drop Nganasan from sources and add Shamanka-EN instead. It's always better to keep the sources consistently ancient. Shamanka would be more ancestral to Uralics than Nganasan.

These are decent ones:

Russia_Shamanka_Eneolithic DA245 3772618 4676043 0.81
Russia_Shamanka_Eneolithic DA246 3680753 4668444 0.79
Russia_Shamanka_Eneolithic DA247 3733439 4676043 0.80
Russia_Shamanka_Eneolithic DA248 3744767 4676043 0.80
Russia_Shamanka_Eneolithic DA249 3590256 4668444 0.77
Russia_Shamanka_Eneolithic DA252 3732230 4668444 0.80
Russia_Shamanka_Eneolithic DA253 3703323 4668444 0.79

12- You really need Anatolia-N in pright also if you're not going to use it as a source.


Your SNP count and p-values will drop, but your models will be significantly more accurate and your SE should be better too.

Don't forget to download the latest Admixtools 2. It has some significant fixes.

Lemminkäinen
03-09-2021, 10:46 AM
I have always used the right group used previously by studies or by Davidski, to have something to compare with. In principle, I understand that the right group should be built from populations sharing ancestry with all the left populations: archaic enough to cover all of them, not markedly dominant for any of them, but not so distant as to be equally distant from all.

P-values lower than 0.05 are significant against the null hypothesis, which in qpAdm means the model is rejected.

Zoro
03-09-2021, 11:33 AM
E Asians have higher genetic similarity with Saamis than with other mainland Europeans. I wouldn't rely on G25 for that, though, because the results can be misleading. In general, with any calculator the amount of E Asian will change depending on what other components the calculator uses.

You have to do a gene-to-gene comparison between E Asians and each European population, one at a time, to get an accurate picture. Here is IBS similarity with a Mongola sample, based on the PLINK --genome flag using 400,000 SNPs.

As you can see, Mongola has about the same amount of IBS with Saamis as with some S Asians, and not that much more than with an Iraqi Kurd or some Finns, which would be a shocker if you just went by calculator results.

NO FID1 FID2 IID2 PI_HAT IBS
1 S_Mongola-1 Korean S_Korean-1 0.157 0.81068
2 S_Mongola-1 Han S_Han-1 0.1538 0.81020
3 S_Mongola-1 Japanese S_Japanese-1 0.1603 0.80999
4 S_Mongola-1 Xibo S_Xibo-2 0.1463 0.80968
5 S_Mongola-1 Korean S_Korean-2 0.1546 0.80955
6 S_Mongola-1 Han S_Han-2 0.1562 0.80910
7 S_Mongola-1 Tujia S_Tujia-2 0.1522 0.80896
8 S_Mongola-1 Japanese S_Japanese-2 0.148 0.80880
9 S_Mongola-1 She S_She-1 0.1542 0.80875
10 S_Mongola-1 She S_She-2 0.1535 0.80870
11 S_Mongola-1 Naxi S_Naxi-1 0.1527 0.80869
12 S_Mongola-1 Japanese S_Japanese-3 0.1426 0.80865
13 S_Mongola-1 Hezhen S_Hezhen-2 0.1438 0.80863
14 S_Mongola-1 Yi S_Yi-1 0.1494 0.80853
15 S_Mongola-1 Xibo S_Xibo-1 0.1408 0.80837
16 S_Mongola-1 Miao S_Miao-2 0.1534 0.80827
17 S_Mongola-1 Kinh S_Kinh-1 0.1488 0.80800
18 S_Mongola-1 Naxi S_Naxi-3 0.1516 0.80795
19 S_Mongola-1 Hezhen S_Hezhen-1 0.1514 0.80782
20 S_Mongola-1 Tujia S_Tujia-1 0.1519 0.80772
21 S_Mongola-1 Mongola S_Mongola-2 0.1456 0.80755
22 S_Mongola-1 Miao S_Miao-1 0.1518 0.80748
23 S_Mongola-1 Ulchi S_Ulchi-1 0.1642 0.80746
24 S_Mongola-1 Oroqen S_Oroqen-1 0.1575 0.80745
25 S_Mongola-1 Yi S_Yi-2 0.1529 0.80724
26 S_Mongola-1 Daur S_Daur-2 0.1422 0.80716
27 S_Mongola-1 Ulchi S_Ulchi-2 0.1566 0.80713
28 S_Mongola-1 Oroqen S_Oroqen-2 0.1588 0.80693
29 S_Mongola-1 Dai S_Dai-1 0.1463 0.80672
30 S_Mongola-1 Even S_Even-3 0.1583 0.80661
31 S_Mongola-1 Dai S_Dai-2 0.1519 0.80603
32 S_Mongola-1 Tu S_Tu-2 0.1387 0.80580
33 S_Mongola-1 Kinh S_Kinh-2 0.1415 0.80574
34 S_Mongola-1 Thai S_Thai-2 0.1401 0.80573
35 S_Mongola-1 China_Lahu S_Lahu-1 0.1524 0.80558
36 S_Mongola-1 Burmese S_Burmese-1 0.1385 0.80540
37 S_Mongola-1 Tu S_Tu-1 0.1354 0.80530
38 S_Mongola-1 Ami.DG S_Ami1 0.1575 0.80503
39 S_Mongola-1 Ami.DG S_Ami2 0.1595 0.80502
40 S_Mongola-1 Even S_Even-2 0.1555 0.80488
41 S_Mongola-1 Yakut S_Yakut-1 0.1485 0.80419
42 S_Mongola-1 China_Lahu S_Lahu-2 0.1523 0.80397
43 S_Mongola-1 Igorot S_Igorot-2 0 0.80313
44 S_Mongola-1 Dusun S_Dusun-2 0 0.80309
45 S_Mongola-1 Dusun S_Dusun-1 0 0.80308
46 S_Mongola-1 Thai S_Thai-1 0.1275 0.80306
47 S_Mongola-1 Igorot S_Igorot-1 0 0.80301
48 S_Mongola-1 Cambodian S_Cambodian-1 0.1407 0.80241
49 S_Mongola-1 Even S_Even-1 0.1214 0.80214
50 S_Mongola-1 Burmese S_Burmese-2 0.1169 0.80213
51 S_Mongola-1 Yakut S_Yakut-2 0.1438 0.80209
52 S_Mongola-1 Cambodian S_Cambodian-2 0.134 0.80188
53 S_Mongola-1 Eskimo_Sireniki.DG S_Sireniki1 0 0.80124
54 S_Mongola-1 Kyrgyz_Kyrgyzstan S_Kyrgyz-1 0.1127 0.79908
55 S_Mongola-1 Kyrgyz_Kyrgyzstan S_Kyrgyz-2 0.1005 0.79815
56 S_Mongola-1 Itelmen S_Itelman-1 0 0.79809
57 S_Mongola-1 Eskimo_Naukan.DG S_Naukan2 0 0.79789
58 S_Mongola-1 Eskimo_Chaplin.DG S_Chaplin1 0 0.79770
59 S_Mongola-1 Eskimo_Naukan.DG S_Naukan1 0 0.79751
60 S_Mongola-1 Eskimo_Sireniki.DG S_Sireniki2 0 0.79749
61 S_Mongola-1 Kusunda S_Kusunda-1 0.1132 0.79740
62 S_Mongola-1 Tubalar S_Tubalar-2 0 0.79509
63 S_Mongola-1 Tubalar S_Tubalar-1 0.1107 0.79490
64 S_Mongola-1 Chukchi S_Chukchi-1 0.0841 0.79357
65 S_Mongola-1 Uyghur S_Uygur-1 0.0898 0.79336
66 S_Mongola-1 Mexico_Zapotec.DG S_Zapotec1 0 0.79282
67 S_Mongola-1 Mansi S_Mansi-1 0 0.79238
68 S_Mongola-1 Hazara S_Hazara-1 0 0.79204
69 S_Mongola-1 Pima S_Pima-1 0 0.79198
70 S_Mongola-1 Uyghur S_Uygur-2 0 0.79197
71 S_Mongola-1 Hazara S_Hazara-2 0 0.79170
72 S_Mongola-1 Mayan S_Mayan-2 0 0.79120
73 S_Mongola-1 Mixtec S_Mixtec-1 0 0.79120
74 S_Mongola-1 Mixe S_Mixe-2 0 0.79115
75 S_Mongola-1 Mexico_Zapotec.DG S_Zapotec2 0 0.79101
76 S_Mongola-1 Mayan S_Mayan-1 0 0.79087
77 S_Mongola-1 Quechua S_Quechua-3 0 0.79075
78 S_Mongola-1 Mixe S_Mixe-3 0 0.79044
79 S_Mongola-1 Piapoco S_Piapoco-2 0 0.79029
80 S_Mongola-1 Quechua S_Quechua-1 0 0.79023
81 S_Mongola-1 Quechua S_Quechua-2 0 0.78995
82 S_Mongola-1 Pima S_Pima-2 0 0.78978
83 S_Mongola-1 Mansi S_Mansi-2 0 0.78962
84 S_Mongola-1 Khonda_Dora S_Khonda_Dora-1 0 0.78847
85 S_Mongola-1 Tlingit S_Tlingit-2 0 0.78816
86 S_Mongola-1 Mixtec S_Mixtec-2 0 0.78811
87 S_Mongola-1 Maori S_Maori-1 0.0542 0.78805
88 S_Mongola-1 Piapoco S_Piapoco-1 0 0.78747
89 S_Mongola-1 Karitiana S_Karitiana-2 0 0.78742
90 S_Mongola-1 Surui S_Surui-1 0 0.78727
91 S_Mongola-1 Surui S_Surui-2 0 0.78565
92 S_Mongola-1 Karitiana S_Karitiana-1 0 0.78561
93 S_Mongola-1 Bengali S_Bengali-1 0 0.78436
94 S_Mongola-1 Kusunda S_Kusunda-2 0 0.78408
95 S_Mongola-1 Tlingit S_Tlingit-1 0 0.78388
96 S_Mongola-1 Relli S_Relli-1 0 0.78344
97 S_Mongola-1 Kapu S_Kapu-2 0 0.78280
98 S_Mongola-1 Madiga S_Madiga-1 0 0.78227
99 S_Mongola-1 Madiga S_Madiga-2 0 0.78175
100 S_Mongola-1 Mala S_Mala-3 0 0.78161
101 S_Mongola-1 Yadava S_Yadava-1 0 0.78157
102 S_Mongola-1 Bengali S_Bengali-2 0 0.78140
103 S_Mongola-1 Kapu S_Kapu-1 0 0.78130
104 S_Mongola-1 Irula S_Irula-2 0 0.78128
105 S_Mongola-1 Mala S_Mala-2 0 0.78128
106 S_Mongola-1 Punjabi S_Punjabi-1 0 0.78107
107 S_Mongola-1 Irula S_Irula-1 0 0.78107
108 S_Mongola-1 Burusho S_Burusho-2 0 0.78081
109 S_Mongola-1 Yadava S_Yadava-2 0 0.78078
110 S_Mongola-1 Saami S_Saami-1 0 0.78063
111 S_Mongola-1 Brahmin S_Brahmin-2 0 0.78031
112 S_Mongola-1 Saami S_Saami-2 0 0.78012
113 S_Mongola-1 Relli S_Relli-2 0 0.77974
114 S_Mongola-1 Punjabi S_Punjabi-3 0 0.77920
115 S_Mongola-1 Bougainville S_Bougainville-1 0 0.77900
116 S_Mongola-1 Burusho S_Burusho-1 0 0.77885
117 S_Mongola-1 Punjabi S_Punjabi-2 0 0.77885
118 S_Mongola-1 Brahmin S_Brahmin-1 0 0.77874
119 S_Mongola-1 Bougainville S_Bougainville-2 0 0.77866
120 S_Mongola-1 Sindhi S_Sindhi-2 0 0.77851
121 S_Mongola-1 Pathan S_Pathan-1 0 0.77838
122 S_Mongola-1 Punjabi S_Punjabi-4 0 0.77776
123 S_Mongola-1 Kurd-Iraq WGS 0 0.77625
124 S_Mongola-1 Pathan S_Pathan-2 0 0.77597
125 S_Mongola-1 Ossetian-North S_Ossetian-1 0 0.77575
126 S_Mongola-1 Russian S_Russian-1 0 0.77570
127 S_Mongola-1 Finnish S_Finnish-1 0 0.77476
128 S_Mongola-1 Sindhi S_Sindhi-1 0 0.77473
129 S_Mongola-1 Turkish-Kayseri S_Turkish-Kayseri-1 0 0.77463
130 S_Mongola-1 Tajik S_Tajik-2 0 0.77448
131 S_Mongola-1 YANA_UP_WGS Yana1 0 0.77422
132 S_Mongola-1 Ossetian-North S_Ossetian-2 0 0.77413
133 S_Mongola-1 Papuan S_Papuan-10 0 0.77381
134 S_Mongola-1 Balochi S_Balochi-2 0 0.77365
135 S_Mongola-1 Brahui S_Brahui-1 0 0.77363
136 S_Mongola-1 Adygei S_Adygei-1 0 0.77334
137 S_Mongola-1 Makrani S_Makrani-1 0 0.77334
138 S_Mongola-1 Finnish S_Finnish-3 0 0.77319
139 S_Mongola-1 Adygei S_Adygei-2 0 0.77319
140 S_Mongola-1 Kalash S_Kalash-2 0 0.77319
141 S_Mongola-1 Turkish-Kayseri S_Turkish-Kayseri-2 0 0.77319
142 S_Mongola-1 Chechen S_Chechen-1 0 0.77312
143 S_Mongola-1 Papuan S_Papuan-9 0 0.77307
144 S_Mongola-1 Russian S_Russian-2 0 0.77288
145 S_Mongola-1 Icelandic S_Icelandic-1 0 0.77260
146 S_Mongola-1 Finnish S_Finnish-2 0 0.77258
147 S_Mongola-1 Papuan S_Papuan-12 0 0.77257
148 S_Mongola-1 Kalash S_Kalash-1 0 0.77247
149 S_Mongola-1 Lezgin S_Lezgin-1 0 0.77245
150 S_Mongola-1 Papuan S_Papuan-8 0 0.77232
151 S_Mongola-1 Russia_Abkhasian S_Abkhasian-1 0 0.77197
152 S_Mongola-1 Iranian-Fars S_Iranian-Fars-1 0 0.77194
153 S_Mongola-1 Brahui S_Brahui-2 0 0.77178
154 S_Mongola-1 Russia_Abkhasian S_Abkhasian-2 0 0.77170
155 S_Mongola-1 Papuan S_Papuan-1 0 0.77164
156 S_Mongola-1 Norwegian S_Norwegian-1 0 0.77159
157 S_Mongola-1 Orcadian S_Orcadian-2 0 0.77158
158 S_Mongola-1 Estonian S_Estonian-1 0 0.77155
159 S_Mongola-1 Papuan S_Papuan-7 0 0.77150
160 S_Mongola-1 Papuan S_Papuan-11 0 0.77146
161 S_Mongola-1 Estonian S_Estonian-2 0 0.77144
162 S_Mongola-1 Papuan S_Papuan-13 0 0.77131
163 S_Mongola-1 Tajik S_Tajik-1 0 0.77131
164 S_Mongola-1 Papuan S_Papuan-14 0 0.77129
165 S_Mongola-1 Hungarian S_Hungarian-2 0 0.77120
166 S_Mongola-1 Czech S_Czech-2 0 0.77120
167 S_Mongola-1 Papuan S_Papuan-3 0 0.77119
168 S_Mongola-1 Icelandic S_Icelandic-2 0 0.77119
169 S_Mongola-1 Hungarian S_Hungarian-1 0 0.77111
170 S_Mongola-1 Polish S_Polish-1 0 0.77110
171 S_Mongola-1 Bulgarian S_Bulgarian-1 0 0.77106
172 S_Mongola-1 Greek S_Greek-1 0 0.77103
173 S_Mongola-1 Iranian-Fars S_Iranian-Fars-2 0 0.77103
174 S_Mongola-1 Papuan S_Papuan-5 0 0.77101
175 S_Mongola-1 French S_French-2 0 0.77082
176 S_Mongola-1 Georgian S_Georgian-1 0 0.77071
177 S_Mongola-1 Balochi S_Balochi-1 0 0.77062
178 S_Mongola-1 Spanish S_Spanish-1 0 0.77061
179 S_Mongola-1 Armenian S_Armenian-1 0 0.77054
180 S_Mongola-1 Papuan S_Papuan-6 0 0.77049
181 S_Mongola-1 Bergamo S_Bergamo-2 0 0.77017
182 S_Mongola-1 Papuan S_Papuan-2 0 0.77008
183 S_Mongola-1 Bulgarian S_Bulgarian-2 0 0.77007
184 S_Mongola-1 Papuan S_Papuan-4 0 0.77005
185 S_Mongola-1 Spanish S_Spanish-2 0 0.76981
186 S_Mongola-1 Greek S_Greek-2 0 0.76981
187 S_Mongola-1 Basque S_Basque-1 0 0.76979
188 S_Mongola-1 English S_English-1 0 0.76977
189 S_Mongola-1 Lezgin S_Lezgin-2 0 0.76975
190 S_Mongola-1 Tuscan S_Tuscan-2 0 0.76960
191 S_Mongola-1 Albanian.DG S_Albanian1 0 0.76953
192 S_Mongola-1 English S_English-2 0 0.76951
193 S_Mongola-1 Armenian S_Armenian-2 0 0.76950
194 S_Mongola-1 Sardinian S_Sardinian-2 0 0.76946
195 S_Mongola-1 Orcadian S_Orcadian-1 0 0.76909
196 S_Mongola-1 Tuscan S_Tuscan-1 0 0.76906
197 S_Mongola-1 Jew_Iraqi S_Iraqi_Jew-1 0 0.76901
198 S_Mongola-1 Basque S_Basque-2 0 0.76888
199 S_Mongola-1 Georgian S_Georgian-2 0 0.76886
200 S_Mongola-1 Jew_Iraqi S_Iraqi_Jew-2 0 0.76865
201 S_Mongola-1 Jordanian S_Jordanian-3 0 0.76809
202 S_Mongola-1 French S_French-1 0 0.76796
203 S_Mongola-1 BedouinB S_BedouinB-2 0 0.76779
204 S_Mongola-1 Druze S_Druze-1 0 0.76757
205 S_Mongola-1 Druze S_Druze-2 0 0.76754
206 S_Mongola-1 Makrani S_Makrani-2 0 0.76747
207 S_Mongola-1 Jew_Yemenite S_Yemenite_Jew-2 0 0.76622
208 S_Mongola-1 Jew_Yemenite S_Yemenite_Jew-1 0 0.76575
209 S_Mongola-1 Sardinian S_Sardinian-1 0 0.76564
210 S_Mongola-1 BedouinB S_BedouinB-1 0 0.76460
211 S_Mongola-1 Jordanian S_Jordanian-2 0 0.76413
212 S_Mongola-1 Samaritan S_Samaritan-1 0 0.76396
213 S_Mongola-1 Jordanian S_Jordanian-1 0 0.76261
214 S_Mongola-1 Saharawi S_Saharawi-2 0 0.75981
215 S_Mongola-1 Saharawi S_Saharawi-1 0 0.75964
216 S_Mongola-1 Mozabite S_Mozabite-1 0 0.75937
217 S_Mongola-1 Mozabite S_Mozabite-2 0 0.75824
222 S_Mongola-1 Somali S_Somali-1 0 0.74788
224 S_Mongola-1 Masai S_Masai-2 0 0.74381
226 S_Mongola-1 Masai S_Masai-1 0 0.74274
232 S_Mongola-1 Gambian S_Gambian-2 0 0.73200
233 S_Mongola-1 BantuKenya S_BantuKenya-1 0 0.73139
234 S_Mongola-1 Luo S_Luo-2 0 0.73107
235 S_Mongola-1 BantuKenya S_BantuKenya-2 0 0.73020
236 S_Mongola-1 Luhya S_Luhya-1 0 0.73005
237 S_Mongola-1 Luhya S_Luhya-2 0 0.73002
238 S_Mongola-1 Mandenka S_Mandenka-2 0 0.72934
239 S_Mongola-1 Gambian S_Gambian-1 0 0.72933
240 S_Mongola-1 Esan S_Esan-2 0 0.72920
241 S_Mongola-1 Yoruba S_Yoruba-2 0 0.72879
242 S_Mongola-1 Mandenka S_Mandenka-1 0 0.72872
243 S_Mongola-1 Yoruba S_Yoruba-1 0 0.72816
244 S_Mongola-1 Esan S_Esan-1 0 0.72810
245 S_Mongola-1 Mende S_Mende-1 0 0.72793
246 S_Mongola-1 Mende S_Mende-2 0 0.72788
247 S_Mongola-1 Biaka S_Biaka-1 0 0.72484
248 S_Mongola-1 Biaka S_Biaka-2 0 0.72347
249 S_Mongola-1 Mbuti S_Mbuti-3 0 0.72046
250 S_Mongola-1 Mbuti S_Mbuti-1 0 0.72010
251 S_Mongola-1 Mbuti S_Mbuti-2 0 0.72005
252 S_Mongola-1 Khomani_San S_Khomani_San-2 0 0.71521
253 S_Mongola-1 Ju_hoan_North S_Ju_hoan_North-2 0 0.71514
254 S_Mongola-1 Ju_hoan_North S_Ju_hoan_North-3 0 0.71460
255 S_Mongola-1 Khomani_San S_Khomani_San-1 0 0.71302
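For anyone wondering what the IBS number measures: assuming the IBS column in the table is the DST field of PLINK's .genome output, it is the average proportion of alleles shared identical-by-state, (IBS2 + 0.5*IBS1) divided by the number of SNPs compared. A rough Python sketch of that calculation on toy genotypes follows. The function name and the example vectors are made up for illustration, and genotypes are coded as 0/1/2 copies of one allele with None for missing calls.

```python
def ibs_similarity(geno_a, geno_b):
    """Mean proportion of alleles shared IBS between two individuals.

    Genotypes are 0, 1 or 2 copies of a reference allele; None marks a
    missing call, and SNPs missing in either individual are skipped.
    """
    shared = 0
    compared = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this SNP
        compared += 1
    if compared == 0:
        raise ValueError("no overlapping non-missing SNPs")
    return shared / (2 * compared)


# Toy genotypes for two individuals at six SNPs (hypothetical values).
sample_a = [0, 1, 2, 2, None, 1]
sample_b = [0, 2, 0, 2, 1, 1]

print(ibs_similarity(sample_a, sample_b))
```

Because any two humans share most alleles at common SNPs, the values sit in a narrow band (roughly 0.71 to 0.81 in the table), so small differences between rows carry the signal.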

Zoro
03-09-2021, 01:41 PM
It's pretty amazing that the above IBS list was able to properly order Mbuti, Khomani, and Ju-Hoan in terms of IBS with Mongola.

Does anyone know why Mongola is slightly closer to Mbuti than to Khomani and Ju-Hoan?

Hint: The late paleolithic African paper :)

Zanzibar
03-09-2021, 02:00 PM
E Asians have higher genetic similarity with Saamis than with other mainland Europeans. I wouldn't rely on G25 for that, though, because the results can be misleading. In general, with any calculator the amount of E Asian will change depending on what other components the calculator uses.

You have to do a gene-to-gene comparison between E Asians and each European population, one at a time, to get an accurate picture. Here is IBS similarity with a Mongola sample, based on the PLINK --genome flag using 400,000 SNPs.

As you can see, Mongola has about the same amount of IBS with Saamis as with some S Asians, and not that much more than with an Iraqi Kurd or some Finns, which would be a shocker if you just went by calculator results.

NO FID1 FID2 IID2 PI_HAT IBS
1 S_Mongola-1 Korean S_Korean-1 0.157 0.81068
2 S_Mongola-1 Han S_Han-1 0.1538 0.81020
3 S_Mongola-1 Japanese S_Japanese-1 0.1603 0.80999
4 S_Mongola-1 Xibo S_Xibo-2 0.1463 0.80968
5 S_Mongola-1 Korean S_Korean-2 0.1546 0.80955
6 S_Mongola-1 Han S_Han-2 0.1562 0.80910
7 S_Mongola-1 Tujia S_Tujia-2 0.1522 0.80896
8 S_Mongola-1 Japanese S_Japanese-2 0.148 0.80880
9 S_Mongola-1 She S_She-1 0.1542 0.80875
10 S_Mongola-1 She S_She-2 0.1535 0.80870
11 S_Mongola-1 Naxi S_Naxi-1 0.1527 0.80869
12 S_Mongola-1 Japanese S_Japanese-3 0.1426 0.80865
13 S_Mongola-1 Hezhen S_Hezhen-2 0.1438 0.80863
14 S_Mongola-1 Yi S_Yi-1 0.1494 0.80853
15 S_Mongola-1 Xibo S_Xibo-1 0.1408 0.80837
16 S_Mongola-1 Miao S_Miao-2 0.1534 0.80827
17 S_Mongola-1 Kinh S_Kinh-1 0.1488 0.80800
18 S_Mongola-1 Naxi S_Naxi-3 0.1516 0.80795
19 S_Mongola-1 Hezhen S_Hezhen-1 0.1514 0.80782
20 S_Mongola-1 Tujia S_Tujia-1 0.1519 0.80772
21 S_Mongola-1 Mongola S_Mongola-2 0.1456 0.80755
22 S_Mongola-1 Miao S_Miao-1 0.1518 0.80748
23 S_Mongola-1 Ulchi S_Ulchi-1 0.1642 0.80746
24 S_Mongola-1 Oroqen S_Oroqen-1 0.1575 0.80745
25 S_Mongola-1 Yi S_Yi-2 0.1529 0.80724
26 S_Mongola-1 Daur S_Daur-2 0.1422 0.80716
27 S_Mongola-1 Ulchi S_Ulchi-2 0.1566 0.80713
28 S_Mongola-1 Oroqen S_Oroqen-2 0.1588 0.80693
29 S_Mongola-1 Dai S_Dai-1 0.1463 0.80672
30 S_Mongola-1 Even S_Even-3 0.1583 0.80661
31 S_Mongola-1 Dai S_Dai-2 0.1519 0.80603
32 S_Mongola-1 Tu S_Tu-2 0.1387 0.80580
33 S_Mongola-1 Kinh S_Kinh-2 0.1415 0.80574
34 S_Mongola-1 Thai S_Thai-2 0.1401 0.80573
35 S_Mongola-1 China_Lahu S_Lahu-1 0.1524 0.80558
36 S_Mongola-1 Burmese S_Burmese-1 0.1385 0.80540
37 S_Mongola-1 Tu S_Tu-1 0.1354 0.80530
38 S_Mongola-1 Ami.DG S_Ami1 0.1575 0.80503
39 S_Mongola-1 Ami.DG S_Ami2 0.1595 0.80502
40 S_Mongola-1 Even S_Even-2 0.1555 0.80488
41 S_Mongola-1 Yakut S_Yakut-1 0.1485 0.80419
42 S_Mongola-1 China_Lahu S_Lahu-2 0.1523 0.80397
43 S_Mongola-1 Igorot S_Igorot-2 0 0.80313
44 S_Mongola-1 Dusun S_Dusun-2 0 0.80309
45 S_Mongola-1 Dusun S_Dusun-1 0 0.80308
46 S_Mongola-1 Thai S_Thai-1 0.1275 0.80306
47 S_Mongola-1 Igorot S_Igorot-1 0 0.80301
48 S_Mongola-1 Cambodian S_Cambodian-1 0.1407 0.80241
49 S_Mongola-1 Even S_Even-1 0.1214 0.80214
50 S_Mongola-1 Burmese S_Burmese-2 0.1169 0.80213
51 S_Mongola-1 Yakut S_Yakut-2 0.1438 0.80209
52 S_Mongola-1 Cambodian S_Cambodian-2 0.134 0.80188
53 S_Mongola-1 Eskimo_Sireniki.DG S_Sireniki1 0 0.80124
54 S_Mongola-1 Kyrgyz_Kyrgyzstan S_Kyrgyz-1 0.1127 0.79908
55 S_Mongola-1 Kyrgyz_Kyrgyzstan S_Kyrgyz-2 0.1005 0.79815
56 S_Mongola-1 Itelmen S_Itelman-1 0 0.79809
57 S_Mongola-1 Eskimo_Naukan.DG S_Naukan2 0 0.79789
58 S_Mongola-1 Eskimo_Chaplin.DG S_Chaplin1 0 0.79770
59 S_Mongola-1 Eskimo_Naukan.DG S_Naukan1 0 0.79751
60 S_Mongola-1 Eskimo_Sireniki.DG S_Sireniki2 0 0.79749
61 S_Mongola-1 Kusunda S_Kusunda-1 0.1132 0.79740
62 S_Mongola-1 Tubalar S_Tubalar-2 0 0.79509
63 S_Mongola-1 Tubalar S_Tubalar-1 0.1107 0.79490
64 S_Mongola-1 Chukchi S_Chukchi-1 0.0841 0.79357
65 S_Mongola-1 Uyghur S_Uygur-1 0.0898 0.79336
66 S_Mongola-1 Mexico_Zapotec.DG S_Zapotec1 0 0.79282
67 S_Mongola-1 Mansi S_Mansi-1 0 0.79238
68 S_Mongola-1 Hazara S_Hazara-1 0 0.79204
69 S_Mongola-1 Pima S_Pima-1 0 0.79198
70 S_Mongola-1 Uyghur S_Uygur-2 0 0.79197
71 S_Mongola-1 Hazara S_Hazara-2 0 0.79170
72 S_Mongola-1 Mayan S_Mayan-2 0 0.79120
73 S_Mongola-1 Mixtec S_Mixtec-1 0 0.79120
74 S_Mongola-1 Mixe S_Mixe-2 0 0.79115
75 S_Mongola-1 Mexico_Zapotec.DG S_Zapotec2 0 0.79101
76 S_Mongola-1 Mayan S_Mayan-1 0 0.79087
77 S_Mongola-1 Quechua S_Quechua-3 0 0.79075
78 S_Mongola-1 Mixe S_Mixe-3 0 0.79044
79 S_Mongola-1 Piapoco S_Piapoco-2 0 0.79029
80 S_Mongola-1 Quechua S_Quechua-1 0 0.79023
81 S_Mongola-1 Quechua S_Quechua-2 0 0.78995
82 S_Mongola-1 Pima S_Pima-2 0 0.78978
83 S_Mongola-1 Mansi S_Mansi-2 0 0.78962
84 S_Mongola-1 Khonda_Dora S_Khonda_Dora-1 0 0.78847
85 S_Mongola-1 Tlingit S_Tlingit-2 0 0.78816
86 S_Mongola-1 Mixtec S_Mixtec-2 0 0.78811
87 S_Mongola-1 Maori S_Maori-1 0.0542 0.78805
88 S_Mongola-1 Piapoco S_Piapoco-1 0 0.78747
89 S_Mongola-1 Karitiana S_Karitiana-2 0 0.78742
90 S_Mongola-1 Surui S_Surui-1 0 0.78727
91 S_Mongola-1 Surui S_Surui-2 0 0.78565
92 S_Mongola-1 Karitiana S_Karitiana-1 0 0.78561
93 S_Mongola-1 Bengali S_Bengali-1 0 0.78436
94 S_Mongola-1 Kusunda S_Kusunda-2 0 0.78408
95 S_Mongola-1 Tlingit S_Tlingit-1 0 0.78388
96 S_Mongola-1 Relli S_Relli-1 0 0.78344
97 S_Mongola-1 Kapu S_Kapu-2 0 0.78280
98 S_Mongola-1 Madiga S_Madiga-1 0 0.78227
99 S_Mongola-1 Madiga S_Madiga-2 0 0.78175
100 S_Mongola-1 Mala S_Mala-3 0 0.78161
101 S_Mongola-1 Yadava S_Yadava-1 0 0.78157
102 S_Mongola-1 Bengali S_Bengali-2 0 0.78140
103 S_Mongola-1 Kapu S_Kapu-1 0 0.78130
104 S_Mongola-1 Irula S_Irula-2 0 0.78128
105 S_Mongola-1 Mala S_Mala-2 0 0.78128
106 S_Mongola-1 Punjabi S_Punjabi-1 0 0.78107
107 S_Mongola-1 Irula S_Irula-1 0 0.78107
108 S_Mongola-1 Burusho S_Burusho-2 0 0.78081
109 S_Mongola-1 Yadava S_Yadava-2 0 0.78078
110 S_Mongola-1 Saami S_Saami-1 0 0.78063
111 S_Mongola-1 Brahmin S_Brahmin-2 0 0.78031
112 S_Mongola-1 Saami S_Saami-2 0 0.78012
113 S_Mongola-1 Relli S_Relli-2 0 0.77974
114 S_Mongola-1 Punjabi S_Punjabi-3 0 0.77920
115 S_Mongola-1 Bougainville S_Bougainville-1 0 0.77900
116 S_Mongola-1 Burusho S_Burusho-1 0 0.77885
117 S_Mongola-1 Punjabi S_Punjabi-2 0 0.77885
118 S_Mongola-1 Brahmin S_Brahmin-1 0 0.77874
119 S_Mongola-1 Bougainville S_Bougainville-2 0 0.77866
120 S_Mongola-1 Sindhi S_Sindhi-2 0 0.77851
121 S_Mongola-1 Pathan S_Pathan-1 0 0.77838
122 S_Mongola-1 Punjabi S_Punjabi-4 0 0.77776
123 S_Mongola-1 Kurd-Iraq WGS 0 0.77625
124 S_Mongola-1 Pathan S_Pathan-2 0 0.77597
125 S_Mongola-1 Ossetian-North S_Ossetian-1 0 0.77575
126 S_Mongola-1 Russian S_Russian-1 0 0.77570
127 S_Mongola-1 Finnish S_Finnish-1 0 0.77476
128 S_Mongola-1 Sindhi S_Sindhi-1 0 0.77473
129 S_Mongola-1 Turkish-Kayseri S_Turkish-Kayseri-1 0 0.77463
130 S_Mongola-1 Tajik S_Tajik-2 0 0.77448
131 S_Mongola-1 YANA_UP_WGS Yana1 0 0.77422
132 S_Mongola-1 Ossetian-North S_Ossetian-2 0 0.77413
133 S_Mongola-1 Papuan S_Papuan-10 0 0.77381
134 S_Mongola-1 Balochi S_Balochi-2 0 0.77365
135 S_Mongola-1 Brahui S_Brahui-1 0 0.77363
136 S_Mongola-1 Adygei S_Adygei-1 0 0.77334
137 S_Mongola-1 Makrani S_Makrani-1 0 0.77334
138 S_Mongola-1 Finnish S_Finnish-3 0 0.77319
139 S_Mongola-1 Adygei S_Adygei-2 0 0.77319
140 S_Mongola-1 Kalash S_Kalash-2 0 0.77319
141 S_Mongola-1 Turkish-Kayseri S_Turkish-Kayseri-2 0 0.77319
142 S_Mongola-1 Chechen S_Chechen-1 0 0.77312
143 S_Mongola-1 Papuan S_Papuan-9 0 0.77307
144 S_Mongola-1 Russian S_Russian-2 0 0.77288
145 S_Mongola-1 Icelandic S_Icelandic-1 0 0.77260
146 S_Mongola-1 Finnish S_Finnish-2 0 0.77258
147 S_Mongola-1 Papuan S_Papuan-12 0 0.77257
148 S_Mongola-1 Kalash S_Kalash-1 0 0.77247
149 S_Mongola-1 Lezgin S_Lezgin-1 0 0.77245
150 S_Mongola-1 Papuan S_Papuan-8 0 0.77232
151 S_Mongola-1 Russia_Abkhasian S_Abkhasian-1 0 0.77197
152 S_Mongola-1 Iranian-Fars S_Iranian-Fars-1 0 0.77194
153 S_Mongola-1 Brahui S_Brahui-2 0 0.77178
154 S_Mongola-1 Russia_Abkhasian S_Abkhasian-2 0 0.77170
155 S_Mongola-1 Papuan S_Papuan-1 0 0.77164
156 S_Mongola-1 Norwegian S_Norwegian-1 0 0.77159
157 S_Mongola-1 Orcadian S_Orcadian-2 0 0.77158
158 S_Mongola-1 Estonian S_Estonian-1 0 0.77155
159 S_Mongola-1 Papuan S_Papuan-7 0 0.77150
160 S_Mongola-1 Papuan S_Papuan-11 0 0.77146
161 S_Mongola-1 Estonian S_Estonian-2 0 0.77144
162 S_Mongola-1 Papuan S_Papuan-13 0 0.77131
163 S_Mongola-1 Tajik S_Tajik-1 0 0.77131
164 S_Mongola-1 Papuan S_Papuan-14 0 0.77129
165 S_Mongola-1 Hungarian S_Hungarian-2 0 0.77120
166 S_Mongola-1 Czech S_Czech-2 0 0.77120
167 S_Mongola-1 Papuan S_Papuan-3 0 0.77119
168 S_Mongola-1 Icelandic S_Icelandic-2 0 0.77119
169 S_Mongola-1 Hungarian S_Hungarian-1 0 0.77111
170 S_Mongola-1 Polish S_Polish-1 0 0.77110
171 S_Mongola-1 Bulgarian S_Bulgarian-1 0 0.77106
172 S_Mongola-1 Greek S_Greek-1 0 0.77103
173 S_Mongola-1 Iranian-Fars S_Iranian-Fars-2 0 0.77103
174 S_Mongola-1 Papuan S_Papuan-5 0 0.77101
175 S_Mongola-1 French S_French-2 0 0.77082
176 S_Mongola-1 Georgian S_Georgian-1 0 0.77071
177 S_Mongola-1 Balochi S_Balochi-1 0 0.77062
178 S_Mongola-1 Spanish S_Spanish-1 0 0.77061
179 S_Mongola-1 Armenian S_Armenian-1 0 0.77054
180 S_Mongola-1 Papuan S_Papuan-6 0 0.77049
181 S_Mongola-1 Bergamo S_Bergamo-2 0 0.77017
182 S_Mongola-1 Papuan S_Papuan-2 0 0.77008
183 S_Mongola-1 Bulgarian S_Bulgarian-2 0 0.77007
184 S_Mongola-1 Papuan S_Papuan-4 0 0.77005
185 S_Mongola-1 Spanish S_Spanish-2 0 0.76981
186 S_Mongola-1 Greek S_Greek-2 0 0.76981
187 S_Mongola-1 Basque S_Basque-1 0 0.76979
188 S_Mongola-1 English S_English-1 0 0.76977
189 S_Mongola-1 Lezgin S_Lezgin-2 0 0.76975
190 S_Mongola-1 Tuscan S_Tuscan-2 0 0.76960
191 S_Mongola-1 Albanian.DG S_Albanian1 0 0.76953
192 S_Mongola-1 English S_English-2 0 0.76951
193 S_Mongola-1 Armenian S_Armenian-2 0 0.76950
194 S_Mongola-1 Sardinian S_Sardinian-2 0 0.76946
195 S_Mongola-1 Orcadian S_Orcadian-1 0 0.76909
196 S_Mongola-1 Tuscan S_Tuscan-1 0 0.76906
197 S_Mongola-1 Jew_Iraqi S_Iraqi_Jew-1 0 0.76901
198 S_Mongola-1 Basque S_Basque-2 0 0.76888
199 S_Mongola-1 Georgian S_Georgian-2 0 0.76886
200 S_Mongola-1 Jew_Iraqi S_Iraqi_Jew-2 0 0.76865
201 S_Mongola-1 Jordanian S_Jordanian-3 0 0.76809
202 S_Mongola-1 French S_French-1 0 0.76796
203 S_Mongola-1 BedouinB S_BedouinB-2 0 0.76779
204 S_Mongola-1 Druze S_Druze-1 0 0.76757
205 S_Mongola-1 Druze S_Druze-2 0 0.76754
206 S_Mongola-1 Makrani S_Makrani-2 0 0.76747
207 S_Mongola-1 Jew_Yemenite S_Yemenite_Jew-2 0 0.76622
208 S_Mongola-1 Jew_Yemenite S_Yemenite_Jew-1 0 0.76575
209 S_Mongola-1 Sardinian S_Sardinian-1 0 0.76564
210 S_Mongola-1 BedouinB S_BedouinB-1 0 0.76460
211 S_Mongola-1 Jordanian S_Jordanian-2 0 0.76413
212 S_Mongola-1 Samaritan S_Samaritan-1 0 0.76396
213 S_Mongola-1 Jordanian S_Jordanian-1 0 0.76261
214 S_Mongola-1 Saharawi S_Saharawi-2 0 0.75981
215 S_Mongola-1 Saharawi S_Saharawi-1 0 0.75964
216 S_Mongola-1 Mozabite S_Mozabite-1 0 0.75937
217 S_Mongola-1 Mozabite S_Mozabite-2 0 0.75824
222 S_Mongola-1 Somali S_Somali-1 0 0.74788
224 S_Mongola-1 Masai S_Masai-2 0 0.74381
226 S_Mongola-1 Masai S_Masai-1 0 0.74274
232 S_Mongola-1 Gambian S_Gambian-2 0 0.73200
233 S_Mongola-1 BantuKenya S_BantuKenya-1 0 0.73139
234 S_Mongola-1 Luo S_Luo-2 0 0.73107
235 S_Mongola-1 BantuKenya S_BantuKenya-2 0 0.73020
236 S_Mongola-1 Luhya S_Luhya-1 0 0.73005
237 S_Mongola-1 Luhya S_Luhya-2 0 0.73002
238 S_Mongola-1 Mandenka S_Mandenka-2 0 0.72934
239 S_Mongola-1 Gambian S_Gambian-1 0 0.72933
240 S_Mongola-1 Esan S_Esan-2 0 0.72920
241 S_Mongola-1 Yoruba S_Yoruba-2 0 0.72879
242 S_Mongola-1 Mandenka S_Mandenka-1 0 0.72872
243 S_Mongola-1 Yoruba S_Yoruba-1 0 0.72816
244 S_Mongola-1 Esan S_Esan-1 0 0.72810
245 S_Mongola-1 Mende S_Mende-1 0 0.72793
246 S_Mongola-1 Mende S_Mende-2 0 0.72788
247 S_Mongola-1 Biaka S_Biaka-1 0 0.72484
248 S_Mongola-1 Biaka S_Biaka-2 0 0.72347
249 S_Mongola-1 Mbuti S_Mbuti-3 0 0.72046
250 S_Mongola-1 Mbuti S_Mbuti-1 0 0.72010
251 S_Mongola-1 Mbuti S_Mbuti-2 0 0.72005
252 S_Mongola-1 Khomani_San S_Khomani_San-2 0 0.71521
253 S_Mongola-1 Ju_hoan_North S_Ju_hoan_North-2 0 0.71514
254 S_Mongola-1 Ju_hoan_North S_Ju_hoan_North-3 0 0.71460
255 S_Mongola-1 Khomani_San S_Khomani_San-1 0 0.71302
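
For anyone wondering what the IBS column measures, here is a minimal Python sketch. It assumes the column corresponds to the DST statistic that PLINK's --genome reports, i.e. (IBS2 + 0.5*IBS1) divided by the number of SNPs called in both samples. The function name and the toy genotypes are hypothetical and are not taken from the dataset above.

```python
def ibs_similarity(g1, g2):
    """Proportion of alleles shared identical-by-state between two samples.

    g1, g2: sequences of genotypes coded as 0, 1 or 2 copies of one allele,
    with None marking a missing call.
    """
    shared = 0
    compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs where either genotype is missing
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this SNP
        compared += 2
    if compared == 0:
        raise ValueError("no SNPs with calls in both samples")
    return shared / compared


# Toy genotypes (hypothetical, for illustration only).
sample_a = [0, 1, 2, 2, None, 1]
sample_b = [0, 2, 2, 0, 1, 1]
print(ibs_similarity(sample_a, sample_b))
```

Because every pair of humans shares most alleles at most SNPs, values computed this way sit in a narrow band, which is why the whole table above only spans roughly 0.71 to 0.81.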

Very interesting. How much East Eurasian ancestry for the Saamis can we infer from this IBS comparison?

Zanzibar
03-09-2021, 02:11 PM
I tried doing qpAdm models of the population named Saami.DG in the v44.3_HO dataset. I excluded models with one or more negative weights (where feasible is false) and sorted the models by their p-value.

I'm probably doing something wrong, and I still don't know how to pick the outgroups. I mostly just picked outgroups that resulted in little decrease in the number of SNPs that remained after filtering. I also tried to pick left populations that resulted in little decrease in the SNP count.

I got 374794 out of 597573 SNPs after filtering, out of which 349558 were polymorphic.

https://i.ibb.co/vcTCKNz/b.png

In the image above, the models whose p-value is above .05 have a fairly constant 30-35% Nganasan ancestry. However, EHG, CHG, and SHG are also part Mongoloid. So if we consider Nganasan to be fully Mongoloid, Saami might be closer to 40% than 30% Mongoloid.
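
The arithmetic behind that last step can be sketched in Python as a weighted sum: each source's qpAdm weight is multiplied by the share of East Eurasian ancestry assumed for that source. All numbers below are hypothetical placeholders chosen only to show the calculation. They are not the weights from the image, and the fraction assigned to the hunter-gatherer source is an assumption, not a measured value.

```python
def total_east_eurasian(weights, east_fraction):
    """Sum of (model weight x assumed East Eurasian share) over all sources.

    Sources missing from east_fraction are treated as having a share of 0.
    """
    return sum(w * east_fraction.get(source, 0.0) for source, w in weights.items())


# Hypothetical model weights (must sum to 1); NOT the values from the image.
weights = {"Nganasan": 0.32, "Russia_HG_Karelia": 0.30, "Latvia_HG": 0.38}

# Hypothetical East Eurasian share assumed for each source.
east_fraction = {"Nganasan": 1.0, "Russia_HG_Karelia": 0.2}

print(round(total_east_eurasian(weights, east_fraction), 2))
```

With these placeholder values the total comes out above the Nganasan weight alone, which is the direction of the adjustment described above. How large the adjustment really is depends entirely on the shares assumed for the non-Nganasan sources.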

Both individuals in the population Saami.DG were from Utsjoki, which is part of the Northern Saami region within Finland:


$ awk 'NR==1||/Saami...DG/' g/v44.3_HO_public/v44.3_HO_public.anno|cut -f2,4,9,10|tr \\t \;
Version ID;Publication (or OK to use in a paper);Locality;Country
S_Saami-1.DG;MallickNature2016;Utsjoki;Finland
S_Saami-2.DG;MallickNature2016;Utsjoki;Finland

Among Finnish Saami, there are an estimated 2,000 speakers of Northern Saami, 300 speakers of Inari Saami, and 300 speakers of Skolt Saami (https://fi.wikipedia.org/wiki/Saamelaiskielet). Out of four groups of Saami measured by Karin Mark, Skolt Saami had the lightest pigmentation, followed by Inari Saami, Finnish Northern Saami, and Kola Saami (https://www.etis.ee/Portal/Publications/Display/1fd319c0-7408-4e31-9f18-b9b3010eabad).

Scandinavian Northern Saami might be even more Mongoloid than Finnish Northern Saami, or at least Coon wrote that the Saami of the Scandinavian inland were the darkest and most brachycephalic (https://www.theapricity.com/snpa/chapter-IX2.htm):


The selected "pure" groups, Bryn's Reindeer Lapps, and some of Geyer's mountain and forest Lapps from Sweden, have seventy per cent or over of this dark hair, while the fairest Lapps, with a majority of brown and blond shades, are found in Finland and in the Kola Peninsula.

Pure dark eyes are found among one-third of Reindeer Lapps, and among as few as eight per cent in the total of Lapps from Norway.[14] Pure light and light-mixed eyes are commonest among the Lapps of Finland, where they total between thirty and forty per cent, and least common among the Reindeer Lapps of interior Norway and Sweden. Even among the purest selected sub-groups, such as that of Geyer, who isolated from a larger Swedish Lapp sample a few individuals of most pronounced Lappish type, at least a third are light or light-mixed in iris color. [...]

There are, however, regional differences; the center of extreme round headedness lies among the inland groups in northern Norway, while the Swedish, Finnish, and Kola Peninsula Lapps become progressively narrower headed. The mean for the purest Reindeer Lapps of Norway is 87; for the easternmost Lapps, 80 to 83.

Code for ADMIXTOOLS 2:


library(admixtools)
library(tidyverse)

target="Saami.DG"
left=c("Turkey_Boncuklu_N.SG","Armenia_Caucasus_KuraAraxes","Latvia_HG","Sweden_Motala_HG","Russia_HG_Karelia","Russia_HG_Tyumen","Nganasan")
right=c("Mbuti.DG","Mixe.DG","Ami.DG","Papuan.DG","Chimp.REF","Ju_hoan_North","Biaka.DG","Yoruba.DG","Altai_Neanderthal.DG")

pops=c(left,right,target)

# compute f2 statistics for the selected populations, then load them back
unlink("/tmp/f2",recursive=T)
extract_f2(pref="g/v44.3_HO_public/v44.3_HO_public",pops=pops,outdir="/tmp/f2")
f2=f2_from_precomp("/tmp/f2")
qp=qpadm(f2,left=left,right=right,target=target)

# keep feasible models (no negative weights), sort by p, and keep only pat, p, and the source weights
qp2=qp$popdrop%>%dplyr::filter(feasible==T&f4rank!=0)%>%arrange(desc(p))%>%dplyr::select(!c(wt,dof,chisq,f4rank,dofdiff,chisqdiff,p_nested,feasible,best))
write_csv(qp2,"/tmp/qp")

Code to generate the bar chart:


library(tidyverse)
library(reshape2)
library(colorspace)

t=read_csv("/tmp/qp")

# t=t[t$p>.05,]

pvalue=sub("^0","",sprintf("%.3f",t$p))
t=t[-2]
t2=melt(t,id.var="pat")

ggplot(t2,aes(x=fct_rev(factor(pat,level=t$pat)),y =value,fill=variable))+
geom_bar(stat="identity",width=1,position=position_fill(reverse=T))+
geom_text(aes(label=round(100*value)),position=position_stack(vjust=.5,reverse=T),size=3.5)+
coord_flip()+
theme(
axis.text.x=element_blank(),
axis.text=element_text(color="black"),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
legend.box.just="center",
legend.box.margin=margin(0),
legend.box.spacing=unit(.05,"in"),
legend.direction="vertical",
legend.justification="center",
legend.margin=margin(0),
legend.text=element_text(size=12),
legend.title=element_blank(),
panel.border=element_blank(),
text=element_text(size=16)
)+
xlab("")+
scale_x_discrete(labels=rev(pvalue),expand=c(0,0)) +
scale_y_discrete(expand=c(0,0))+
scale_fill_manual("legend",values=hex(HSV(c(45,45,210,210,120,120,300),c(.6, .6,.6,.6,.6,.6,.6),c(1,.6,1,.6,1,.6,1))))
ggsave("/tmp/a.png",width=7,height=7)

Very interesting. Is Saami.DG the same as the Saami samples in G25? If you can, could you also model the Saami and Mari with qpAdm to see how much EEF they have?

Zoro
03-09-2021, 02:20 PM
PROOF G25 distances shouldn't be trusted

The late Paleolithic African paper showed that there was Eurasian gene flow back to Africa in the Paleolithic that affected pretty much all Africans, including Mbuti. In other words, even Mbuti got some Eurasian genes during the Paleolithic. Least affected were Khomani and Ju-Hoan.

The IBS list I posted accurately shows this by showing Mongola closer to Mbuti than to Khomani and Ju-Hoan.

The G25 (scaled), on the other hand, gets it all wrong. You can try it yourself. It wrongly shows Mongola significantly closer to Khomani-San than to Mbuti! If it gets this wrong, then how can its results for other populations be trusted?

Distance to: Mongola
0.918673 Khomani_San
0.98425066 Ju_hoan_North
0.99607508 Mbuti

Zoro
03-09-2021, 03:17 PM
This is from the African paper, showing that the IBS list is correct and G25 is wrong. BTW, I just checked, and the distances from Mongola to Kurds vs. Chechens vs. Balochis vs. Iranians are also screwed up in G25.


In addition, we find that the Mbuti and Biaka, both Central African hunter-gatherer populations, show levels of Eurasian gene flow that are intermediate between levels observed in the Khoe-San and Yorubans (Fig. 1a,b, Supplemental Table S1)

https://i.imgur.com/VvE3LBX.jpg

Mbuti has more Eurasian admixture than Khomani and Ju-Hoan

https://i.imgur.com/NtvoN9W.jpg

https://i.imgur.com/OPF1NxR.jpg

https://i.imgur.com/ZaUlzUP.jpg

Leto
03-09-2021, 05:49 PM
I wonder why the Mari have about 3-3.5 times more East Eurasian ancestry than the Mordovians (~30% vs ~10%). Those two republics are not even that far away from each other.

Komintasavalta
03-09-2021, 09:20 PM
Based on FST distances from 1240K, Mongola were also further from San than from Mbuti:

> fst=fst("g/v44.3_1240K_public/v44.3_1240K_public",c("Biaka.DG","Ju_hoan_North.DG","Khomani_San.DG","Mbuti.DG","Mongola.DG"))
> f2m=function(x){t=as.data.frame(x)[,1:3];t2=rbind(t,setNames(t[,c(2,1,3)],names(t)));xtabs(t2[,3]~t2[,2]+t2[,1])}
> r=sort(f2m(fst)["Mongola.DG",]);cat(paste(sprintf("%.4f",r),names(r)),sep="\n")
0.0000 Mongola.DG
0.2014 Biaka.DG
0.2309 Mbuti.DG
0.2482 Ju_hoan_North.DG
0.2571 Khomani_San.DG

The f2m function converts f2 or FST pairs to a square matrix.
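For anyone who prefers the shell, the same pairs-to-square-matrix idea can be sketched in awk. This is only an illustration, not the admixtools code: `pairs_to_matrix` is a made-up helper name, and the input is just the four Mongola pairs printed above, so every pair that isn't listed comes out as NA.

```shell
# Toy pair list: pop1 pop2 value (values copied from the FST output above)
pairs='Mongola.DG Biaka.DG 0.2014
Mongola.DG Mbuti.DG 0.2309
Mongola.DG Ju_hoan_North.DG 0.2482
Mongola.DG Khomani_San.DG 0.2571'

# Hypothetical helper: store each pair in both directions so the matrix is
# symmetric, then print one row per population in order of first appearance.
pairs_to_matrix() {
  awk '{
    v[$1,$2]=$3; v[$2,$1]=$3
    if(!($1 in seen)){seen[$1]; name[++k]=$1}
    if(!($2 in seen)){seen[$2]; name[++k]=$2}
  }
  END{
    for(i=1;i<=k;i++){
      row=name[i]
      for(j=1;j<=k;j++){
        key=name[i] SUBSEP name[j]
        if(i==j)cell="0"; else if(key in v)cell=v[key]; else cell="NA"
        row=row " " cell
      }
      print row
    }
  }'
}

printf '%s\n' "$pairs" | pairs_to_matrix
```

The first output row is the Mongola.DG row, which is the one sorted in the R output above.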

However based on unscaled G25 distances, Mongola are closer to Mbuti than to Ju_hoan_North:

$ mkdir -p g/25;printf %s\\n ai\ 1UrhcfNMLW0oMXIbHGUE60v2taCM7PFw1 aa\ 1F2rKEVtu8nWSm7qFhxPU6UESQNsmA-sl mi\ 1HYrDwxEXv82DvDLoq736pS5ZTGJA4dn5 ma\ 1wZr-UOve0KUKo_Qbgeo27m-CQncZWb8y aiu\ 1YKkEOtyV5SISvmY_FyS4YSLXCxxYt5_W aau\ 1f0imQyVNZ9RPESNAYIeIkA8fx4wAVNYo miu\ 18GcEVEl3GI-ByviD-TgQQjvEaaTbNTr2 mau\ 1y49hyvviJpHj9esVqyeiFm32DhnPlfRQ|while read l m;do curl "drive.google.com/uc?export=download&id=$m" -Lso g/25/$l;done
$ dist(){ awk -F, 'NR==FNR{for(i=2;i<=NF;i++)a[i]=$i;next}$1{s=0;for(i=2;i<=NF;i++)s+=($i-a[i])^2;print s^.5,$1}' "$2" "$1"|sort -n|awk '{printf"%.3f %s\n",$1,$2}'|sed s,^0,,;}
$ dist g/25/mau <(grep Mongola g/25/mau)|tail
.225 Piapoco
.228 Wichi
.234 Kosipe
.236 Karitiana
.241 Koinanbe
.241 Papuan
.241 Surui
.274 Khomani_San
.281 Mbuti
.388 Ju_hoan_North
$ dist g/25/ma <(grep Mongola g/25/ma)|tail
.831 Igbo
.831 Yoruba
.835 Esan_Nigeria
.859 Bedzan
.864 Bakola
.867 Baka
.881 Biaka
.919 Khomani_San
.984 Ju_hoan_North
.996 Mbuti

Komintasavalta
03-09-2021, 09:45 PM
11- Drop Nganasan from sources and add Shamanka-EN instead. It's always better to keep the sources consistently ancient. Shamanka would be more ancestral to Uralics than Nganasan.

A qpAdm model in Jeong et al. 2019 ("The genetic history of admixture across inner Eurasia") used Nganasan as a source to model Uralics: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6542712/figure/F5/.


7- I would add Iran-N to pright

When I added it, KuraAraxes became the main component in my models for Saami.DG:

https://i.ibb.co/WH9M5Nj/saami.png

But I guess if I don't go with the strategy where I just pick distant outgroups like sub-Saharan Africans and chimps, then I'm supposed to select an outgroup related to each left population.

Anyway, I tried implementing your suggestions. I used these outgroups: Belgium_UP_GoyetQ116_1_published_all, Iran_GanjDareh_N, Morocco_Iberomaurusian, Russia_HG_Tyumen, Russia_Kolyma_M.SG, Russia_Kostenki14, Switzerland_Bichon.SG. I don't know how to pick just a single individual from a population as an outgroup.

I now got "597573 SNPs read in total" with "244600 SNPs remain after filtering. 190241 are polymorphic."

The proportion of the Mongoloid ancestry dropped, but maybe I should've used kra001 instead of Shamanka and Devil's Cave as a Mongoloid source.

https://i.ibb.co/WWnM3WT/zoro.png

On the last two rows of the image above, Saami are modeled as 48% Latvia_HG and 52% Shamanka/DevilsCave, which doesn't seem right. When I added two additional outgroups that were related to Latvia_HG (Norway_N_HG.SG) and to Devil's Cave (Russia_MN_Boisman), the Mongoloid ancestry dropped to about 30-35% in the two-way models of Latvia_HG/Motala + Shamanka/DevilsCave. It increased the p values of the first models, but it also reduced their proportion of Mongoloid ancestry.

https://i.ibb.co/7VmtKQS/zoro2.png

BTW I didn't realize earlier that the anno files have a column for SNP counts:

$ cut -f2,13,21 g/v44.3_1240K_public/v44.3_1240K_public.anno|awk 'NR==1||/Saami/'|tr \\t \;
Version ID;GroupID;SNPs hit on autosomal targets
S_Saami-1.DG;Saami.DG;1119750
S_Saami-2.DG;Saami.DG;1120268
Saami.SG;Finland_Saami_Modern.SG;1128484
DA237.SG;Finland_Saami_IA.SG;110265
$ cut -f2,8,15 g/v44.3_HO_public/v44.3_HO_public.anno|awk 'NR==1||/Saami/'|tr \\t \;
Version ID;Group Label;SNPs hit on autosomal targets
SD60_297;Saami.WGA;588596
S_Saami-1.DG;Saami.DG;584416
S_Saami-2.DG;Saami.DG;584732
Saami.SG;Finland_Saami_Modern.SG;569246
DA237.SG;Finland_Saami_IA.SG;59111

travv
03-09-2021, 10:22 PM
I wonder why the Mari have like 3-3.5 times more East Eurasian than the Mordovians (~30% vs ~10%). Those two republics are not even that far away from each other.

Because Mordovians are semi-wogs like Slavs, Balts, Hungarians, Germanics and other Europeans.

I doubt, however, that Mordovians have 10% East Eurasian. That's too high for them.

Komintasavalta
03-09-2021, 11:53 PM
I wonder why the Mari have like 3-3.5 times more East Eurasian than the Mordovians (~30% vs ~10%). Those two republics are not even that far away from each other.

Even a few degrees of latitude matter.

Based on YHG, Mordvins cluster together with Tatars and Central and Southern Russians, and they have little N by Finno-Permic standards.

https://i.ibb.co/fYj0CSQ/tambets-yhg.png


library(pheatmap)
library(tidyverse)
library(colorspace) # for hex
library(vegan) # for reorder.hclust

download.file("https://pastebin.com/raw/jFpVY4Wv","tambetsyhg")

t=read.csv("tambetsyhg",header=T,row.names=1,check.names=F)

pop=c("Bashkirs","Chuvashes","Enets","Estonians","Finns","Hungarians","Karelians","Khanty","Komis","Latvians","Lithuanians","Mansis","Maris","Mordovians","Nenets","Nganasans","Russians Central","Russians North","Russians South","Saami from Kola Peninsula","Saami from Sweden","Selkups","Swedes","Tatars","Udmurts","Vepsians")

t=t[pop,]
t=t%>%select_if(colSums(.)>=4)

weight=rowSums(t[,c("N(xN3)1# (M231)","N32# (TAT/M178)","C3 (M217)","P+Q+R*+R2 (M74/M242/M207/M124)")])

sort=reorder(hclust(dist(t)),wts=weight)

pheatmap(
t,
filename="output.png",
clustering_callback=function(...){c(sort)},
cluster_cols=F,
legend=F,
treeheight_row=80,
cellwidth=16,
cellheight=16,
fontsize=8,
border_color=NA,
display_numbers=T,
number_format="%.0f",
fontsize_number=7,
number_color="black",
colorRampPalette(hex(HSV(c(210,210,160,120,60,40,20,0,0),c(0,.5,.5,.5,.5,.5,.5,.5,.5),c(1,1,1,1,1,1,1,1,.7))))(256)
)

Zanzibar
03-10-2021, 01:47 AM
A qpAdm model in Jeong et al. 2019 ("The genetic history of admixture across inner Eurasia") used Nganasan as a source to model Uralics: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6542712/figure/F5/.



When I added it, KuraAraxes became the main component in my models for Saami.DG:

https://i.ibb.co/WH9M5Nj/saami.png

But I guess if I don't go with the strategy where I just pick distant outgroups like sub-Saharan Africans and chimps, then I'm supposed to select an outgroup related to each left population.

Anyway, I tried implementing your suggestions. I used these outgroups: Belgium_UP_GoyetQ116_1_published_all, Iran_GanjDareh_N, Morocco_Iberomaurusian, Russia_HG_Tyumen, Russia_Kolyma_M.SG, Russia_Kostenki14, Switzerland_Bichon.SG. I don't know how to pick just a single individual from a population as an outgroup.

I now got "597573 SNPs read in total" with "244600 SNPs remain after filtering. 190241 are polymorphic."

The proportion of the Mongoloid ancestry dropped, but maybe I should've used kra001 instead of Shamanka and Devil's Cave as a Mongoloid source.

https://i.ibb.co/WWnM3WT/zoro.png

On the last two rows of the image above, Saami are modeled as 48% Latvia_HG and 52% Shamanka/DevilsCave, which doesn't seem right. When I added two additional outgroups that were related to Latvia_HG (Norway_N_HG.SG) and to Devil's Cave (Russia_MN_Boisman), the Mongoloid ancestry dropped to about 30-35% in the two-way models of Latvia_HG/Motala + Shamanka/DevilsCave. It increased the p values of the first models, but it also reduced their proportion of Mongoloid ancestry.

https://i.ibb.co/7VmtKQS/zoro2.png

BTW I didn't realize earlier that the anno files have a column for SNP counts:

$ cut -f2,13,21 g/v44.3_1240K_public/v44.3_1240K_public.anno|awk 'NR==1||/Saami/'|tr \\t \;
Version ID;GroupID;SNPs hit on autosomal targets
S_Saami-1.DG;Saami.DG;1119750
S_Saami-2.DG;Saami.DG;1120268
Saami.SG;Finland_Saami_Modern.SG;1128484
DA237.SG;Finland_Saami_IA.SG;110265
$ cut -f2,8,15 g/v44.3_HO_public/v44.3_HO_public.anno|awk 'NR==1||/Saami/'|tr \\t \;
Version ID;Group Label;SNPs hit on autosomal targets
SD60_297;Saami.WGA;588596
S_Saami-1.DG;Saami.DG;584416
S_Saami-2.DG;Saami.DG;584732
Saami.SG;Finland_Saami_Modern.SG;569246
DA237.SG;Finland_Saami_IA.SG;59111

Are these all individual samples? Several of them seem to be majority EEF/Boncuklu-derived with slightly lower East Eurasian ancestry, while others completely lack the Anatolian admixture.

Yep, you should also include kra001. You should also replace Armenia_Kura_Araxes with Georgia_CHG and Iran_Wezmeh_N, because Kura_Araxes also contains Anatolian farmer admixture, which hides the actual amount of EEF ancestry in Saamis. Also try replacing Boncuklu_N with Barcin_N to see if the EEF level decreases.

Mingle
03-10-2021, 03:55 AM
PROOF G25 distances shouldn't be trusted

The late Paleolithic African paper showed that there was Eurasian geneflow back to Africa in the Paleolithic that affected pretty much all Africans including Mbuti. In other words even Mbuti got some Eurasian genes during the Paleolithic. Least affected were Khomani and Ju-Hoan.

The IBS list I posted accurately shows this by showing Mongola closer to Mbuti than to Khomani and Ju-Hoan.

The G25 (scaled) on the other hand gets it all wrong. You can try it yourself. It wrongly shows Mongola significantly closer to Khomani-San than Mbuti ! If it gets this wrong then how should the pops be trusted.

Distance to: Mongola
0.918673 Khomani_San
0.98425066 Ju_hoan_North
0.99607508 Mbuti

Can you link the paper?

Komintasavalta
03-10-2021, 08:09 AM
Are these all individual samples?

No, they're alternative models for a Saami population average that consists of two Northern Saami individuals from Finland. The models use different combinations of source populations and they are sorted by p-value (which indicates how feasible each model is). I excluded models where one or more source populations had a negative weight.

Actually you're probably not supposed to use qpAdm the way I did, but you're supposed to pick different outgroups for each model using something like the `qpadm_rotate` function (https://uqrmaie1.github.io/admixtools/articles/admixtools.html):


`qpadm_rotate()` tests many `qpadm()` models at a time. For each model, the `leftright` populations will be split into two groups: The first group will be the left populations passed to `qpadm()`, while the second group will be added to `rightfix` and become the set of right populations.
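To make the left/right split described in that quote concrete, here is a rough awk sketch that only enumerates the splits; it does not run qpAdm. `rotate_models` is a hypothetical helper, the population names are just examples taken from this thread, and it is an assumption that every non-empty proper subset is tried; the real function may restrict which subsets it tests.

```shell
# Hypothetical helper: $1 = space-separated leftright pops, $2 = space-separated rightfix pops.
# Each subset of leftright becomes "left"; the remaining pops are appended to rightfix.
rotate_models() {
  awk -v lr="$1" -v fix="$2" 'BEGIN{
    n=split(lr,p," "); gsub(/ /,",",fix)
    for(m=1;m<2^n-1;m++){              # bitmask over leftright, skipping empty and full sets
      left=""; right=fix
      for(i=1;i<=n;i++){
        if(int(m/2^(i-1))%2){sep=(left==""?"":","); left=left sep p[i]} else right=right "," p[i]
      }
      print "left=" left " right=" right
    }
  }'
}

rotate_models "Latvia_HG Shamanka_EN KuraAraxes" "Mbuti.DG Russia_Kostenki14"
```

With three leftright populations this prints six candidate models, each of which would then be passed to `qpadm()` separately.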

I used KuraAraxes because Georgia_Kotias.SG / KK1.SG (GEO_CHG on G25) gave me a lower SNP count. Vahaduo often gave me KuraAraxes as a Caucasus-related source for Uralic models.

Anyway, I'm learning ADMIXTURE now. It seems easier than qpAdm because you don't have to pick outgroups.

I downloaded the tar file from here: https://www.gnxp.com/WordPress/2018/07/11/tutorial-to-run-pca-admixture-treemix-and-pairwise-fst-in-one-command/. I downloaded Mac binaries for ADMIXTURE and plink: http://dalexander.github.io/admixture/download.html, https://www.cog-genomics.org/plink/1.9/.

I picked populations from this list:


cut -d' ' -f1 ancestry/Est1000HGDP.fam|sort -u

Next I ran commands like this:


n=travvscale;k=3
plink --bfile ancestry/Est1000HGDP --keep <(printf %s\\n Armenians Bulgarians Ukranians Russian Komi Maris Nenet|awk 'NR==FNR{a[$0];next}$1 in a' - ancestry/Est1000HGDP.fam) --make-bed --out $n
admixture -j8 $n.bed $k
paste -d' ' <(cut -d' ' -f1,2 $n.fam) $n.$k.Q>$n.$k

Then I ran this in R:


library(tidyverse)
library(colorspace) # for hex()

t=read.table("~/travvscale.3",sep=" ",header=F)

t=t[,c(1,4,3,5)]
names(t)=seq(length(t))

ave=aggregate(t[,-1],list(t[,1]),mean)
ave=ave[order(ave[,4]-ave[,2]),]
ave2=pivot_longer(ave,cols=2:ncol(ave))

ggplot(ave2,aes(x=fct_rev(factor(Group.1,level=unique(Group.1))),y=value,fill=name))+
geom_bar(stat="identity",width=1,position=position_fill(reverse=T))+
geom_text(aes(label=round(100*value)),position=position_stack(vjust=.5,reverse=T),size=3.5)+
coord_flip()+
theme(axis.text=element_text(color="black"),axis.ticks=element_blank(),axis.title.x=element_blank(),legend.position="none",text=element_text(size=16))+
xlab("")+
scale_x_discrete(expand=c(0,0))+
scale_y_discrete(expand=c(0,0))+
scale_fill_manual("legend",values=hex(HSV(c(30,210,300),c(.5),c(1))))
ggsave("output.png",width=5,height=2.5)
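The `aggregate(...,mean)` step above (one mean per ADMIXTURE component per population) can also be sketched in awk if you only want the numbers and not the plot. The three input rows are invented toy values in the FID IID Q1 Q2 Q3 layout that the `paste` command produces, and `q_means` is a hypothetical helper name.

```shell
# Hypothetical rows in the layout produced by the paste command: FID IID Q1 Q2 Q3
q='Maris m1 0.10 0.60 0.30
Maris m2 0.20 0.50 0.30
Komi k1 0.05 0.80 0.15'

# Mean of every Q column per population (first field), in order of first appearance
q_means() {
  awk '{
    if(!($1 in n))order[++k]=$1
    n[$1]++
    for(i=3;i<=NF;i++)s[$1,i]+=$i
    nf=NF
  }
  END{
    for(j=1;j<=k;j++){
      pop=order[j]; row=pop
      for(i=3;i<=nf;i++)row=row sprintf(" %.3f",s[pop,i]/n[pop])
      print row
    }
  }'
}

printf '%s\n' "$q" | q_means
```

On a real run you would feed it the pasted file instead, e.g. `q_means < travvscale.3`.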

Result:

https://i.ibb.co/K20Rgz5/travv-scale-admixture.png

Lucas
03-10-2021, 03:45 PM
Est1000HGDP.fam

Did you create the merged dataset yourself, or did you find it somewhere?

LorenzoSpitaleri
03-10-2021, 03:48 PM
I didn't know Udmurts had such high Mongoloid ancestry considering the predominance of red hair in them

Zoro
03-10-2021, 06:06 PM
Can you link the paper?

Here is the supp

https://www.biorxiv.org/content/biorxiv/early/2020/06/01/2020.06.01.127555/DC1/embed/media-1.pdf?download=true

Here is the paper

https://www.biorxiv.org/content/10.1101/2020.06.01.127555v1.full.pdf

Zoro
03-10-2021, 06:46 PM
NO FID1 FID2 IID2 PI_HAT IBS
1 S_Mongola-1 Korean S_Korean-1 0.157 0.81068
2 S_Mongola-1 Han S_Han-1 0.1538 0.81020
3 S_Mongola-1 Japanese S_Japanese-1 0.1603 0.80999
4 S_Mongola-1 Xibo S_Xibo-2 0.1463 0.80968
5 S_Mongola-1 Korean S_Korean-2 0.1546 0.80955
6 S_Mongola-1 Han S_Han-2 0.1562 0.80910
7 S_Mongola-1 Tujia S_Tujia-2 0.1522 0.80896
8 S_Mongola-1 Japanese S_Japanese-2 0.148 0.80880
9 S_Mongola-1 She S_She-1 0.1542 0.80875
10 S_Mongola-1 She S_She-2 0.1535 0.80870
11 S_Mongola-1 Naxi S_Naxi-1 0.1527 0.80869
12 S_Mongola-1 Japanese S_Japanese-3 0.1426 0.80865
13 S_Mongola-1 Hezhen S_Hezhen-2 0.1438 0.80863
14 S_Mongola-1 Yi S_Yi-1 0.1494 0.80853
15 S_Mongola-1 Xibo S_Xibo-1 0.1408 0.80837
16 S_Mongola-1 Miao S_Miao-2 0.1534 0.80827
17 S_Mongola-1 Kinh S_Kinh-1 0.1488 0.80800
18 S_Mongola-1 Naxi S_Naxi-3 0.1516 0.80795
19 S_Mongola-1 Hezhen S_Hezhen-1 0.1514 0.80782
20 S_Mongola-1 Tujia S_Tujia-1 0.1519 0.80772
21 S_Mongola-1 Mongola S_Mongola-2 0.1456 0.80755
22 S_Mongola-1 Miao S_Miao-1 0.1518 0.80748
23 S_Mongola-1 Ulchi S_Ulchi-1 0.1642 0.80746
24 S_Mongola-1 Oroqen S_Oroqen-1 0.1575 0.80745
25 S_Mongola-1 Yi S_Yi-2 0.1529 0.80724
26 S_Mongola-1 Daur S_Daur-2 0.1422 0.80716
27 S_Mongola-1 Ulchi S_Ulchi-2 0.1566 0.80713
28 S_Mongola-1 Oroqen S_Oroqen-2 0.1588 0.80693
29 S_Mongola-1 Dai S_Dai-1 0.1463 0.80672
30 S_Mongola-1 Even S_Even-3 0.1583 0.80661
31 S_Mongola-1 Dai S_Dai-2 0.1519 0.80603
32 S_Mongola-1 Tu S_Tu-2 0.1387 0.80580
33 S_Mongola-1 Kinh S_Kinh-2 0.1415 0.80574
34 S_Mongola-1 Thai S_Thai-2 0.1401 0.80573
35 S_Mongola-1 China_Lahu S_Lahu-1 0.1524 0.80558
36 S_Mongola-1 Burmese S_Burmese-1 0.1385 0.80540
37 S_Mongola-1 Tu S_Tu-1 0.1354 0.80530
38 S_Mongola-1 Ami.DG S_Ami1 0.1575 0.80503
39 S_Mongola-1 Ami.DG S_Ami2 0.1595 0.80502
40 S_Mongola-1 Even S_Even-2 0.1555 0.80488
41 S_Mongola-1 Yakut S_Yakut-1 0.1485 0.80419
42 S_Mongola-1 China_Lahu S_Lahu-2 0.1523 0.80397
43 S_Mongola-1 Igorot S_Igorot-2 0 0.80313
44 S_Mongola-1 Dusun S_Dusun-2 0 0.80309
45 S_Mongola-1 Dusun S_Dusun-1 0 0.80308
46 S_Mongola-1 Thai S_Thai-1 0.1275 0.80306
47 S_Mongola-1 Igorot S_Igorot-1 0 0.80301
48 S_Mongola-1 Cambodian S_Cambodian-1 0.1407 0.80241
49 S_Mongola-1 Even S_Even-1 0.1214 0.80214
50 S_Mongola-1 Burmese S_Burmese-2 0.1169 0.80213
51 S_Mongola-1 Yakut S_Yakut-2 0.1438 0.80209
52 S_Mongola-1 Cambodian S_Cambodian-2 0.134 0.80188
53 S_Mongola-1 Eskimo_Sireniki.DG S_Sireniki1 0 0.80124
54 S_Mongola-1 Kyrgyz_Kyrgyzstan S_Kyrgyz-1 0.1127 0.79908
55 S_Mongola-1 Kyrgyz_Kyrgyzstan S_Kyrgyz-2 0.1005 0.79815
56 S_Mongola-1 Itelmen S_Itelman-1 0 0.79809
57 S_Mongola-1 Eskimo_Naukan.DG S_Naukan2 0 0.79789
58 S_Mongola-1 Eskimo_Chaplin.DG S_Chaplin1 0 0.79770
59 S_Mongola-1 Eskimo_Naukan.DG S_Naukan1 0 0.79751
60 S_Mongola-1 Eskimo_Sireniki.DG S_Sireniki2 0 0.79749
61 S_Mongola-1 Kusunda S_Kusunda-1 0.1132 0.79740
62 S_Mongola-1 Tubalar S_Tubalar-2 0 0.79509
63 S_Mongola-1 Tubalar S_Tubalar-1 0.1107 0.79490
64 S_Mongola-1 Chukchi S_Chukchi-1 0.0841 0.79357
65 S_Mongola-1 Uyghur S_Uygur-1 0.0898 0.79336
66 S_Mongola-1 Mexico_Zapotec.DG S_Zapotec1 0 0.79282
67 S_Mongola-1 Mansi S_Mansi-1 0 0.79238
68 S_Mongola-1 Hazara S_Hazara-1 0 0.79204
69 S_Mongola-1 Pima S_Pima-1 0 0.79198
70 S_Mongola-1 Uyghur S_Uygur-2 0 0.79197
71 S_Mongola-1 Hazara S_Hazara-2 0 0.79170
72 S_Mongola-1 Mayan S_Mayan-2 0 0.79120
73 S_Mongola-1 Mixtec S_Mixtec-1 0 0.79120
74 S_Mongola-1 Mixe S_Mixe-2 0 0.79115
75 S_Mongola-1 Mexico_Zapotec.DG S_Zapotec2 0 0.79101
76 S_Mongola-1 Mayan S_Mayan-1 0 0.79087
77 S_Mongola-1 Quechua S_Quechua-3 0 0.79075
78 S_Mongola-1 Mixe S_Mixe-3 0 0.79044
79 S_Mongola-1 Piapoco S_Piapoco-2 0 0.79029
80 S_Mongola-1 Quechua S_Quechua-1 0 0.79023
81 S_Mongola-1 Quechua S_Quechua-2 0 0.78995
82 S_Mongola-1 Pima S_Pima-2 0 0.78978
83 S_Mongola-1 Mansi S_Mansi-2 0 0.78962
84 S_Mongola-1 Khonda_Dora S_Khonda_Dora-1 0 0.78847
85 S_Mongola-1 Tlingit S_Tlingit-2 0 0.78816
86 S_Mongola-1 Mixtec S_Mixtec-2 0 0.78811
87 S_Mongola-1 Maori S_Maori-1 0.0542 0.78805
88 S_Mongola-1 Piapoco S_Piapoco-1 0 0.78747
89 S_Mongola-1 Karitiana S_Karitiana-2 0 0.78742
90 S_Mongola-1 Surui S_Surui-1 0 0.78727
91 S_Mongola-1 Surui S_Surui-2 0 0.78565
92 S_Mongola-1 Karitiana S_Karitiana-1 0 0.78561
93 S_Mongola-1 Bengali S_Bengali-1 0 0.78436
94 S_Mongola-1 Kusunda S_Kusunda-2 0 0.78408
95 S_Mongola-1 Tlingit S_Tlingit-1 0 0.78388
96 S_Mongola-1 Relli S_Relli-1 0 0.78344
97 S_Mongola-1 Kapu S_Kapu-2 0 0.78280
98 S_Mongola-1 Madiga S_Madiga-1 0 0.78227
99 S_Mongola-1 Madiga S_Madiga-2 0 0.78175
100 S_Mongola-1 Mala S_Mala-3 0 0.78161
101 S_Mongola-1 Yadava S_Yadava-1 0 0.78157
102 S_Mongola-1 Bengali S_Bengali-2 0 0.78140
103 S_Mongola-1 Kapu S_Kapu-1 0 0.78130
104 S_Mongola-1 Irula S_Irula-2 0 0.78128
105 S_Mongola-1 Mala S_Mala-2 0 0.78128
106 S_Mongola-1 Punjabi S_Punjabi-1 0 0.78107
107 S_Mongola-1 Irula S_Irula-1 0 0.78107
108 S_Mongola-1 Burusho S_Burusho-2 0 0.78081
109 S_Mongola-1 Yadava S_Yadava-2 0 0.78078
110 S_Mongola-1 Saami S_Saami-1 0 0.78063
111 S_Mongola-1 Brahmin S_Brahmin-2 0 0.78031
112 S_Mongola-1 Saami S_Saami-2 0 0.78012
113 S_Mongola-1 Relli S_Relli-2 0 0.77974
114 S_Mongola-1 Punjabi S_Punjabi-3 0 0.77920
115 S_Mongola-1 Bougainville S_Bougainville-1 0 0.77900
116 S_Mongola-1 Burusho S_Burusho-1 0 0.77885
117 S_Mongola-1 Punjabi S_Punjabi-2 0 0.77885
118 S_Mongola-1 Brahmin S_Brahmin-1 0 0.77874
119 S_Mongola-1 Bougainville S_Bougainville-2 0 0.77866
120 S_Mongola-1 Sindhi S_Sindhi-2 0 0.77851
121 S_Mongola-1 Pathan S_Pathan-1 0 0.77838
122 S_Mongola-1 Punjabi S_Punjabi-4 0 0.77776
123 S_Mongola-1 Kurd-Iraq WGS 0 0.77625
124 S_Mongola-1 Pathan S_Pathan-2 0 0.77597
125 S_Mongola-1 Ossetian-North S_Ossetian-1 0 0.77575
126 S_Mongola-1 Russian S_Russian-1 0 0.77570
127 S_Mongola-1 Finnish S_Finnish-1 0 0.77476
128 S_Mongola-1 Sindhi S_Sindhi-1 0 0.77473
129 S_Mongola-1 Turkish-Kayseri S_Turkish-Kayseri-1 0 0.77463
130 S_Mongola-1 Tajik S_Tajik-2 0 0.77448
131 S_Mongola-1 YANA_UP_WGS Yana1 0 0.77422
132 S_Mongola-1 Ossetian-North S_Ossetian-2 0 0.77413
133 S_Mongola-1 Papuan S_Papuan-10 0 0.77381
134 S_Mongola-1 Balochi S_Balochi-2 0 0.77365
135 S_Mongola-1 Brahui S_Brahui-1 0 0.77363
136 S_Mongola-1 Adygei S_Adygei-1 0 0.77334
137 S_Mongola-1 Makrani S_Makrani-1 0 0.77334
138 S_Mongola-1 Finnish S_Finnish-3 0 0.77319
139 S_Mongola-1 Adygei S_Adygei-2 0 0.77319
140 S_Mongola-1 Kalash S_Kalash-2 0 0.77319
141 S_Mongola-1 Turkish-Kayseri S_Turkish-Kayseri-2 0 0.77319
142 S_Mongola-1 Chechen S_Chechen-1 0 0.77312
143 S_Mongola-1 Papuan S_Papuan-9 0 0.77307
144 S_Mongola-1 Russian S_Russian-2 0 0.77288
145 S_Mongola-1 Icelandic S_Icelandic-1 0 0.77260
146 S_Mongola-1 Finnish S_Finnish-2 0 0.77258
147 S_Mongola-1 Papuan S_Papuan-12 0 0.77257
148 S_Mongola-1 Kalash S_Kalash-1 0 0.77247
149 S_Mongola-1 Lezgin S_Lezgin-1 0 0.77245
150 S_Mongola-1 Papuan S_Papuan-8 0 0.77232
151 S_Mongola-1 Russia_Abkhasian S_Abkhasian-1 0 0.77197
152 S_Mongola-1 Iranian-Fars S_Iranian-Fars-1 0 0.77194
153 S_Mongola-1 Brahui S_Brahui-2 0 0.77178
154 S_Mongola-1 Russia_Abkhasian S_Abkhasian-2 0 0.77170
155 S_Mongola-1 Papuan S_Papuan-1 0 0.77164
156 S_Mongola-1 Norwegian S_Norwegian-1 0 0.77159
157 S_Mongola-1 Orcadian S_Orcadian-2 0 0.77158
158 S_Mongola-1 Estonian S_Estonian-1 0 0.77155
159 S_Mongola-1 Papuan S_Papuan-7 0 0.77150
160 S_Mongola-1 Papuan S_Papuan-11 0 0.77146
161 S_Mongola-1 Estonian S_Estonian-2 0 0.77144
162 S_Mongola-1 Papuan S_Papuan-13 0 0.77131
163 S_Mongola-1 Tajik S_Tajik-1 0 0.77131
164 S_Mongola-1 Papuan S_Papuan-14 0 0.77129
165 S_Mongola-1 Hungarian S_Hungarian-2 0 0.77120
166 S_Mongola-1 Czech S_Czech-2 0 0.77120
167 S_Mongola-1 Papuan S_Papuan-3 0 0.77119
168 S_Mongola-1 Icelandic S_Icelandic-2 0 0.77119
169 S_Mongola-1 Hungarian S_Hungarian-1 0 0.77111
170 S_Mongola-1 Polish S_Polish-1 0 0.77110
171 S_Mongola-1 Bulgarian S_Bulgarian-1 0 0.77106
172 S_Mongola-1 Greek S_Greek-1 0 0.77103
173 S_Mongola-1 Iranian-Fars S_Iranian-Fars-2 0 0.77103
174 S_Mongola-1 Papuan S_Papuan-5 0 0.77101
175 S_Mongola-1 French S_French-2 0 0.77082
176 S_Mongola-1 Georgian S_Georgian-1 0 0.77071
177 S_Mongola-1 Balochi S_Balochi-1 0 0.77062
178 S_Mongola-1 Spanish S_Spanish-1 0 0.77061
179 S_Mongola-1 Armenian S_Armenian-1 0 0.77054
180 S_Mongola-1 Papuan S_Papuan-6 0 0.77049
181 S_Mongola-1 Bergamo S_Bergamo-2 0 0.77017
182 S_Mongola-1 Papuan S_Papuan-2 0 0.77008
183 S_Mongola-1 Bulgarian S_Bulgarian-2 0 0.77007
184 S_Mongola-1 Papuan S_Papuan-4 0 0.77005
185 S_Mongola-1 Spanish S_Spanish-2 0 0.76981
186 S_Mongola-1 Greek S_Greek-2 0 0.76981
187 S_Mongola-1 Basque S_Basque-1 0 0.76979
188 S_Mongola-1 English S_English-1 0 0.76977
189 S_Mongola-1 Lezgin S_Lezgin-2 0 0.76975
190 S_Mongola-1 Tuscan S_Tuscan-2 0 0.76960
191 S_Mongola-1 Albanian.DG S_Albanian1 0 0.76953
192 S_Mongola-1 English S_English-2 0 0.76951
193 S_Mongola-1 Armenian S_Armenian-2 0 0.76950
194 S_Mongola-1 Sardinian S_Sardinian-2 0 0.76946
195 S_Mongola-1 Orcadian S_Orcadian-1 0 0.76909
196 S_Mongola-1 Tuscan S_Tuscan-1 0 0.76906
197 S_Mongola-1 Jew_Iraqi S_Iraqi_Jew-1 0 0.76901
198 S_Mongola-1 Basque S_Basque-2 0 0.76888
199 S_Mongola-1 Georgian S_Georgian-2 0 0.76886
200 S_Mongola-1 Jew_Iraqi S_Iraqi_Jew-2 0 0.76865
201 S_Mongola-1 Jordanian S_Jordanian-3 0 0.76809
202 S_Mongola-1 French S_French-1 0 0.76796
203 S_Mongola-1 BedouinB S_BedouinB-2 0 0.76779
204 S_Mongola-1 Druze S_Druze-1 0 0.76757
205 S_Mongola-1 Druze S_Druze-2 0 0.76754
206 S_Mongola-1 Makrani S_Makrani-2 0 0.76747
207 S_Mongola-1 Jew_Yemenite S_Yemenite_Jew-2 0 0.76622
208 S_Mongola-1 Jew_Yemenite S_Yemenite_Jew-1 0 0.76575
209 S_Mongola-1 Sardinian S_Sardinian-1 0 0.76564
210 S_Mongola-1 BedouinB S_BedouinB-1 0 0.76460
211 S_Mongola-1 Jordanian S_Jordanian-2 0 0.76413
212 S_Mongola-1 Samaritan S_Samaritan-1 0 0.76396
213 S_Mongola-1 Jordanian S_Jordanian-1 0 0.76261
214 S_Mongola-1 Saharawi S_Saharawi-2 0 0.75981
215 S_Mongola-1 Saharawi S_Saharawi-1 0 0.75964
216 S_Mongola-1 Mozabite S_Mozabite-1 0 0.75937
217 S_Mongola-1 Mozabite S_Mozabite-2 0 0.75824
222 S_Mongola-1 Somali S_Somali-1 0 0.74788
224 S_Mongola-1 Masai S_Masai-2 0 0.74381
226 S_Mongola-1 Masai S_Masai-1 0 0.74274
232 S_Mongola-1 Gambian S_Gambian-2 0 0.73200
233 S_Mongola-1 BantuKenya S_BantuKenya-1 0 0.73139
234 S_Mongola-1 Luo S_Luo-2 0 0.73107
235 S_Mongola-1 BantuKenya S_BantuKenya-2 0 0.73020
236 S_Mongola-1 Luhya S_Luhya-1 0 0.73005
237 S_Mongola-1 Luhya S_Luhya-2 0 0.73002
238 S_Mongola-1 Mandenka S_Mandenka-2 0 0.72934
239 S_Mongola-1 Gambian S_Gambian-1 0 0.72933
240 S_Mongola-1 Esan S_Esan-2 0 0.72920
241 S_Mongola-1 Yoruba S_Yoruba-2 0 0.72879
242 S_Mongola-1 Mandenka S_Mandenka-1 0 0.72872
243 S_Mongola-1 Yoruba S_Yoruba-1 0 0.72816
244 S_Mongola-1 Esan S_Esan-1 0 0.72810
245 S_Mongola-1 Mende S_Mende-1 0 0.72793
246 S_Mongola-1 Mende S_Mende-2 0 0.72788
247 S_Mongola-1 Biaka S_Biaka-1 0 0.72484
248 S_Mongola-1 Biaka S_Biaka-2 0 0.72347
249 S_Mongola-1 Mbuti S_Mbuti-3 0 0.72046
250 S_Mongola-1 Mbuti S_Mbuti-1 0 0.72010
251 S_Mongola-1 Mbuti S_Mbuti-2 0 0.72005
252 S_Mongola-1 Khomani_San S_Khomani_San-2 0 0.71521
253 S_Mongola-1 Ju_hoan_North S_Ju_hoan_North-2 0 0.71514
254 S_Mongola-1 Ju_hoan_North S_Ju_hoan_North-3 0 0.71460
255 S_Mongola-1 Khomani_San S_Khomani_San-1 0 0.71302






PROOF G25 distances shouldn't be trusted

The late Paleolithic African paper showed that there was Eurasian geneflow back to Africa in the Paleolithic that affected pretty much all Africans including Mbuti. In other words even Mbuti got some Eurasian genes during the Paleolithic. Least affected were Khomani and Ju-Hoan.

The IBS list I posted accurately shows this by showing Mongola closer to Mbuti than to Khomani and Ju-Hoan.

The G25 (scaled) on the other hand gets it all wrong. You can try it yourself. It wrongly shows Mongola significantly closer to Khomani-San than Mbuti ! If it gets this wrong then how should the pops be trusted.

Distance to: Mongola
0.918673 Khomani_San
0.98425066 Ju_hoan_North
0.99607508 Mbuti


Here's additional proof something is not right with the G25. Everyone should know that Eurasians such as Kurds should be closest to other Eurasians and not Africans.

G25 also wrongly shows Kurds closer to Yorubans and Esans than to Papuans which is absurd. Additionally, G25 wrongly shows Kurds closer to Sudanese than to Karitiana and Surui.

Additionally, G25 wrongly shows Kurds closer to Jordanians than to E. Europeans and Uyghurs. I could go on and on about the wrong rankings in G25.




G25 distance to: Kurdish
NO POPULATION DISTANCE
1 Turkish_Kayseri 0.04594
2 Armenian_B 0.04996
3 Abkhasian 0.07100
4 Adygei 0.07185
5 Chechen 0.07279
6 Jordanian 0.09159
7 Balochi 0.12169
8 Albanian 0.12363
9 Brahui 0.12457
10 Bulgarian 0.13177
11 French_Al 0.16473
12 BedouinB 0.16728
13 Hungarian 0.16929
14 Czech 0.18128
15 Basque_French 0.19215
16 Finnish 0.21537
17 Mozabite 0.23311
18 Saharawi 0.26496
19 Uygur 0.28771
20 Hazara 0.28992
21 Kirghiz 0.39622
22 Jarawa 0.42858
23 Somali 0.43369
24 Mongolian 0.46764
25 Mongola 0.55815
26 Eskimo_Sireniki 0.56139
27 Japanese 0.58489
28 Sudanese 0.69730
29 Karitiana 0.71006
30 Surui 0.71489
31 Yoruba 0.74242
32 Esan_Nigeria 0.74434
33 Papuan 0.78951
34 Khomani_San 0.83812
35 Ju_hoan_North 0.90933
36 Mbuti 0.92566

Zoro
03-10-2021, 06:50 PM
Unlike G25 the Plink IBS gene to gene comparison correctly shows Kurds closer to other Eurasians (Papuans, Karitiana, Surui) than to SSA. It also correctly shows Kurds closer to E. Europeans, Baloch, Brahui, Hazara and Uyghur than to Jordanians etc, etc


NO POPULATION DST
1 Lezgin 0.85119
2 Armenian 0.85040
3 Adygei 0.85039
4 Abkhasian 0.85027
5 Turkish-Kayseri 0.85012
6 Chechen 0.84983
7 Czech 0.84973
8 Hungarian 0.84956
9 Bulgarian 0.84940
10 French 0.84880
11 Basque 0.84860
12 Finnish 0.84860
13 Russian 0.84855
14 Estonian 0.84832
15 Sardinian 0.84817
16 Polish 0.84797
17 Pathan 0.84782
18 Tajik 0.84777
19 Kalash 0.84722
20 Sindhi 0.84702
21 Jew_Yemenite 0.84700
22 Tlingit 0.84695
23 Balochi 0.84675
24 Brahui 0.84615
25 Brahmin 0.84608
26 Samaritan 0.84603
27 BedouinB 0.84589
28 Saami 0.84589
29 Uyghur 0.84578
30 Makrani 0.84567
31 Mansi 0.84565
32 Bengali 0.84557
33 Punjabi 0.84517
34 Hazara 0.84498
35 Kyrgyz_Kyrgyzstan 0.84454
36 Jordanian 0.84422
37 Mala 0.84288
38 Tubalar 0.84250
39 Irula 0.84181
40 Even 0.84074
41 Mongola 0.84070
42 Tu 0.84029
43 Hezhen 0.84020
44 Mixtec 0.84018
45 Yakut 0.84000
46 Burmese 0.83998
47 Mexico_Zapotec.DG 0.83971
48 Xibo 0.83970
49 Naxi 0.83951
50 Han 0.83945
51 Korean 0.83923
52 Japanese 0.83898
53 Mayan 0.83886
54 Khonda_Dora 0.83884
55 Daur 0.83884
56 Tujia 0.83882
57 Quechua 0.83881
58 Eskimo_Sireniki.DG 0.83873
59 Oroqen 0.83861
60 Ulchi 0.83859
61 Eskimo_Naukan.DG 0.83855
62 She 0.83853
63 Miao 0.83845
64 Yi 0.83844
65 Itelmen 0.83824
66 Mixe 0.83819
67 Kinh 0.83813
68 China_Lahu 0.83783
69 Pima 0.83775
70 Thai 0.83774
71 Eskimo_Chaplin.DG 0.83767
72 Cambodian 0.83766
73 YANA_UP_WGS 0.83735
74 Dai 0.83730
75 Kusunda 0.83724
76 Piapoco 0.83703
77 Ami.DG 0.83696
78 Karitiana 0.83687
79 Surui 0.83654
80 Igorot 0.83649
81 Dusun 0.83639
82 Saharawi 0.83398
83 Mozabite 0.83287
84 Bougainville 0.83084
85 Papuan 0.82871
86 Somali 0.81444
87 Masai 0.80654
88 BantuKenya 0.79064
89 Luo 0.79045
90 Gambian 0.78966
91 Luhya 0.78919
92 Mandenka 0.78855
93 Esan 0.78710
94 Mende 0.78708
95 Yoruba 0.78690
96 Biaka 0.78118
97 Mbuti 0.77853
98 Ju_hoan_North 0.77354
99 Khomani_San 0.77330

Lucas
03-10-2021, 09:00 PM
Unlike G25 the Plink IBS gene to gene comparison correctly shows Kurds closer to other Eurasians (Papuans, Karitiana, Surui) than to SSA. It also correctly shows Kurds closer to E. Europeans, Baloch, Brahui, Hazara and Uyghur than to Jordanians etc, etc


Zoro, you're somewhat comparing apples to oranges: a list of Euclidean distances based on PCA values versus a direct gene-to-gene comparison.
Even if IBS were better for distances between pops, you can't make an admixture breakdown with it, which is what most people like.

Zoro
03-10-2021, 10:27 PM
Zoro, you're somewhat comparing apples to oranges: a list of Euclidean distances based on PCA values versus a direct gene-to-gene comparison.
Even if IBS were better for distances between pops, you can't make an admixture breakdown with it, which is what most people like.

One way to reword what you just said is that a one-to-one, gene-to-gene comparison using IBS is a more accurate method than G25 or an Admixture calculator for determining genetic similarity between two pops, say Kurds and Bulgarians or Mongolians.

I'm reminded of something Dilawer told me a while back. He said Admixture or PCA-based methods don't portray genetic similarity between two populations as accurately as a one-to-one IBS comparison. They just cluster based on geography and not based on genes. That's partly why individuals in a population have all sorts of phenotypes but Admixture or PCA still clusters them together.

Although PCA or Admixture puts Kurds or Poles within clusters, individual Poles or Kurds may show widely differing genetic similarity to Siberians or E. Asians depending on which components the calculator uses or which samples the G25 PCA used. By contrast, IBS results do not depend on any of this and are unaffected by which samples are used.

This may in fact be more closely aligned with their phenotypes than G25 or Admixture results which would cluster the Poles or Kurds within clusters and these clusters would not explain their individualistic phenotypes like IBS would explain.
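For reference, the DST value that PLINK reports in its `--genome` output is (IBS2 + 0.5*IBS1) / N, where IBS2, IBS1 and IBS0 are the numbers of SNPs at which the two individuals share 2, 1 or 0 alleles. A minimal awk sketch of that calculation, using two made-up genotype vectors (allele counts 0/1/2) and a hypothetical helper `ibs_dst`, with no handling of missing genotypes:

```shell
# Two hypothetical individuals coded as allele counts (0, 1 or 2) per SNP
a='0 1 2 2 1 0 0 2'
b='0 2 2 0 1 1 0 2'

# DST = (IBS2 + 0.5*IBS1) / N; the number of shared alleles at a SNP is
# 2 minus the absolute difference in allele counts.
ibs_dst() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR==1{for(i=1;i<=NF;i++)g[i]=$i; n=NF; next}
    {
      for(i=1;i<=n;i++){d=g[i]-$i; if(d<0)d=-d; ibs[2-d]++}
      printf "%.4f\n",(ibs[2]+0.5*ibs[1])/n
    }'
}

ibs_dst "$a" "$b"
```

Here five SNPs share both alleles, two share one and one shares none, so the result is (5 + 0.5*2)/8 = 0.7500. Unlike a distance computed from PCA coordinates, nothing in this number depends on which other samples are in the dataset.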

Komintasavalta
03-10-2021, 11:43 PM
Did you create the merged dataset yourself, or did you find it somewhere?

It's from this post by Razib Khan: https://www.gnxp.com/WordPress/2018/07/11/tutorial-to-run-pca-admixture-treemix-and-pairwise-fst-in-one-command/.


Even if IBS were better for distances between pops, you can't make an admixture breakdown with it, which is what most people like.

Khvorykh et al. 2020 even did admixture-style analysis based on the number of shared IBD segments: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7696950/:


The fourth stage of our computations is unique to this research and was absent in Fedorova et al. 2016. In this stage, we created Supplementary Table S4 using the program rankingATLAS2_v9.pl, and the data from the Supplementary Table S1 ("IBD Normalized Numbers"). Supplementary Table S4 presents the percentages of relative relatedness of each population to the nine Distinct Human Genetic Regions (DHGRs) (AFE, AFW, AMR, EUR, ARC, EAS, OCE, SAS, and MDE, see Results section). For each population (e.g., Georgia) the program counts the numbers of shared IBD fragments per pair of individuals for this population with the three representatives of DHGR region and then makes a sum of these three numbers. For example, the for the AFE region, the summing number of shared IBDs will be the following: 0.48 IBDs (per pair for Georgia vs. LWK) + 0.92 (Georgia vs. Din_AFR) + 3.12 (Georgia vs. Mas_AFR) = 4.52 (for the AFE group). And so on for each DHGR group. In order to minimize the Founder effect in our calculations, we created an upper threshold of 100 shared IBD segments for any populational pair. For example, in a calculation of Congo (Con_AFR) vs. LWK, the original value was 151.9, however, with the threshold in place, the program changed the value to 100). Finally, we calculated the relative percentages for all 9 components (AFE, AFW, AMR, EUR, ARC, EAS, OCE, SAS, and MDE) in a way that ensured their sum was always 100%. Ranking data for each population (as presented in Table 2) were also obtained by rankingATLAS2_v9.pl.
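
The arithmetic in that quote can be sketched in a few lines of Python: sum the normalized shared-IBD counts against the three representatives of each region, cap each pairwise value at 100, then rescale the region totals to percentages. This is only an illustration of the quoted description, not the authors' rankingATLAS2_v9.pl. The function names are hypothetical, the AFE numbers are the Georgia example from the quote, and the "OTHER" region with its three reference values is invented so that there is something to normalize against.

```python
CAP = 100.0  # upper threshold per population pair, as described in the quote


def region_totals(shared_ibd, regions, cap=CAP):
    """Sum capped shared-IBD counts over the reference pops of each region."""
    return {
        region: sum(min(shared_ibd[pop], cap) for pop in pops)
        for region, pops in regions.items()
    }


def to_percentages(totals):
    """Rescale region totals so that they sum to 100%."""
    grand_total = sum(totals.values())
    return {region: 100.0 * value / grand_total for region, value in totals.items()}


regions = {
    "AFE": ["LWK", "Din_AFR", "Mas_AFR"],    # representatives named in the quote
    "OTHER": ["REF_A", "REF_B", "REF_C"],    # placeholder region, invented
}
georgia = {
    "LWK": 0.48, "Din_AFR": 0.92, "Mas_AFR": 3.12,  # values from the quote
    "REF_A": 5.0, "REF_B": 6.0, "REF_C": 7.0,       # invented values
}

totals = region_totals(georgia, regions)
print(totals)
print(to_percentages(totals))
```

The real table uses nine regions, so the percentages printed here mean nothing beyond showing the mechanics.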

Here's a graph I made of some populations from Khvorykh's table S4:

https://i.ibb.co/3dkkgnx/khvorykh-ibd.png


curl -Ls pastebin.com/raw/BmNdqWvi|tr -d \\r>/tmp/tables4
printf %s\\n Sau_MDE Ira_MDE Rom_EUR Gre_EUR Ger_EUR GBR_EUR Swe_EUR Lat_EUR Rus_EUR Est_EUR Fin_EUR FIN_EUR Ing_EUR Kar_EUR Vep_EUR Saa_EUR Mor_EUR Kom_EUR Udm_EUR Mar_EUR Mis_EUR Kry_EUR Tat_EUR Chu_EUR BSh_EUR Man_SIB Kha_SIB Tun_SIB For_SIB Nen_SIB Nga_SIB Bur_SIB Yak_SIB Ale_ARC>/tmp/pop
awk -F, 'NR==1{print;next}NR==FNR{a[$1]=$0;next}$1 in a{print a[$1]}' /tmp/tables4 /tmp/pop|awk -F, -v OFS=, '{print$2,$6,$11,$10,$7,$8,$5,$9,$3,$4}'>/tmp/a
R -e 'library("ggplot2")
library("reshape2")
library("forcats")

t=read.csv("/tmp/a",header=T,check.names=F)

t2=melt(t,id.var="Population")

lab=round(t2$value)
lab[lab<=2]=""
t2$lab=lab
t2$value=t2$value/100

p=ggplot(t2,aes(x=fct_rev(factor(Population,levels=unique(Population))),y=value,fill=variable))+
geom_bar(stat="identity",width=1,position=position_fill(reverse=T))+
geom_text(aes(label=lab),position=position_stack(vjust=.5,reverse=T),size=2.5)+
coord_flip()+
theme(
axis.text=element_text(color="black"),
axis.text.x=element_blank(),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
legend.margin=margin(0),
legend.title=element_blank(),
panel.background=element_rect(fill="white")
)+
xlab("")+
scale_x_discrete(expand=c(0,0))+
scale_y_continuous(expand=c(0,0))
ggsave("/tmp/a.png",p,width=6,height=7)'

The proportion of the Northern European component was defined based on the number of shared IBD segments with Estonians, Germans, and Swedes. So for example Swedes have a higher proportion of the Northern European component than Latvians.

Komintasavalta
03-11-2021, 12:14 AM
BTW what was G25 made with? The AG user anglesqueville said it was made with SmartPCA (https://anthrogenica.com/showthread.php?22231-does-g25-have-a-north-and-especially-east-european-bias/page2):


G25 is not a so-called "calculator", it is a PCA calculated directly on a large "raw data" database (of allele readings) using a well-known program (smartpca, Eigensoft package, Nick Patterson).

However when I tried googling "smartpca site:eurogenes.blogspot.com", there were only two hits, neither of which even matched text written by Davidski.

It's possible to encode a 10,000 by 10,000 matrix of distances between populations as a 10,000 by 25 matrix where the columns are principal components. Then you can retrieve the original distance between two populations fairly accurately by calculating the Euclidean distance between their rows.

For example, here I generated an 11 by 11 matrix of FST distances:


R -e 'library(admixtools);
f2m=function(x){t=as.data.frame(x[,1:3]);t2=rbind(t,setNames(t[,c(2,1,3)],names(t)));xtabs(t2[,3]~t2[,2]+t2[,1])};
fst=fst("g/v44.3_1240K_public/v44.3_1240K_public",c("Biaka.DG","Even.DG","Finnish.DG","Ju_hoan_North.DG","Khomani_San.DG","Korean.DG","Mbuti.DG","Mongola.DG","Papuan.DG","Turkey_N.DG","Yoruba.DG"));
write.csv(round(f2m(fst),6),"fst",quote=F)'
$ cat fst
,Biaka.DG,Even.DG,Finnish.DG,Ju_hoan_North.DG,Khomani_San.DG,Korean.DG,Mbuti.DG,Mongola.DG,Papuan.DG,Turkey_N.DG,Yoruba.DG
Biaka.DG,0,0.212276,0.182032,0.086521,0.093686,0.208092,0.055175,0.200832,0.264921,0.19757,0.037891
Even.DG,0.212276,0,0.099165,0.260155,0.269936,0.027304,0.243293,0.020451,0.188681,0.138516,0.189624
Finnish.DG,0.182032,0.099165,0,0.22675,0.236001,0.102589,0.211397,0.089601,0.188651,0.03734,0.156253
Ju_hoan_North.DG,0.086521,0.260155,0.22675,0,0.034955,0.255676,0.102751,0.247671,0.311007,0.244202,0.108353
Khomani_San.DG,0.093686,0.269936,0.236001,0.034955,0,0.264307,0.110281,0.256679,0.319966,0.253402,0.115599
Korean.DG,0.208092,0.027304,0.102589,0.255676,0.264307,0,0.238141,0.001142,0.178226,0.136865,0.184756
Mbuti.DG,0.055175,0.243293,0.211397,0.102751,0.110281,0.238141,0,0.230583,0.294664,0.228177,0.077978
Mongola.DG,0.200832,0.020451,0.089601,0.247671,0.256679,0.001142,0.230583,0,0.171326,0.130389,0.176566
Papuan.DG,0.264921,0.188681,0.188651,0.311007,0.319966,0.178226,0.294664,0.171326,0,0.215617,0.241977
Turkey_N.DG,0.19757,0.138516,0.03734,0.244202,0.253402,0.136865,0.228177,0.130389,0.215617,0,0.172992
Yoruba.DG,0.037891,0.189624,0.156253,0.108353,0.115599,0.184756,0.077978,0.176566,0.241977,0.172992,0

Classical multidimensional scaling (MDS) produces the same coordinates as PCA when the input distances are Euclidean (up to the sign of each axis), but the difference is that it takes a distance matrix as input. I used MDS to reduce the distance matrix to three dimensions:


$ R -e 't=read.csv("fst",row.names=1,header=T);cmdscale(as.dist(t),k=3)'
[,1] [,2] [,3]
Biaka.DG 0.09458067 -0.009318035 0.0007634203
Even.DG -0.10587237 0.033672133 -0.0493091783
Finnish.DG -0.06971126 0.039180919 0.0443036464
Ju_hoan_North.DG 0.14384037 -0.005407783 -0.0079752958
Khomani_San.DG 0.15305612 -0.005072182 -0.0095401289
Korean.DG -0.10263674 0.022172427 -0.0479094108
Mbuti.DG 0.12082958 -0.006742200 -0.0017669591
Mongola.DG -0.09712661 0.017649424 -0.0402117613
Papuan.DG -0.13332805 -0.137725617 0.0231446908
Turkey_N.DG -0.07026603 0.060365299 0.0804792633
Yoruba.DG 0.06663432 -0.008774385 0.0080217135
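
For anyone curious what cmdscale does under the hood, here is a minimal pure-Python sketch of classical (Torgerson) MDS: square the distances, double-center them, and take the top eigenvectors scaled by the square roots of their eigenvalues. The function name classical_mds, the simple power-iteration eigen-solver and the toy 3 by 4 rectangle are choices made for this illustration only. As far as I know R relies on a full eigendecomposition instead, and the sketch assumes the distances are close to Euclidean, which FST only approximately is.

```python
import math


def classical_mds(dist, k, iters=500):
    """Embed an n x n symmetric distance matrix in k dimensions."""
    n = len(dist)
    # Double-center the squared distances: B = -1/2 * J * D^2 * J
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * k for _ in range(n)]
    for dim in range(k):
        # Power iteration for the dominant remaining eigenvector of B
        v = [1.0 / (i + 1) for i in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        if lam <= 1e-12:
            break  # no positive eigenvalue left, remaining axes stay zero
        for i in range(n):
            coords[i][dim] = v[i] * math.sqrt(lam)
        # Deflate so the next pass finds the next eigenvector
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Toy input: corners of a 3 x 4 rectangle, given only as distances
rect = [
    [0, 3, 5, 4],
    [3, 0, 4, 5],
    [5, 4, 0, 3],
    [4, 5, 3, 0],
]
xy = classical_mds(rect, k=2)
for i in range(4):
    for j in range(i + 1, 4):
        print(i, j, rect[i][j], round(math.dist(xy[i], xy[j]), 6))
```

Because the rectangle really is two-dimensional, two coordinates reproduce every distance. With real FST values the leftover eigenvalues are not zero, which is why the 3-coordinate version only approximates the input.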

Then even though there are only 3 principal components, I can still retrieve the original distance between a pair of populations fairly accurately:


$ R -e 't=read.csv("fst",row.names=1,header=T);c=cmdscale(as.dist(t),k=3); sqrt(sum((c["Biaka.DG",]-c["Even.DG",])^2))'
[1] 0.2110375

With 25 components, it's possible to encode the distances even between tens of thousands of populations more or less accurately. If more components were necessary, you could just as well make a G50 or G100 or something.
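
The same check can be repeated for other pairs without R. The sketch below hard-codes a few of the three-dimensional coordinates and FST values printed above and compares each original FST with the Euclidean distance between the coordinate rows. The helper name reconstructed and the choice of pairs are arbitrary.

```python
import math

# MDS coordinates copied from the cmdscale output above
coords = {
    "Biaka.DG": (0.09458067, -0.009318035, 0.0007634203),
    "Even.DG": (-0.10587237, 0.033672133, -0.0493091783),
    "Finnish.DG": (-0.06971126, 0.039180919, 0.0443036464),
    "Papuan.DG": (-0.13332805, -0.137725617, 0.0231446908),
}

# Original FST values copied from the matrix above
fst = {
    ("Biaka.DG", "Even.DG"): 0.212276,
    ("Biaka.DG", "Finnish.DG"): 0.182032,
    ("Even.DG", "Finnish.DG"): 0.099165,
    ("Finnish.DG", "Papuan.DG"): 0.188651,
}


def reconstructed(a, b):
    """Euclidean distance between the coordinate rows of two populations."""
    return math.dist(coords[a], coords[b])


for (a, b), original in fst.items():
    print(f"{a} vs {b}: FST {original:.6f}, from 3 coordinates {reconstructed(a, b):.6f}")
```

For Biaka.DG vs Even.DG this gives the same 0.2110375 as the R one-liner. The other pairs are printed so the size of the approximation error can be judged directly.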

Zoro
03-11-2021, 01:53 AM
Very good. You're thinking outside the box! Yes, of course you can make a calculator based on FST or IBS. You can do IBS between the target and WHG, ENF, ANS, etc., and even square the individual results to create bigger differences between them, then assign each a prorated proportion of 100%.

At least it wouldn't have the biases and variability of G25 or Admixture, where the results depend on the other samples in the run.
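
A rough sketch of that idea in Python: raise each IBS similarity to a power (2 for squaring) and prorate the results so they sum to 100%. The function name and the IBS values are invented for illustration. Nobody in the thread has published a calculator that works this way, so treat it as a toy.

```python
def prorated_shares(similarity, power=2):
    """Raise each similarity to `power` and rescale the results to sum to 100."""
    weights = {ref: value ** power for ref, value in similarity.items()}
    total = sum(weights.values())
    return {ref: 100.0 * weight / total for ref, weight in weights.items()}


# Invented IBS similarities between a target and three reference groups
ibs_to_refs = {"WHG": 0.84, "ENF": 0.85, "ANS": 0.83}

shares = prorated_shares(ibs_to_refs)
for ref, share in shares.items():
    print(f"{ref}: {share:.1f}%")
```

One thing the toy numbers make visible: when the IBS values sit as close together as they do in the tables posted in this thread, squaring alone moves the shares only slightly away from equal thirds, so a usable version would probably need a baseline subtracted first or a much higher power.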

Peterski
03-11-2021, 03:53 AM
Very interesting!

Lucas
03-11-2021, 09:41 AM
BTW what was G25 made with? The AG user anglesqueville said it was made with SmartPCA (https://anthrogenica.com/showthread.php?22231-does-g25-have-a-north-and-especially-east-european-bias/page2):


G25 is not a so-called "calculator", it is a PCA calculated directly on a large "raw data" database (of allele readings) using a well-known program (smartpca, Eigensoft package, Nick Patterson).

However when I tried googling "smartpca site:eurogenes.blogspot.com", there were only two hits, neither of which even matched text written by Davidski.

It's possible to encode a 10,000 by 10,000 matrix of distances between populations as a 10,000 by 25 matrix where the columns are PC components. Then you can retrieve the original distances between two rows of the table fairly accurately by calculating the Euclidean distance between the rows.

With 25 components, it's possible to encode the distances even between tens of thousands of populations more or less accurately. If more components would be necessary, you could just as well make a G50 or G100 or something.

It is definitely SmartPCA, not PLINK PCA. Davidski mentions it here: https://eurogenes.blogspot.com/2017/05/pca-projection-bias-fix.html

BTW user vbknethio created G30 using SmartPCA: https://www.theapricity.com/forum/showthread.php?314681-BCE-G30-beta-PCA-with-6000-samples. It is the same kind of product as G25; he just didn't have the same samples.

Zanzibar
06-09-2021, 05:16 AM
I wonder why the Mari have about 3-3.5 times more East Eurasian than the Mordovians (~30% vs ~10%). Those two republics are not even that far away from each other.

Could it be that the Mari directly absorbed and mixed with Bashkirs and Ugrics like Khanty-Mansi, while Mordovians received East Eurasian from an indirect source? That could be why.

Anyway, do you consider groups like Mari, Udmurt and Saami European or more mixed race?

Zanzibar
06-09-2021, 05:26 AM
I didn't know Udmurts had such high Mongoloid ancestry considering the predominance of red hair in them


Yep, they are genetically around 25% Mongoloid at least, or a bit more. It's surprising how much red hair they have considering how much East Asian ancestry they have. They are literally quapas, aka 1/4 Asian lol.

LorenzoSpitaleri
06-09-2021, 06:12 AM
Yep, they are genetically around 25% Mongoloid at least, or a bit more. It's surprising how much red hair they have considering how much East Asian ancestry they have. They are literally quapas, aka 1/4 Asian lol.

Very impressive. I used to think Mongoloid genes negated any chance of red hair appearing. But now that I think of it, I have seen even straight-up mestizos and even darker mixes with that hair colour.

Zanzibar
06-09-2021, 07:06 AM
Very impressive. I used to think Mongoloid genes negated any chance of red hair appearing. But now that I think of it, I have seen even straight-up mestizos and even darker mixes with that hair colour.

Whoa, you have seen mestizos with red hair? By straight-up, do you mean someone who is 50-50, with one white parent and one Native parent?

Do you consider Udmurts and others like Mari and Saami European, or do you think they are mixed race? They are the reverse version of Siberian Turkic/Central Asian ethnicities like Altaians, Kyrgyz and some Kazakhs, who are approximately 25-35% Caucasoid lol.

They are literally the equivalent of someone who has an Asian grandparent lol.

Komintasavalta
06-09-2021, 04:44 PM
Could it be that the Mari directly absorbed and mixed with Bashkirs and Ugrics like Khanty-Mansi, while Mordovians received East Eurasian from an indirect source? That could be why.

Bashkirs have a lot of eastern mtDNA haplogroups but Maris don't. The distribution of eastern mtDNA haplogroups in Bashkirs is otherwise similar to Khanty and Mansi, but Bashkirs have more F, which is common in Shors and Khakasses.

One mysterious thing is that Udmurts have a huge amount of eastern mtDNA compared to Maris and Chuvashes.

https://i.ibb.co/hZHvfF5/tambets-ydna-mtdna-xy.png
https://i.ibb.co/ys9zNMj/complexheatmap-tambets-2018-mtdna.png

LorenzoSpitaleri
06-10-2021, 12:45 AM
Whoa, you have seen mestizos with red hair? By straight-up, do you mean someone who is 50-50, with one white parent and one Native parent?

Do you consider Udmurts and others like Mari and Saami European, or do you think they are mixed race? They are the reverse version of Siberian Turkic/Central Asian ethnicities like Altaians, Kyrgyz and some Kazakhs, who are approximately 25-35% Caucasoid lol.

They are literally the equivalent of someone who has an Asian grandparent lol.

By that I meant people with very strong mixed Amerindian features who look like balanced mestizos, and even people less white than that. A good example is this guy from my city. I don't know him personally, but he is known for being a triracial redhead.
https://i.imgur.com/JJ2hOtI.jpg
https://i.imgur.com/rkVF7OC.jpg
https://i.imgur.com/eN0hv9E.jpg

Zanzibar
06-10-2021, 03:33 AM
By that I meant people with very strong mixed Amerindian features who look like balanced mestizos, and even people less white than that. A good example is this guy from my city. I don't know him personally, but he is known for being a triracial redhead.


That's his natural hair color? Ah ok. Have you seen Amerindians with red hair though?

Btw do you consider Udmurts to be mixed race given their high Mongoloid DNA?

LorenzoSpitaleri
06-10-2021, 06:14 PM
That's his natural hair color? Ah ok. Have you seen Amerindians with red hair though?

Btw do you consider Udmurts to be mixed race given their high Mongoloid DNA?

Yes it is. And yes, but mixed Amerindians, not pure ones.

They're on the way, like I said somewhere else: mixed-white range.

Zanzibar
06-11-2021, 01:48 AM
Yes it is. And yes, but mixed Amerindians, not pure ones.

They're on the way, like I said somewhere else: mixed-white range.

Ok.

Alright. Udmurts score about as much Mongoloid as Castizos score Amerindian or Quadroons score SSA. Would you also consider Castizos and Quadroons as being on the way, in the mixed-white range?

Roy
06-12-2021, 06:50 PM
Unlike G25 the Plink IBS gene to gene comparison correctly shows Kurds closer to other Eurasians (Papuans, Karitiana, Surui) than to SSA. It also correctly shows Kurds closer to E. Europeans, Baloch, Brahui, Hazara and Uyghur than to Jordanians etc, etc


NO  POPULATION  DST
1  Lezgin  0.85119
2  Armenian  0.85040
3  Adygei  0.85039
4  Abkhasian  0.85027
5  Turkish-Kayseri  0.85012
6  Chechen  0.84983
7  Czech  0.84973
8  Hungarian  0.84956
9  Bulgarian  0.84940
10  French  0.84880
11  Basque  0.84860
12  Finnish  0.84860
13  Russian  0.84855
14  Estonian  0.84832
15  Sardinian  0.84817
16  Polish  0.84797
17  Pathan  0.84782
18  Tajik  0.84777
19  Kalash  0.84722
20  Sindhi  0.84702
21  Jew_Yemenite  0.84700
22  Tlingit  0.84695
23  Balochi  0.84675
24  Brahui  0.84615
25  Brahmin  0.84608
26  Samaritan  0.84603
27  BedouinB  0.84589
28  Saami  0.84589
29  Uyghur  0.84578
30  Makrani  0.84567
31  Mansi  0.84565
32  Bengali  0.84557
33  Punjabi  0.84517
34  Hazara  0.84498
35  Kyrgyz_Kyrgyzstan  0.84454
36  Jordanian  0.84422
37  Mala  0.84288
38  Tubalar  0.84250
39  Irula  0.84181
40  Even  0.84074
41  Mongola  0.84070
42  Tu  0.84029
43  Hezhen  0.84020
44  Mixtec  0.84018
45  Yakut  0.84000
46  Burmese  0.83998
47  Mexico_Zapotec.DG  0.83971
48  Xibo  0.83970
49  Naxi  0.83951
50  Han  0.83945
51  Korean  0.83923
52  Japanese  0.83898
53  Mayan  0.83886
54  Khonda_Dora  0.83884
55  Daur  0.83884
56  Tujia  0.83882
57  Quechua  0.83881
58  Eskimo_Sireniki.DG  0.83873
59  Oroqen  0.83861
60  Ulchi  0.83859
61  Eskimo_Naukan.DG  0.83855
62  She  0.83853
63  Miao  0.83845
64  Yi  0.83844
65  Itelmen  0.83824
66  Mixe  0.83819
67  Kinh  0.83813
68  China_Lahu  0.83783
69  Pima  0.83775
70  Thai  0.83774
71  Eskimo_Chaplin.DG  0.83767
72  Cambodian  0.83766
73  YANA_UP_WGS  0.83735
74  Dai  0.83730
75  Kusunda  0.83724
76  Piapoco  0.83703
77  Ami.DG  0.83696
78  Karitiana  0.83687
79  Surui  0.83654
80  Igorot  0.83649
81  Dusun  0.83639
82  Saharawi  0.83398
83  Mozabite  0.83287
84  Bougainville  0.83084
85  Papuan  0.82871
86  Somali  0.81444
87  Masai  0.80654
88  BantuKenya  0.79064
89  Luo  0.79045
90  Gambian  0.78966
91  Luhya  0.78919
92  Mandenka  0.78855
93  Esan  0.78710
94  Mende  0.78708
95  Yoruba  0.78690
96  Biaka  0.78118
97  Mbuti  0.77853
98  Ju_hoan_North  0.77354
99  Khomani_San  0.77330

Why are Kurds closer to Europeans than to Jordanians? It makes no sense to me.

Ayetooey
06-12-2021, 06:56 PM
Because a lot of users here are Pan-Europeanists who refuse to acknowledge how genetically diverse this continent truly is.