G25 models on VURers and other Uralics



Zanzibar
03-03-2021, 02:45 AM
What do you think? The distances in these runs are not great, but they give a rough estimate of the autosomal composition of each Uralic and VURer population. Hungarians and Estonians seem to be the least genetically Uralic here and have the highest EEF, while Khanty, Mansi, Selkups and Nenets seem to be the most Uralic and have the least EEF admixture.

https://i.imgur.com/pOsEBrV.png

What I find very interesting is how little Neolithic farmer ancestry the Saami have, even less than the VURers. The Saami seem to be an almost isolated population despite living in northern Scandinavia. Saami_Kola, on the other hand, despite being in Russia, have higher EEF than the Saami samples, which probably indicates recent admixture with other European populations.

Comparison between Saami and Saami_Kola: as you can see, Saami_Kola are genetically more European-shifted and less East Asian-shifted than the Saami.

https://i.imgur.com/SujGz5Q.png

Now I decided to model VURers and other Uralics with neighboring populations and the results are surprising:

https://i.imgur.com/LGdbUct.png

P.S. FIN_Levanluhta_o is an ancient Germanic individual from Finland; Czech_Early_Slav is an ancient Slav from Czechia; Baltic_EST_BA is a Bronze Age individual from Estonia; Uyelgi is ancient Uralic; Ancient_Finno_Ugric is built from an ancient Finnic individual from Viking Age Norway (VK2020_NOR_North_VA_o1); KAZ_Botai is a West Siberian hunter-gatherer population that is a mix of ANE, EHG and East Asian; Altaian and Uzbek are there to detect any Turkic admixture; Tajik_Yagnobi and Iranian_Zoroastrian serve as proxies for Iranic affinity.

I did the same thing with the Saami and Saami_Kola individuals. It seems most of the non-Uralic admixture in the Saami comes from Balts, followed by Nordics (Levanluhta_o), while the non-Uralic ancestry in Saami_Kola seems to come from Balts, Scandinavians and Slavs.

https://i.imgur.com/2wlrJnm.png

Zanzibar
03-03-2021, 06:56 AM
Here is how individual Udmurts, Maris, Chuvashes, Besermyans, Komis and Bashkirs score:

https://i.imgur.com/XY6NfIP.png

https://i.imgur.com/aryW4LI.png

https://i.imgur.com/TycNKPb.png

https://i.imgur.com/IDiadhs.png

Here is how they score in another model that I simulated:

https://i.imgur.com/fMwCMhF.png

https://i.imgur.com/z7CGGop.png

https://i.imgur.com/n3Dw5oQ.png

https://i.imgur.com/Xhcov9S.png

Zanzibar
03-03-2021, 07:15 AM
Here is how individual Mansi, Khanty, Nenets and Selkups score:

https://i.imgur.com/LHb1ZJF.png

https://i.imgur.com/EUS6WgK.png

https://i.imgur.com/GGEMJ6A.png

Compare to another model I did:

https://i.imgur.com/WVanK85.png

https://i.imgur.com/vgva93l.png

https://i.imgur.com/x2lwqc5.png

Zanzibar
03-03-2021, 09:27 AM
Here are the results of individual Finns, Karelians, Vepsians, Ingrians, Mordovians, Estonians, Russian_Pinega and Hungarians:

https://i.imgur.com/xuASUwa.png

https://i.imgur.com/aESc6uT.png

https://i.imgur.com/ad7BcXA.png

https://i.imgur.com/NmLrbrq.png

https://i.imgur.com/SpWWTZC.png

Here is another model of them:

https://i.imgur.com/kALHBe4.png

https://i.imgur.com/PrbTrlI.png

https://i.imgur.com/AmffQiq.png

https://i.imgur.com/7Rf2gLU.png

https://i.imgur.com/9eXhMgH.png

Komintasavalta
03-03-2021, 12:19 PM
I tried using all populations from your first model as targets, except I didn't use Tundra_Nentsi or Forest_Nentsi, because they are not included in the official datasheet, and I added Nganasans who were missing. I used all rows from the ancient averages sheet as sources, except rows containing "_o", "_contam", or "_low_res".
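The source filter described above can be sketched in shell. This is only a minimal sketch, assuming the ancient averages are a CSV whose first field is the population label; `filter_sources` and the `Pop_*` rows are hypothetical placeholders, not real G25 data.

```shell
# Drop rows whose population label (field 1) contains "_o", "_contam",
# or "_low_res". Labels and coordinates below are hypothetical placeholders.
filter_sources() {
  awk -F, '$1 !~ /_o|_contam|_low_res/'
}

filtered=$(printf '%s\n' \
  'Pop_A,0.1,0.2,0.3' \
  'Pop_B_o,0.1,0.2,0.3' \
  'Pop_C_contam,0.1,0.2,0.3' \
  'Pop_D_low_res,0.1,0.2,0.3' \
  'Pop_E,0.1,0.2,0.3' | filter_sources)
printf '%s\n' "$filtered"
```

Note that a plain substring match on "_o" also drops labels such as "_o1", and any label where an underscore happens to be followed by a lowercase "o", so the pattern may need tightening for a real datasheet.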

These were initially the sources that got the highest average percentage:

15.0 RUS_Krasnoyarsk_BA
11.6 FIN_Levanluhta_IA
5.8 Baltic_EST_BA
5.4 UKR_MBA
5.4 KAZ_Golden_Horde_Euro
4.9 RUS_Chalmny-Varre
4.0 VK2020_UKR_Shestovitsa_VA
3.7 RUS_Ingria_IA
3.4 Baltic_LTU_BA
2.3 Baltic_EST_MA
2.2 RUS_Yakutia_Ymyiakhtakh_LN
1.9 VK2020_SWE_Uppsala_VA
1.9 RUS_Samara_HG
1.7 RUS_Maykop
1.7 KAZ_Botai
1.6 MNG_Chandman_IA
1.5 MNG_SHU002
1.5 KAZ_Zevakinskiy_BA
1.4 CZE_Early_Slav
1.2 VK2020_RUS_Kurevanikha_VA
1.2 USA_colonial_period
1.2 KAZ_Tasbas_IA
1.1 DEU_LBK_KD
1.0 VK2020_POL_Sandomierz_VA
0.8 KAZ_Kimak
0.7 VK2020_RUS_Pskov_VA
0.7 RUS_Tyumen_HG
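A ranking like the one above can be pulled out of a MULTI-tab CSV export with a few lines of awk. This is a sketch under an assumption: the layout is the one the R script further down expects, with a header row, the target name in column 1, the distance in column 2, one column per source after that, and a final row holding the averages. `rank_sources` and the `Src_*`/`Pop_*` names are hypothetical.

```shell
# Print "average-percentage source" pairs from the final (averages) row,
# highest first. Assumed layout: Target,Distance,Source1,Source2,...
rank_sources() {
  LC_ALL=C awk -F, '
    NR == 1 { for (i = 3; i <= NF; i++) src[i] = $i; next }
    { n = NF; for (i = 3; i <= NF; i++) last[i] = $i }
    END { for (i = 3; i <= n; i++) printf "%.1f %s\n", last[i], src[i] }
  ' | LC_ALL=C sort -k1,1nr
}

ranked=$(printf '%s\n' \
  'Target,Distance,Src_A,Src_B,Src_C' \
  'Pop_1,0.020,10,70,20' \
  'Pop_2,0.030,20,50,30' \
  'Average,0.025,15,60,25' | rank_sources)
printf '%s\n' "$ranked"
```

The middle awk rule overwrites `last` on every data row, so after the final line it holds the averages row.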

Next I did a PCA of the sources above with clustering:

https://i.ibb.co/0qgJwf3/a.png

I picked one population from each cluster as a source in a new model. I didn't even try to choose the sources so that they minimized the average distance. For example, I picked KAZ_Golden_Horde_Euro from the Northern European cluster because it sounded the coolest.

https://i.ibb.co/hFPrBKc/uralic-vur-vahaduo-model.png

The average distance was .020 for the first model, .022 for the second model, and .029 for the third model.

What's surprising is that Chuvashes, Maris, and Udmurts got such a high percentage of Chalmny Varre. Chalmny Varre is an 18th-19th century Saami cemetery on the Kola Peninsula. Maybe Udmurts and Besermyans have a lower percentage of Chalmny Varre partly because they have over 10% RUS_Tyumen_HG (WSHG).

Nganasans are modeled as 98% RUS_Krasnoyarsk_BA (which only consists of the single individual kra001 that Davidski has recently speculated might be Proto-Uralic). When I removed Chalmny Varre, Finnish_East got 7% Krasnoyarsk_BA, Udmurts got 19%, Saami got 24%, and Maris got 29%.

MNG_SHU002 is the main East Asian component in Tatar_Siberian, Tatar_Crimean_steppe, and Tatar_Lipka. However, in Zabolotniye Tatars (swamp Tatars), who are considered to be Turkified Uralics, the main East Asian component is the more Uralic-like RUS_Krasnoyarsk_BA.

The proportion of Chalmny Varre is 4% in Estonians, 22% in Finns, 33% in Karelians, 37% in Vepsians, and 43% in Pinega Russians. They all have 0% of Krasnoyarsk_BA.

DEU_LBK_KD and RUS_Maykop are the two most Near Eastern-shifted components. LBK (Linearbandkeramik) has the highest percentage in Hungarians, as expected. RUS_Maykop is more common in the Volga-Ural region but absent among Baltic Finnic peoples and Saami. Out of modern population averages, it is closest to Tajiks and peoples of the Caucasus:

Distance to: RUS_Maykop
.066 Tajik_Rushan
.070 Tajik_Shugnan
.071 Darginian
.072 Tajik_Yagnobi
.077 Avar
.077 Lak
.078 Kubachinian
.079 Kaitag
.081 Tabasaran
.082 Tajik_Ishkashim
.098 Tajik
.099 Chechen
.105 Balkar
.106 Kumyk
.107 Ingushian
.109 Cherkes
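A nearest-population list like the one above can be reproduced outside Vahaduo. The sketch below assumes the figures are plain Euclidean distances between coordinate rows in `name,coord1,coord2,...` format; `nearest` is a hypothetical helper, and the toy rows use two made-up coordinates instead of the 25 real ones.

```shell
# nearest TARGET: read name,coord1,coord2,... rows on stdin and print the
# Euclidean distance from TARGET to every other row, closest first.
nearest() {
  LC_ALL=C awk -F, -v target="$1" '
    { name[NR] = $1
      for (i = 2; i <= NF; i++) v[NR, i] = $i
      if ($1 == target) { t = NR; nf = NF } }
    END {
      if (!t) exit 1                      # target row not found
      for (r = 1; r <= NR; r++) {
        if (r == t) continue
        s = 0
        for (i = 2; i <= nf; i++) { d = v[r, i] - v[t, i]; s += d * d }
        printf "%.3f %s\n", sqrt(s), name[r]
      }
    }' | LC_ALL=C sort -n
}

closest=$(printf '%s\n' 'T,0,0' 'A,0.03,0.04' 'B,0.006,0.008' | nearest T)
printf '%s\n' "$closest"
```

Unlike the list above, this prints a leading zero (0.050 rather than .050).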

You can generate a clustered heatmap of CSV data from Vahaduo's MULTI tab by running this in R:


library(pheatmap)
library(colorspace) # provides hex() and HSV(), used for the palette below

download.file("https://drive.google.com/uc?export=download&id=1wZr-UOve0KUKo_Qbgeo27m-CQncZWb8y","modernave")

t=read.csv("input-from-multi-tab-in-vahaduo.csv",header=T,row.names=1,check.names=F)
avedist=t[nrow(t),1]
t=t[-nrow(t),]
t=t[order(row.names(t)),]

t2=read.csv("modernave",header=T,row.names=1,check.names=F)
t3=t2[row.names(t2)%in%row.names(t),]
k=hclust(dist(t3))

row.names(t)=paste0(row.names(t)," (",sub("^0","",sprintf("%.3f",t[,1])),")")
t=t[-c(1)]

pheatmap(
t,
filename="/tmp/a.png",
clustering_callback=function(...){c(k)},
cluster_cols=F,
legend=F,
main=paste("Average distance:",sub("^0","",sprintf("%.3f",avedist))),
cellwidth=16,
treeheight_row=100,
cellheight=16,
fontsize=10,
border_color=NA,
display_numbers=T,
number_format="%.0f",
fontsize_number=8,
number_color="black",
breaks=seq(0,100,100/256),
color=colorRampPalette(hex(HSV(c(210,210,90,60,40,20,0,0),c(0,.4,.6,.6,.6,.6,.6,.8),c(1,1,1,1,1,1,1,.6))))(256)
)

Here's another PCA with both my source and target populations. This time Tyumen_HG (WSHG) ended up clustering with Samara_HG (EHG). I chose too few clusters, so Hungarians clustered together with Finns. PC3 differentiates WSHG from Nganasan, and PC4 differentiates MNG_SHU002 (an East Asian source of Turkic peoples) from RUS_Maykop (Tajik/Caucasus-like).

https://i.ibb.co/Dp5KW3J/vahaduo-uralic-vur-model-pc12.png
https://i.ibb.co/TmQvkDR/vahaduo-uralic-vur-model-pc34.png

Zanzibar
03-03-2021, 12:35 PM
Well, there are individual Forest_Nentsi and Tundra_Nentsi samples in G25, but they haven't been averaged yet, so I decided to do it myself. I will have to ask Lucas to include them in the modern averages spreadsheet as well. Here are the averages. You can include them in your runs:



Forest_Nentsi,0.067358893,-0.288337893,0.131682429,0.03657975,-0.1223635,-0.056027214,0.017541964,0.025770893,0.01321375,-0.014442179,0.078851179,0.001760929,0.011961857,-0.066845357,-0.022844643,-0.012894357,-0.004316571,0.006406786,0.010437357,-0.0099065,0.007170393,0.02140075,0.036366893,-0.007160857,-0.004896821

Tundra_Nentsi,0.069782231,-0.273255462,0.127408769,0.034064077,-0.116045308,-0.050779615,0.017155769,0.025703077,0.012586154,-0.013401462,0.069527231,0.001902077,0.010554923,-0.054773769,-0.018228231,-0.009842231,1E-05,0.005145308,0.006072154,-0.007859538,0.008648231,0.016293692,0.027787692,-0.002224615,-0.001381692


Very interesting. Thanks for the feedback! You can also use RUS_Shamanka_N or RUS_Devil's Gate for the East Asian ancestry of most Tatars.

Btw, can you try to find out how much EEF and CHG each of these Uralic groups has by using TUR_Barcin_N and GEO_CHG? You will have to remove the Chalmny_Varre and KAZ_Golden_Horde_Euro components though, as they also carry Neolithic and CHG (from Steppe admixture) affinity. Adding Baltic_LVA_HG also seems to give a better distance.

Is DEU_LBK_KD an EEF component? I checked, and it seems closest to Sardinians, followed by other southern Europeans, while RUS_Maykop is closest to MENA populations.

Komintasavalta
03-03-2021, 01:40 PM
Well, there are individual Forest_Nentsi and Tundra_Nentsi samples in G25, but they haven't been averaged yet, so I decided to do it myself. I will have to ask Lucas to include them in the modern averages spreadsheet as well.

Ok, they were only included on Lucas's website but not here: https://eurogenes.blogspot.com/2019/07/getting-most-out-of-global25_12.html.

I used this to calculate population averages for Forest_Nentsi and Tundra_Nentsi, and I got the same result as you:

sed 's/:[^,]*//'|awk -F, '{n[$1]++;for(i=2;i<=NF;i++){a[$1][i]+=$i}}END{for(i in a){o=i;for(j=2;j<=NF;j++)o=o","a[i][j]/n[i];print o}}'
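The one-liner above relies on arrays of arrays (`a[$1][i]`), which is a gawk extension, and it prints the averages through awk's default number formatting. A portable sketch of the same averaging, assuming input rows of the form `Population:Sample,coord1,coord2,...`, might look like this; `pop_average` and the toy rows are hypothetical.

```shell
# pop_average: strip the ":Sample" part of field 1, then print one averaged
# row per population in first-seen order. Uses only POSIX awk features.
pop_average() {
  sed 's/:[^,]*//' | LC_ALL=C awk -F, '
    { if (!($1 in n)) order[++k] = $1
      n[$1]++
      if (NF > nf) nf = NF
      for (i = 2; i <= NF; i++) sum[$1, i] += $i }
    END {
      for (j = 1; j <= k; j++) {
        p = order[j]; line = p
        for (i = 2; i <= nf; i++)
          line = line "," sprintf("%.9f", sum[p, i] / n[p])
        print line
      }
    }'
}

averaged=$(printf '%s\n' \
  'Pop_A:s1,0.10,-0.20' \
  'Pop_A:s2,0.30,-0.40' \
  'Pop_B:s1,0.50,0.10' | pop_average)
printf '%s\n' "$averaged"
```

Formatting with `%.9f` keeps nine decimals, close to the precision of the averaged rows posted earlier in the thread.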


You can also use RUS_Shamanka_N or RUS_Devil's Gate for the East Asian ancestry of most Tatars.

Yeah or RUS_Lokomotiv_N. It's extremely close to Shamanka:

Distance to: RUS_Lokomotiv_N
.014 RUS_Shamanka_N
.020 RUS_Fofonovo_En
.027 RUS_Baikal_BA_o
.031 MNG_North_N
.033 MNG_Slab_Grave_EIA_1
.034 RUS_Yakutia_Meso
.036 MNG_EIA_3
.046 RUS_Yakutia_N


Btw, can you try to find out how much EEF and CHG each of these Uralic groups has by using TUR_Barcin_N and GEO_CHG? You will have to remove the Chalmny_Varre and KAZ_Golden_Horde_Euro components though, as they also carry Neolithic and CHG (from Steppe admixture) affinity. Adding Baltic_LVA_HG also seems to give a better distance.

I tried using TUR_Pinarbasi_HG instead of TUR_Barcin_N. It increased the average distance by about .002 compared to TUR_Barcin_N, but I think it's cooler to use HGs.

I got about .001 lower average distance with NOR_N_HG than with Baltic_LVA_HG. If I included both in sources, Vahaduo only used NOR_N_HG in the MULTI tab.

MNG_SHU002:SHU002 is dated to about 1212 CE. It gave me a lower average distance than older ancient sources like RUS_Shamanka_N or RUS_Lokomotiv_N (which makes sense, because the Turkic expansion did not take place in the Neolithic).

Finnics had almost the same amount of GEO_CHG as VURers. When I tried replacing GEO_CHG with RUS_Maykop, the percentage in Finnics became clearly lower than in VURers.

The proportion of MNG_SHU002 (East Asian source of Turkic peoples) was 14% in Turkish_Southwest and 29% in Turkmen. Both of them also had fairly high WSHG.

Swamp Tatars still have about 3 times higher RUS_Krasnoyarsk_BA (kra001) than MNG_SHU002. Also Swamp Tatars cluster together with Uralics but other Siberian Tatars cluster with Bashkirs and Turkmen.

https://i.ibb.co/LQxRf5Z/vahaduo.png

Zanzibar
03-03-2021, 02:00 PM
Actually, I have noticed that Pinarbasi_HG seems to slightly inflate the amount of EEF among Uralics, so it's better to use Barcin_N as a proxy instead. Also try adding Baltic hunter-gatherers like Baltic_LVA_HG and ancient Uralics like RUS_Bolshoy_Oleni_Ostrov_o; it might help decrease the EEF.

Btw, are you surprised that even Mari, Saami, Bashkirs, Khanty and Mansi have higher EEF and CHG than expected? I believe the RUS_Maykop or CHG in Uralics comes from Steppe admixture. Steppe peoples like Yamnaya were genetically about half EHG and half CHG, with some minor EEF as well.

I also forgot to mention that there seem to be new Khants samples, which were recently added. They haven't been averaged either, so I averaged them. Can you include them in your run as well?



Khants,0.088877,-0.171285917,0.11184775,0.06112775,-0.079860917,-0.02424025,0.009772333,0.01648,-0.000238667,-0.0302815,0.051355333,-0.00370925,0.02143175,-0.069935333,-0.025832167,-0.01510425,-0.001075583,0.001224667,-0.003006333,-0.010702917,-0.002682667,0.019660833,0.021322,-0.0015565,-0.006196917


I must mention, though, that the Mari in G25 seem to suffer from extreme genetic drift, which is why the distance fit is pretty terrible.

Can you try running these Saami and Mari individuals?



Saami:GS000035025,0.103579,-0.04773,0.121433,0.083657,-0.018773,0.007251,0.005405,0.011307,0.00225,-0.033896,0.028093,-0.008243,0.021853,-0.017065,-0.008143,-0.008353,-0.015646,0.00114,-0.002137,-5e-04,0.019341,-0.003462,-0.006655,-0.001325,0.001557
Saami:GS000035026,0.112685,-0.019295,0.112759,0.077197,0.000615,0.003068,0.0094,0.010615,0.007976,-0.032438,0.022572,-0.007194,0.016799,-0.018166,-0.007872,0.005569,0.011343,0.001267,-0.005154,0.006003,0.020713,0.001607,-0.002711,-0.000723,0.000239
Saami:Saami001,0.118376,-0.005078,0.10484,0.077197,-0.001846,0.011156,0.01034,0.013615,-0.002045,-0.032985,0.018837,-0.008692,0.021407,-0.007432,-0.003664,-0.003182,-0.005476,-0.002154,-0.005028,0.008754,0.019466,0.000989,-0.008011,0.003615,-0.003712
Saami:saami1,0.103579,-0.035544,0.111251,0.077197,-0.010463,0.00753,0.01034,0.017076,0.004295,-0.034625,0.02517,-0.01139,0.021556,-0.015414,-0.008686,-0.008221,-0.0103,-0.000507,-0.00729,-0.001626,0.019091,0.000989,-0.004314,-0.001687,-0.002395
Saami:saami11,0.110408,-0.052808,0.113136,0.078166,-0.015387,0.008088,0.008695,0.014076,0.005727,-0.030433,0.031991,-0.003147,0.020069,-0.026974,-0.003122,0.001989,0.004042,-0.00038,-0.008547,-0.002501,0.017968,0.001978,0,0.004458,0.00467
Saami:saami12,0.10927,-0.01828,0.111251,0.079135,-0.01231,0.009482,0.011516,0.016615,-0.000409,-0.032074,0.024033,-0.01079,0.016501,-0.015964,-0.009636,0.01074,0.00665,-0.000633,-0.010307,0.01063,0.016471,0.004822,-0.00419,-0.000361,0.001197
Saami:saami13,0.114961,-0.031481,0.110496,0.083334,-0.00954,0.009761,0.012456,0.014999,0.003272,-0.033167,0.019649,-0.01139,0.011596,-0.024772,-0.006786,0.009679,0.00665,-0.002407,-0.003142,0.004377,0.012603,-0.001484,0.003328,0.006868,-0.003353
Saami:saami14,0.113823,-0.037575,0.106725,0.084626,-0.006155,0.006414,0.00423,0.011076,0.001227,-0.035354,0.028418,-0.008842,0.021258,-0.020506,-0.004479,0.004243,0.004824,-0.004054,-0.006159,0.006753,0.010606,0.000371,0.000246,0.008796,-0.001796
Saami:saami2,0.108132,-0.052808,0.11653,0.078489,-0.022466,0.005578,0.005875,0.008307,0.006545,-0.03262,0.020948,-0.006294,0.017542,-0.020506,-0.001357,-0.008618,-0.014212,-0.00076,-0.003142,0.008879,0.015722,0.006554,-0.000246,-0.002048,0.00012
Saami:saami3,0.111547,-0.027419,0.108611,0.076874,-0.011079,0.004462,0.01081,0.018692,0.003886,-0.034625,0.026956,-0.006594,0.019029,-0.023946,-0.00475,0.01074,0.014733,-0.002787,-0.002765,0.004127,0.008111,0.001237,-0.000986,0.004097,0.000359
Saami:saami4,0.106994,-0.004062,0.115776,0.087533,-0.003077,0.01255,0.006345,0.020538,0.003068,-0.031527,0.018025,-0.005845,0.017988,-0.016377,-0.004479,0.009812,0.009648,-0.010642,-0.01345,0.006003,0.021712,-0.001113,-0.002835,0.001205,0.004311
Saami:saami5,0.105855,-0.035544,0.112382,0.077843,-0.006463,0.00753,0.00517,0.010846,0.005727,-0.026242,0.02176,-0.008093,0.01888,-0.013625,0.005157,-0.013392,-0.028815,0.003927,-0.002011,-0.002501,0.021462,0.000495,-0.001109,-0.001807,0.00479
Saami:saami6,0.117238,-0.01828,0.107479,0.07752,-0.002154,0.016455,0.00705,0.016845,0.003068,-0.026242,0.0177,-0.005245,0.017096,-0.02257,-0.000679,0.011138,0.009779,0.00114,-0.00088,0.005503,0.006114,0.000124,-0.000616,0.010001,0.001676
Saami:saami7,0.112685,-0.031481,0.115776,0.083334,-0.012925,0.005857,0.004935,0.007615,-0.0045,-0.035718,0.022572,-0.006744,0.012636,-0.028488,-0.002714,0.008353,0.011213,-0.002027,-0.010056,0,0.012977,0.00272,-0.006162,0.000361,-0.003832
Saami:saami8,0.113823,-0.037575,0.108611,0.078812,-0.008617,0.005299,0.005875,0.01523,0.0045,-0.033349,0.013153,-0.004946,0.017988,-0.021194,-0.001629,0.002254,0.006258,-0.003674,-0.007165,0.009505,0.009982,0.003586,-0.001109,0.00976,0.003353
Saami:saami9,0.117238,-0.028435,0.110496,0.075582,-0.008617,0.012271,0.011751,0.014307,0.002863,-0.028247,0.016726,-0.006145,0.016947,-0.024222,-0.001221,0.010342,0.007823,-0.001267,-0.008547,0.008504,0.020713,0.002102,-0.005669,0.001325,-0.001676





Mari:GRC11056593,0.10927,-0.034528,0.09164,0.063631,-0.024312,-0.004183,0.009165,0.01823,-0.00409,-0.036994,0.025982,-0.013488,0.025421,-0.037296,-0.038002,-0.029037,-0.003912,-0.006208,-0.037835,-0.037018,0.006489,0.008408,-0.038823,0.010122,0.004311
Mari:GRC11056594,0.106994,-0.046714,0.093149,0.072675,-0.02739,0.000279,0.010105,0.010846,-0.005522,-0.035718,0.020461,-0.014237,0.031665,-0.046379,-0.039223,-0.02254,0.013821,-0.017483,-0.046131,-0.038518,0.013351,0.002968,-0.05349,0.009399,0.000239
Mari:GRC11056598,0.101303,-0.041637,0.090132,0.057817,-0.02739,0.008367,0.013866,0.013153,-0.010022,-0.046106,0.023709,-0.009591,0.030921,-0.035644,-0.029044,-0.024662,-0.006389,-0.012669,-0.037458,-0.034767,0.009483,0.003091,-0.046588,-0.003615,-0.00455
Mari:GRC11056599,0.097888,-0.052808,0.101823,0.065892,-0.031083,0.002789,0.018331,0.01823,-0.003477,-0.040456,0.024521,-0.009142,0.031367,-0.032479,-0.037052,-0.013922,0.012256,-0.008361,-0.042486,-0.026888,0.017469,0.001237,-0.042274,0.018677,-0.002634
Mari:mari1,0.105855,-0.060932,0.094657,0.059755,-0.029236,0.000279,0.011986,0.016615,-0.007363,-0.039545,0.031179,-0.013638,0.033003,-0.040048,-0.030537,-0.028772,-0.012647,-0.009755,-0.0406,-0.033016,0.014974,0.00507,-0.043876,0.008676,-0.005149
Mari:mari2,0.097888,-0.043668,0.083721,0.061047,-0.032929,-0.002789,0.011751,0.017076,0.001636,-0.040456,0.029392,-0.014687,0.034192,-0.035231,-0.035423,-0.027844,-0.009257,-0.010642,-0.038966,-0.030765,0.017594,0.003957,-0.054968,0.015544,0.000838
Mari:mari3,0.105855,-0.053823,0.096166,0.059109,-0.026159,0.005299,0.009635,0.018692,-0.007363,-0.038634,0.025982,-0.013638,0.0333,-0.038121,-0.032166,-0.02254,-0.013821,-0.007221,-0.037709,-0.039519,0.01123,0.009274,-0.047327,0.010845,0.002036
Mari:mari4,0.099026,-0.04773,0.097297,0.067507,-0.032006,-0.000837,0.011986,0.020538,0.000614,-0.036812,0.02858,-0.014687,0.030327,-0.039773,-0.038409,-0.022673,0.009909,-0.004181,-0.039469,-0.025512,0.018218,0.004204,-0.051394,0.008917,-0.00491
Mari:mari5,0.097888,-0.04773,0.092395,0.060078,-0.028005,0.001394,0.012456,0.019153,-0.006136,-0.041185,0.026307,-0.019483,0.041476,-0.044177,-0.037866,-0.016971,0.000913,-0.008361,-0.037081,-0.029889,0.01959,0.003957,-0.055831,0.010363,-0.003712

Lemminkäinen
03-03-2021, 02:50 PM
Joqool, your results look interesting and decent. There are some results which will receive undeserved criticism from the Finnish side though.

Komintasavalta
03-03-2021, 02:51 PM
I got bad fits for the new Mari samples as well. (The new samples have GRC in the name. The five samples with names like Mari:mari1 are part of the regular G25 datasheet.)

I still got a .002 lower average distance with NOR_N_HG than with Baltic_LVA_HG, but that's probably because there are so many Saami. Maris still get a lot of Samara_HG and very little NOR_N_HG.

The distance between new and old Khanty was .014. The distance between the new and old Maris was also only .018.

https://i.ibb.co/mN4v8Ky/joqool5.png

I also tried creating a model optimized just for Maris. I first added all lines from the ancient averages datasheet to the source tab. Then I made a new model using just the 6 sources with the highest average percentage. The average distance was still .082, but it's insane how much Levänluhta Maris get.

https://i.ibb.co/MnNNHRq/joqool6.png

When I removed Levänluhta from the sources, its place was taken by Chalmny Varre. When I also removed Chalmny Varre, the main components in Maris turned into RUS_Krasnoyarsk_BA and Yamnaya_UKR.

https://i.ibb.co/HXBFLZW/joqool7.png

Lemminkäinen
03-03-2021, 03:29 PM
People have difficulty accepting at least two obvious results:

- the BOO samples are a mixture of Baltic and Siberian people. Was that how N1c1 became linked to the Baltic region near the Volga? This model explains how it happened: ancient west VURs carrying R1a and representing Baltic-like ancestry mixed with Siberians carrying N1c1.

- the Finns are a mix of ancient Scandinavians, those west VURs and Saami. Recent Finnish researchers don't like this obvious result. They have not yet released the sample set from ancient Eura. Those samples were scanned in Germany 2 years ago and only the yDNA was published. We know that yDNA is really hard to determine compared to autosomal DNA, and researchers usually proudly publish the data they have. Maybe the reason is that Finnish geneticists named their project the Finno-Ugric genetic project, and it would be unpleasant to reveal that the samples they supposed to be 100% Finno-Ugric were only 10-20% Finno-Ugric.

Leto
03-03-2021, 03:34 PM
What's with this "VUR" bullshit? Why did you all start using this term?

Zanzibar
03-03-2021, 03:36 PM
Interesting, because I can get the EEF even lower than that for the Saami individuals, but I also used Yamnaya_Mereke which has very little EEF compared to other Yamnaya. It makes me wonder whether Yamnaya_Mereke is absorbing some Neolithic ancestry. I also used RUS_Progress_Eneolithic, which is ancestral to the Steppe invaders and has almost zero EEF, and it seems to lower the Barcin_N score as well.

Btw, it's strange to see the Mari having so much EEF. They should have less than the Saami, as they are even more East Asian-shifted, at 30% at least. I guess it's because of the genetic drift in the Mari samples. If we could get rid of the drift, the Neolithic score of the Mari might be lower than what they score right now in G25.

In this spreadsheet, the Saami are 14-16% EEF on average while Udmurts are 15.6-18% EEF (I checked the Modern and Modern2 sheets). Unfortunately there are no Maris here, but they should be lower than the Saami as they are more East Asian-shifted: https://docs.google.com/spreadsheets/d/1LPWAEC3dbAEDu8aBAAcxIOa5CQjuflt0f0cvhCpZ_ME/edit#gid=1248753915

Are you surprised, btw, that even Khanty, Mansi and Nenets have higher EEF than you expected? I am also surprised, as I didn't think EEF ancestry would spread almost everywhere in Uralic lands. It's also strange that the RUS_Maykop score of the Mari is higher than their GEO_CHG score when both should be mainly CHG-derived.

I should learn qpAdm so I can get better insight into Uralic genetics, and maybe get a good distance for the Mari while also removing the drift.

Komintasavalta
03-03-2021, 03:52 PM
I think Finns get such a high percentage of the Neolithic component in my model because there is no steppe component. Yamnaya_UKR is modeled as about 30% TUR_Barcin_N too:

https://i.ibb.co/8jBJmwS/joqool8.png

However if I include GEO_CHG, Yamnaya_UKR gets about three times more of GEO_CHG than TUR_Barcin_N, and also its RUS_Tyumen_HG is reduced to almost zero.

https://i.ibb.co/843Ss9p/joqool9.png

Lemminkäinen
03-03-2021, 04:02 PM
What's with this "VUR" bullshit? Why did you all start using this term?

Volga-Uralic, I assume.

Leto
03-03-2021, 04:11 PM
Volga-Uralic, I assume.
I'm smart enough to understand that, sir ;)

Is it true that Western Finns came to Finland from Estonia 2,000 years ago while Eastern Finns migrated there from Lake Ladoga 4,000 years ago? I've recently happened to read such a comment in Russian.

Zanzibar
03-03-2021, 04:15 PM
That's because Yamnaya, like most other Steppe populations, is mainly a mixture of EHG and CHG. If you add any Steppe component, it also decreases the EEF score, as Steppe groups have some EEF as well. The best proxy to use is RUS_Progress_Eneolithic or maybe RUS_Vonyuchka_Eneolithic: they seem to have zero or almost zero EEF and seem to be the predecessors of Yamnaya, Sintashta and other Steppe groups. However, Progress_En and Vonyuchka_En are also mainly a mix of EHG+CHG, just with hardly any Neolithic at all.

Can you try adding Progress_Eneolithic, Vonyuchka_Eneolithic, Yamnaya_Mereke (they have very little EEF, only about 2%), VK_NOR_LN_HG or RUS_Volga_Kama or RUS_Veretye_Meso to your runs? It should decrease the Neolithic but will also eat almost all of the GEO_CHG.

Komintasavalta
03-03-2021, 04:45 PM
Ancient west VUR's

The demonym for the people of the VUR is VURian or VURer.


What's with this "VUR" bullshit? Why did you all start using this term?

I think they got it from me, and I maybe first saw it in some paper like Tambets et al. 2018.

https://i.ibb.co/J5yvZ8L/20210303190313.jpg


I also used Yamnaya_Mereke which has very little EEF compared to other Yamnaya

It's still 24% in my first model on the left, even though the average distance is crap. Adding GEO_CHG improves the average distance a lot, but then the combined proportion of TUR_Barcin_N and GEO_CHG is still 37% for Yamnaya_KAZ_Mereke.

https://i.ibb.co/k5PYFr5/b.png

Here's the combined percentage of TUR_Barcin_N and GEO_CHG in the second model above:

64.6 Yamnaya_UKR_Ozera_o
62.0 Yamnaya_BGR
44.6 Yamnaya_RUS_Caucasus
44.4 RUS_Progress_En
41.8 Yamnaya_KAZ_Karagash
41.4 Yamnaya_UKR
40.8 RUS_Afanasievo
40.4 Yamnaya_RUS_Kalmykia
40.0 Yamnaya_RUS_Samara
37.0 Yamnaya_KAZ_Mereke

I didn't realize that Yamnaya had this much churka/wog ancestry.


RUS_Progress_Eneolithic which is the ancestor of Steppe invaders

Yeah, when I tried to model Yamnaya by putting about 200 mostly neolithic and older population averages in the sources, it was first modeled as mostly RUS_Afanasievo, but then when I removed Afanasievo, RUS_Progress_En became the main component.


In this spreadsheet, the Saami are somewhere from 14-16% EEF on average while Udmurts are 15.6-18% EEF (checked the Modern and Modern2 sheets), unfortunately there are no Maris here but they should be lower than the Saami as they are more Mong: https://docs.google.com/spreadsheets/d/1LPWAEC3dbAEDu8aBAAcxIOa5CQjuflt0f0cvhCpZ_ME/edit#gid=1248753915

Yeah but in the spreadsheet, Udmurts have 13% CHG and Saami have 7%.

This nMonte model for GEO_CHG has shit distance, but you can tell that CHG is a churka component:

$ nm g/25/ma <(aag chg)
Target: GEO_CHG (d=.127)
50.6 Abkhasian
21.0 Ossetian
9.8 Georgian_Imer
9.8 Kubachinian
4.6 Darginian
1.6 Kalash
0.4 Brahui
0.4 Makrani
0.2 Balochi
0.2 Brahmin_Uttar_Pradesh
0.2 Chechen
0.2 Gujar_India
0.2 Gujar_Pakistan
0.2 Lak
0.2 Manyika
0.2 North_Ossetian
0.2 Pashtun

(`nm` is my shell wrapper for nMonte3.R, and `aag` is a function that greps ancient averages.)

Lemminkäinen
03-03-2021, 04:48 PM
I'm smart enough to understand that, sir ;)

Is it true that Western Finns came to Finland from Estonia 2,000 years ago while Eastern Finns migrated there from Lake Ladoga 4,000 years ago? I've recently happened to read such a comment in Russian.

Partly true. Baltic Finnic speakers came from Estonia to Southwest Finland around 1700 ybp.

Eastern Finns didn't exist earlier than 600 ybp. East Finns are a product of the Swedish king Gustav Vasa. The Finnish tribe called Karelians, known by Russians, was born 1300-1400 years ago on the western coast of Lake Ladoga and has Finnish roots. During the Third Swedish Crusade the Karelians were split into Roman Catholic and Orthodox parts, and the Catholic part was the genesis of East Finns.

Most Russian stories about Finns are fairy tales.

Leto
03-03-2021, 05:02 PM
Partly true. Baltic Finnic speakers came from Estonia to Southwest Finland around 1700 ybp.

Eastern Finns didn't exist earlier than 600 ybp. East Finns are a product of the Swedish king Gustav Vasa. The Finnish tribe called Karelians, known by Russians, was born 1300-1400 years ago on the western coast of Lake Ladoga and has Finnish roots. During the Third Swedish Crusade the Karelians were split into Roman Catholic and Orthodox parts, and the Catholic part was the genesis of East Finns.

Most Russian stories about Finns are fairy tales.
I see. But that comment actually referred to a Finnish study or something, so cool down your Russophobia :swl

Lemminkäinen
03-03-2021, 05:12 PM
I see. But that comment actually referred to a Finnish study or something, so cool down your Russophobia :swl

So show me the Finnish source.

I only stated what is true. I have never mocked Russian people, only their history writing. I am a friend of all Russians. Is it true that a few years ago Putin organized a new work to correct wrong history writings in Russian school books. Oh, yes!

https://www.nbcnews.com/news/world/vladimir-putin-rewriting-russias-history-books-flna2D11669160

This would never be seen in Finland

https://www.nbcnews.com/news/world/vladimir-putin-rewriting-russias-history-books-flna2D11669160

Lemminkäinen
03-03-2021, 05:18 PM
Komintasavalta

The demonym for the people of the VUR is VURian or VURer.



Ok.

Komintasavalta
03-03-2021, 05:23 PM
Apparently we can get a fairly good fit for Yamnaya_UKR by just using CHG and Volga-Kama_N (which is close to EHGs):

Target: Yamnaya_UKR (d=.056)
61.2 RUS_Volga-Kama_N
38.8 GEO_CHG

The fits using Karelia_HG or Samara_HG are worse though:

Target: Yamnaya_UKR (d=.067)
57.2 RUS_Karelia_HG
42.8 GEO_CHG

Target: Yamnaya_UKR (d=.069)
60.8 RUS_Samara_HG
39.2 GEO_CHG

I tried calculating two-way distances for Yamnaya_UKR using some of the closest rows to Karelia_HG and the closest rows to GEO_CHG. The best fit was this:

.044 72.6% RUS_Khvalynsk_En + 27.4% Kura-Araxes_ARM_Kalavan

Volga-Kama_N was the next population after RUS_Khvalynsk_En on the non-churka side. Its best fits were these:

.054 57.4% RUS_Volga-Kama_N + 42.6% Kura-Araxes_RUS_Velikent
.055 59.2% RUS_Volga-Kama_N + 40.8% RUS_Darkveti-Meshoko_En
.056 61.2% RUS_Volga-Kama_N + 38.8% GEO_CHG
.057 59.4% RUS_Volga-Kama_N + 40.6% Kura-Araxes_ARM_Kalavan
.057 59.6% RUS_Volga-Kama_N + 40.4% RUS_Maykop_Novosvobodnaya
.057 59.6% RUS_Volga-Kama_N + 40.4% RUS_North_Caucasus_MBA

Best fit for Samara_HG:

.060 58.6% RUS_Samara_HG + 41.4% Kura-Araxes_ARM_Kalavan

Best fit for Karelia_HG:

.061 44.8% RUS_Darkveti-Meshoko_En + 55.2% RUS_Karelia_HG
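
For anyone who wants to reproduce this kind of two-way fit by hand, here is a minimal sketch in R. It assumes the fit is simply the Euclidean distance between the target and the best weighted average of two source rows, which may not match exactly what Vahaduo or nMonte do internally. The function name `two_way` and the three-dimensional coordinates are made up for illustration; real G25 rows have 25 columns.

```r
# Best two-way mix of sources a and b for a target t.
# All three are numeric vectors of equal length (e.g. one row of coordinates each).
two_way=function(t,a,b){
  d=a-b
  w=sum((t-b)*d)/sum(d*d)   # unconstrained least-squares weight of source a
  w=min(max(w,0),1)         # clamp so that the weight is a valid proportion
  fit=w*a+(1-w)*b           # the modeled mixture
  list(weight_a=w,weight_b=1-w,distance=sqrt(sum((t-fit)^2)))
}

# Toy coordinates (hypothetical, not real G25 rows)
a=c(.10,.00,.02)
b=c(.00,.10,.02)
t=c(.06,.04,.03)
r=two_way(t,a,b)
cat(sprintf("%.3f %.1f%% A + %.1f%% B\n",r$distance,r$weight_a*100,r$weight_b*100))
```

Looping this over every pair of candidate rows and sorting by distance would give a ranked list like the ones above, under the same assumption.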

Leto
03-03-2021, 05:30 PM
So show me the Finnish source.

I only stated what is true. I have never mocked Russian people, only their history writing. I am a friend of all Russians. Is it true that a few years ago Putin organized a new work to correct wrong history writings in Russian school books. Oh, yes!

https://www.nbcnews.com/news/world/vladimir-putin-rewriting-russias-history-books-flna2D11669160

This would never be seen in Finland

https://www.nbcnews.com/news/world/vladimir-putin-rewriting-russias-history-books-flna2D11669160
I don't know about that and I don't particularly care about Finnish history either. In Russia Finland is not seen as a significant country, just a land of lakes, forests and reindeers. But I'm pretty sure the Finns would also cover some events as they like, different from the way other nations would do it. That's a common thing in the world. After all, one person's or nation's truth may be another person's or nation's falsehood. As for the source, I don't have it because I only saw one comment. Pretty sure that person had no ill intentions because it was directed at a woman who claimed actual Finnish roots.

Lemminkäinen
03-03-2021, 05:47 PM
I don't know about that and I don't particularly care about Finnish history either. In Russia Finland is not seen as a significant country, just a land of lakes, forests and reindeers. But I'm pretty sure the Finns would also cover some events as they like, different from the way other nations would do it. That's a common thing in the world. After all, one person's or nation's truth may be another person's or nation's falsehood. As for the source, I don't have it because I only saw one comment. Pretty sure that person had no ill intentions because it was directed at a woman who claimed actual Finnish roots.

Have I blamed somebody about ill intentions? I only stated that Russians have a Russian history writing. Did I ask about it? It was pretty much like you asked and I answered.

Zanzibar
03-03-2021, 05:52 PM
The demonym for the people of the VUR is VURian or VURer.



I think they got it from me, and I maybe first saw it in some paper like Tambets et al. 2018.

https://i.ibb.co/J5yvZ8L/20210303190313.jpg



It's still 24% in my first model on the left, even though the average distance is crap. Adding GEO_CHG improves the average distance a lot, but then the combined proportion of TUR_Barcin_N and GEO_CHG is still 37% for Yamnaya_KAZ_Mereke.

https://i.ibb.co/k5PYFr5/b.png

Here's the combined percentage of TUR_Barcin_N and GEO_CHG in the second model above:

64.6 Yamnaya_UKR_Ozera_o
62.0 Yamnaya_BGR
44.6 Yamnaya_RUS_Caucasus
44.4 RUS_Progress_En
41.8 Yamnaya_KAZ_Karagash
41.4 Yamnaya_UKR
40.8 RUS_Afanasievo
40.4 Yamnaya_RUS_Kalmykia
40.0 Yamnaya_RUS_Samara
37.0 Yamnaya_KAZ_Mereke

I didn't realize that Yamnaya had this much churka/wog ancestry.



Yeah, when I tried to model Yamnaya by putting about 200 mostly neolithic and older population averages in the sources, it was first modeled as mostly RUS_Afanasievo, but then when I removed Afanasievo, RUS_Progress_En became the main component.



Yeah but in the spreadsheet, Udmurts have 13% CHG and Saami have 7%.

This nMonte model for GEO_CHG has shit distance, but you can tell that CHG is a churka component:

$ nm g/25/ma <(aag chg)
Target: GEO_CHG (d=.127)
50.6 Abkhasian
21.0 Ossetian
9.8 Georgian_Imer
9.8 Kubachinian
4.6 Darginian
1.6 Kalash
0.4 Brahui
0.4 Makrani
0.2 Balochi
0.2 Brahmin_Uttar_Pradesh
0.2 Chechen
0.2 Gujar_India
0.2 Gujar_Pakistan
0.2 Lak
0.2 Manyika
0.2 North_Ossetian
0.2 Pashtun

(`nm` is my shell wrapper for nMonte3.R, and `aag` is a function that greps ancient averages.)

Yup Yamnaya and other Steppe peoples have lots of wog/churka blood. Interesting that Yamnaya_Mereke has around 5% Neolithic in your model. Thought it would be like 2% from my previous runs.

Don't know much about Afanasievo but it seems to be another Steppe group that still has around 10% EEF. Well Progress_Eneolithic is literally a mix of churka/CHG+EHG with little to no Neolithic.

It still seems like Saamis have the lowest wog/churka admix among Europeans in that spreadsheet if you combine EEF+CHG together. Udmurts have more but still a lot less than other Euros. It sucks there are no Maris in the spreadsheet. Want to see if they will have lower CHG+EEF than Saamis or not. Bashkirs probably somewhere between Saamis and Udmurts. Khanty, Mansi definitely have lower churka than Saamis though.

Leto
03-03-2021, 06:05 PM
Have I blamed somebody about ill intentions? I only stated that Russians have a Russian history writing. Did I ask about it? It was pretty much like you asked and I answered.
Correcting history books is not wrong per se. As if there are no countless books in the West that distort Russian history or paint it in extremely negative tones.
Yes, thank you for the answer. I'm interested in that only because it's relevant to Russian genetics which I've been exploring for myself. Many Russians would get half Polish half Finnish in oracles.

Lemminkäinen
03-03-2021, 06:42 PM
Correcting history books is not wrong per se. As if there are no countless books in the West that distort Russian history or paint it in extremely negative tones.


Why do some Russians expect that Westerners distort Russian history, but that Russians are free of this attitude? Western researchers (historians) do their work without bad intentions, but of course research is an endless series of iterations. I referred to Putin as a politician. They do lie, also in the West. Finnish history writing has been unchanged for a long time and only details have changed. Genetics is a different case, because we have no more than about 10 years of research and it is not yet stable enough to lead to consensus. At least in Finland the tradition of population genetics is weak due to our poor sample collection. Our archaeological tradition is much stronger. In the case of East Finnish history we don't even need archaeology, we only need to read original Swedish documents.







Yes, thank you for the answer. I'm interested in that only because it's relevant to Russian genetics which I've been exploring for myself. Many Russians would get half Polish half Finnish in oracles.

The Baltic Finnic influence was strong in NW Russia already before the Novgorodians. Tsuhnas, or whatever you called them. There were Baltic Finnic migrations from south to north, from north to south, from east to west and from west to east. It was more like a fluctuating BF gene pool than a one-way migration from one place to another. Tsuhnas lived in a large region from Sweden to Russia. Every attempt to locate them in Finland, Russia or Estonia makes no sense.

Komintasavalta
03-03-2021, 07:22 PM
Well Progress_Eneolithic is literally a mix of churka/CHG+EHG with little to no Neolithic.

Yeah, it seems like half EHG and half CHG:

Distance to: RUS_Progress_En
.052 47.8% GEO_CHG + 52.2% RUS_Karelia_HG
.053 55.4% RUS_Samara_HG + 44.6% GEO_CHG
.055 54.6% RUS_Volga-Kama_N + 45.4% GEO_CHG


It sucks there are no Maris in the spreadsheet. Want to see if they will have lower CHG+EEF than Saamis or not. Bashkirs probably somewhere between Saamis and Udmurts. Khanty, Mansi definitely have lower churka than Saamis though.

In the model below, the combined proportion of the first two components is 4 percentage points lower in Saami than in Mari:

7.8 Nenets
11.2 Khanty
12.8 Mansi
16.8 Tatar_Siberian_Zabolotniye
22.8 Saami
26.8 Mari
27.4 Bashkir
28.8 Saami_Kola
31.0 Udmurt
33.2 Chuvash
35.2 Komi
35.4 Vepsian
35.6 Karelian
38.8 Finnish
39.6 Tatar_Kazan
41.0 Tatar_Mishar

https://i.ibb.co/4jsdZ2w/marisaami3.png

Zanzibar
03-03-2021, 07:40 PM
Yeah, it seems like half EHG and half CHG:

Distance to: RUS_Progress_En
.052 47.8% GEO_CHG + 52.2% RUS_Karelia_HG
.053 55.4% RUS_Samara_HG + 44.6% GEO_CHG
.055 54.6% RUS_Volga-Kama_N + 45.4% GEO_CHG



In the model below, the combined proportion of the first two components is 4 percentage points lower in Saami than in Mari:

7.8 Nenets
11.2 Khanty
12.8 Mansi
16.8 Tatar_Siberian_Zabolotniye
22.8 Saami
26.8 Mari
27.4 Bashkir
28.8 Saami_Kola
31.0 Udmurt
33.2 Chuvash
35.2 Komi
35.4 Vepsian
35.6 Karelian
38.8 Finnish
39.6 Tatar_Kazan
41.0 Tatar_Mishar

https://i.ibb.co/4jsdZ2w/marisaami3.png

Interesting. Although keep in mind that the Mari are extremely genetically drifted in G25, so the results could be a bit inaccurate. If we can remove the drift, the two components (CHG+Anatolian) might be a bit lower than or similar to the Saami's. Anyway, I believe it's because the Saami are more isolated in far northern Scandinavia, which makes it harder for additional CHG+Anatolian ancestry from mainland Russia to reach them, unlike what happened to most VURers.

I think Progress_En is the best for measuring actual amount of churka in Euros as they lack the EEF. I would guess Selkup would have similar amounts of churka/wog to Nenets? It sucks that there are so much wog (CHG+Anatolian) blood among Uralics. It must be nice if they were still purely indigenous HG+Siberian without Anatolian and CHG.

Actually VK2020_NOR_North_LN_HG inflated Anatolian a bit, if you add RUS_Volga_Kama and Bolshoy_Oleni_Ostrov_o again, it might lower EEF a bit.

Can you do the same models for the Saami, Mari, Besermyan, Saami_Kola and Chuvash individuals? (try removing the VK2020_NOR_LN_HG and replace it with Baltic_LVA_MN instead, the EEF should go down).

Komintasavalta
03-04-2021, 09:10 PM
Although keep in mind that the Mari are extremely genetically drifted in G25 so the results could be a bit inaccurate.

We can get an extremely good fit for Chuvashes by modeling them as 60% Mari and 40% Mishar (d=.009) or 70% Mari and 30% Mordovian (d=.013). Doesn't it mean that even if Maris are drifted, their drift is shared by Chuvashes?

Like Maris, Chuvashes are also far from every other population average in G25:

Distance to: Chuvash
.048 Besermyan
.049 Udmurt
.056 Mari
.064 Tatar_Kazan
.069 FIN_Levanluhta_IA
.071 Saami
.072 RUS_Chalmny-Varre
.073 Komi
.077 Saami_Kola
.086 VK2020_NOR_North_VA_o2
.087 Tatar_Mishar
.090 RUS_Mezhovskaya
.092 Tatar_Lipka
.096 MDA_Cimmerian
.097 RUS_Tagar
.097 VK2020_NOR_North_VA_o1

Therefore it is surprising that my two-way models for Chuvashes have such good fit.
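
A list like the one above can be reproduced with a few lines of R. This is only a sketch: it assumes that "distance" here is the plain Euclidean distance between rows of the coordinate table, and the population names and three-dimensional coordinates below are toy values, not real G25 averages.

```r
# Rows are populations, columns are coordinates (toy values, not real G25 data)
m=rbind(
  Target=c(.10,.02,.03),
  PopA=c(.11,.02,.03),
  PopB=c(.10,.06,.03),
  PopC=c(.13,.06,.03)
)
target=m["Target",]
others=m[rownames(m)!="Target",,drop=FALSE]

# Subtract the target from every row, then take the Euclidean norm of each row
d=sort(sqrt(rowSums(sweep(others,2,target)^2)))

# Print closest populations first, in the same layout as the list above
cat(sprintf("%.3f %s\n",d,names(d)),sep="")
```

With a real table, `m` would be the matrix read from the population averages file and `"Target"` would be a row name such as `Chuvash`.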


Actually VK2020_NOR_North_LN_HG inflated Anatolian a bit, if you add RUS_Volga_Kama and Bolshoy_Oleni_Ostrov_o again, it might lower EEF a bit.

Can you do the same models for the Saami, Mari, Besermyan, Saami_Kola and Chuvash individuals? (try removing the VK2020_NOR_LN_HG and replace it with Baltic_LVA_MN instead, the EEF should go down).

In the model below, when I changed Baltic_LVA_MN to NOR_N_HG, it reduced the average distance from .0567 to .0561. The combined average proportion of Barcin and CHG stayed at 28.1% (from 21.3% Barcin + 6.8% CHG to 20.9% Barcin + 7.2% CHG).

https://i.ibb.co/tBZrmL9/joqool14.png

Above some Saami individuals randomly have 0% RUS_Bolshoy_Oleni_Ostrov_o but others have 10-20%. Also one Chuvash has 42% RUS_Bolshoy_Oleni_Ostrov_o but another has 13%. The individuals with lower BOO_o have higher EHG (Volga-Kama_N) and higher kra001 (Krasnoyarsk_BA).

RUS_Bolshoy_Oleni_Ostrov_o is just half EHG and half RUS_Krasnoyarsk_BA, so it's not needed as an additional component in modeling Uralics:

Target: RUS_Bolshoy_Oleni_Ostrov_o
d=.018 - 79% RUS_Bolshoy_Oleni_Ostrov + 21% RUS_Krasnoyarsk_BA
d=.029 - 47% RUS_Samara_HG + 53% RUS_Krasnoyarsk_BA
d=.034 - 48% RUS_Karelia_HG + 52% RUS_Krasnoyarsk_BA
d=.034 - 48% RUS_Sidelkino_HG + 52% RUS_Krasnoyarsk_BA
d=.035 - 47% RUS_Khvalynsk_En + 53% RUS_Krasnoyarsk_BA
d=.036 - 52% RUS_Sintashta_MLBA_o3 + 48% RUS_Krasnoyarsk_BA
d=.038 - 47% RUS_Volga-Kama_N + 53% RUS_Krasnoyarsk_BA
d=.040 - 46% RUS_Veretye_Meso + 54% RUS_Krasnoyarsk_BA


It sucks that there are so much wog (CHG+Anatolian) blood among Uralics. It must be nice if they were still purely indigenous HG+Siberian without Anatolian and CHG.

I think we're still pure enough... I'm terrified to imagine what even purer Uralics would be like.

Zanzibar
03-04-2021, 09:47 PM
We can get an extremely good fit for Chuvashes by modeling them as 60% Mari and 40% Mishar (d=.009) or 70% Mari and 30% Mordovian (d=.013). Doesn't it mean that even if Maris are drifted, their drift is shared by Chuvashes?

Like Maris, Chuvashes are also far from every other population average in G25:

Distance to: Chuvash
.048 Besermyan
.049 Udmurt
.056 Mari
.064 Tatar_Kazan
.069 FIN_Levanluhta_IA
.071 Saami
.072 RUS_Chalmny-Varre
.073 Komi
.077 Saami_Kola
.086 VK2020_NOR_North_VA_o2
.087 Tatar_Mishar
.090 RUS_Mezhovskaya
.092 Tatar_Lipka
.096 MDA_Cimmerian
.097 RUS_Tagar
.097 VK2020_NOR_North_VA_o1

Therefore it is surprising that my two-way models for Chuvashes have such good fit.



In the model below, if I changed Baltic_LVA_MN into NOR_N_HG, it reduced the average distance from .0567 to .0561. The combined average proportion of Barcin and CHG stayed as 28.1% (from 21.3 Barcin + 6.8% CHG to 20.9% Barcin + 7.2% CHG).

https://i.ibb.co/tBZrmL9/joqool14.png

Above some Saami individuals randomly have 0% RUS_Bolshoy_Oleni_Ostrov_o but others have 10-20%. Also one Chuvash has 42% RUS_Bolshoy_Oleni_Ostrov_o but another has 13%. The individuals with lower BOO_o have higher EHG (Volga-Kama_N) and higher kra001 (Krasnoyarsk_BA).

RUS_Bolshoy_Oleni_Ostrov_o is just half EHG and half RUS_Krasnoyarsk_BA, so it's not needed as an additional component in modeling Uralics:

Target: RUS_Bolshoy_Oleni_Ostrov_o
d=.018 - 79% RUS_Bolshoy_Oleni_Ostrov + 21% RUS_Krasnoyarsk_BA
d=.029 - 47% RUS_Samara_HG + 53% RUS_Krasnoyarsk_BA
d=.034 - 48% RUS_Karelia_HG + 52% RUS_Krasnoyarsk_BA
d=.034 - 48% RUS_Sidelkino_HG + 52% RUS_Krasnoyarsk_BA
d=.035 - 47% RUS_Khvalynsk_En + 53% RUS_Krasnoyarsk_BA
d=.036 - 52% RUS_Sintashta_MLBA_o3 + 48% RUS_Krasnoyarsk_BA
d=.038 - 47% RUS_Volga-Kama_N + 53% RUS_Krasnoyarsk_BA
d=.040 - 46% RUS_Veretye_Meso + 54% RUS_Krasnoyarsk_BA



I think we're still pure enough... I'm terrified to imagine what even purer Uralics would be like.

Yes, it looks like the Chuvash are genetically drifted in G25 as well, though not as much as the Mari.

Interesting. I managed to get around 13-14% EEF for Saami:GS000035025 and Saami:saami2, who are the two most East Eurasian-shifted and least EEF-affected modern Saami samples. But I also included Yamnaya_KAZ_Mereke in the run, which seems to hide the CHG score, as the Yamnaya component absorbs it (Yamnaya are EHG+CHG, thus they absorb any other CHG score), and it could be absorbing minor EEF, making the Anatolian percentage go down. What I have noticed is that including Yamnaya populations, Mereke among them, seems to drastically improve the fits for Saamis and many Euros.

You are right, RUS_Bolshoy_Oleni_Ostrov_o is no longer needed for modeling.

Purer Uralics like Uyelgi ancient sample in G25 and Khanty, Mansi, Nenets should have less CHG and Anatolian wog contamination as they are also much more Mongoloid, thus inversely correlates to the reduction of the churka/wog affinity.

Seems like the winner of the lowest wog contamination (Anatolian+CHG) is Saami:GS000035025 individual followed by Saami:saami2. even the Mari individuals still have a bit higher wogs than these Saamis. Btw these Saamis are also around 27% East Eurasian, so maybe the Mongoloid help reduce any wog affinity? Its interesting how Saamis still have the least wog affinity compared to most other Uralics except Mansi, Khanty, Nenets and Selkup who have even much less.

Komintasavalta
03-04-2021, 10:07 PM
Actually you can also model BOO as kra001 + WSHG + Norwegian HG (intermediate between SHG and EHG) + wog:

Target: RUS_Bolshoy_Oleni_Ostrov (d=.013)
34.8 RUS_Krasnoyarsk_BA
33.4 RUS_Sosonivoy_HG
25.8 VK2020_NOR_North_LN_HG
6.0 TUR_Barcin_N

The fit with RUS_Tyumen_HG is worse than with RUS_Sosonivoy_HG, even though Sosonivoy clusters together with Botai and Tyumen_HG:

Target: RUS_Bolshoy_Oleni_Ostrov (d=.018)
35.6 RUS_Krasnoyarsk_BA
29.2 VK2020_NOR_North_LN_HG
29.0 RUS_Tyumen_HG
6.2 TUR_Barcin_N

The fit with Botai is almost as good:

Target: RUS_Bolshoy_Oleni_Ostrov (d=.015)
33.4 RUS_Krasnoyarsk_BA
32.2 KAZ_Botai
29.4 VK2020_NOR_North_LN_HG
5.0 TUR_Barcin_N

Without the wog, the distance more than doubles:

Target: RUS_Bolshoy_Oleni_Ostrov (d=.028)
36.4 RUS_Krasnoyarsk_BA
34.0 VK2020_NOR_North_LN_HG
29.6 RUS_Sosonivoy_HG


Its interesting how Saamis still have the least wog affinity compared to most other Uralics except Mansi, Khanty, Nenets and Selkup who have even much less.

Yeah and Nganasan. And one Uralic people that everyone has forgotten about are the Enets. I haven't even seen their genetic results, and I don't think even travv or me has posted a classification thread about an Enets girl.

(I just searched VKontakte for girls from the main Forest Yukaghir village, but I didn't find any good ones. Maybe I'll search for Enets girls next.)

Zanzibar
03-04-2021, 10:29 PM
Actually you can also model BOO as kra001 + WSHG + Norwegian HG (intermediate between SHG and EHG) + wog:

Target: RUS_Bolshoy_Oleni_Ostrov (d=.013)
34.8 RUS_Krasnoyarsk_BA
33.4 RUS_Sosonivoy_HG
25.8 VK2020_NOR_North_LN_HG
6.0 TUR_Barcin_N

The fit with RUS_Tyumen_HG is worse than with RUS_Sosonivoy_HG, even though Sosonivoy clusters together with Botai and Tyumen_HG:

Target: RUS_Bolshoy_Oleni_Ostrov (d=.018)
35.6 RUS_Krasnoyarsk_BA
29.2 VK2020_NOR_North_LN_HG
29.0 RUS_Tyumen_HG
6.2 TUR_Barcin_N

The fit with Botai is almost as good:

Target: RUS_Bolshoy_Oleni_Ostrov (d=.015)
33.4 RUS_Krasnoyarsk_BA
32.2 KAZ_Botai
29.4 VK2020_NOR_North_LN_HG
5.0 TUR_Barcin_N

Without the wog, the distance more than doubles:

Target: RUS_Bolshoy_Oleni_Ostrov (d=.028)
36.4 RUS_Krasnoyarsk_BA
34.0 VK2020_NOR_North_LN_HG
29.6 RUS_Sosonivoy_HG



Yeah and Nganasan. And one Uralic people that everyone has forgotten about are the Enets. I haven't even seen their genetic results, and I don't think even travv or me has posted a classification thread about an Enets girl.

(I just searched VKontakte for girls from the main Forest Yukaghir village, but I didn't find any good ones. Maybe I'll search for Enets girls next.)

Yup there is minor wog blood in BOO. However the BOO outlier sample if I remembered correctly, have almost zero Neolithic contamination.

There are probably new Enets samples, I will have to find them.

Ancient Uralics like Uyelgi, I predicted would probably have wog (CHG+EEF) contamination somewhere close to the Khanty's level I believe. Have to try it out.

Can you try to run your model on these ancient Uralics? I wanted to see if they will have lower wog than Saamis:


Uyelgi1_scaled,0.086506,-0.132019,0.106725,0.086887,-0.068013,-0.027331,0.043712,0.025153,-0.027815,-0.050297,0.068528,-0.023379,0.032111,-0.041012,-0.02158,0.022275,-0.020601,-0.02065,-0.026271,-0.030014,-0.049662,0.010881,-0.009367,-0.01458,0.008622
Uyelgi2_scaled,0.106994,-0.114755,0.083721,0.088179,-0.082785,-0.063587,0.019506,-0.017076,-0.019021,-0.057222,0.041734,-0.023079,0.007433,-0.054499,-0.044381,-0.011535,0.000782,-0.022297,-0.035698,-0.033766,0.021213,0.026214,0.034879,0.010845,0.019639
RUS_Krasnoyarsk_MLBA_o,0.102441,-0.076165,0.094657,0.094639,-0.053241,-0.001116,-0.004465,0.000462,-0.006749,-0.036994,0.025495,-0.013038,0.010704,-0.044039,0.011265,-0.000265,-0.013299,0,0.009553,-0.003502,-0.003119,0.006801,0.005423,0.00482,-0.003233
VK2020_NOR_North_VA_o1,0.106994,-0.079211,0.116153,0.076551,-0.035699,-0.00251,-0.00141,0.01223,0,-0.034443,0.031016,-0.012289,0.01888,-0.031516,-0.011536,0.002121,-0.011083,0.003927,-0.005782,-0.014007,0.013351,-0.001855,0.001109,0.002771,-0.002395
FIN_Levanluhta_IA:DA238,0.099026,-0.081242,0.117662,0.084626,-0.028313,-0.001394,-0.00094,0.009,0.001432,-0.036994,0.038648,-0.005095,0.032705,-0.016377,-0.004343,0.002784,0.012908,0.005574,-0.010684,0.004127,0.018592,0.005564,-0.003944,-0.002289,-0.001317

Komintasavalta
03-04-2021, 10:56 PM
Can you try to run your model on these ancient Uralics? I wanted to see if they will have lower wog than Saamis:

Uyelgi gets a sh*t fit in my model:

https://i.ibb.co/gWc3GdK/a.png

However when I put all lines from the ancient averages file in sources, the model that Vahaduo generated automatically for Uyelgi still had a sh*t fit:

Target: Uyelgi1_scaled (d=.102)
27.0 RUS_Krasnoyarsk_BA
22.6 RUS_Bolshoy_Oleni_Ostrov_o
15.8 RUS_Samara_HG
11.0 Baltic_EST_BA
8.6 KAZ_Mereke_MBA
8.6 Saka_Tian_Shan_o
3.0 IRN_HotuIIIb_Meso
1.2 RUS_Yana_MA
1.2 VK2020_NOR_North_VA_o1
0.8 VUT_2300BP_all
0.2 RUS_Sintashta_MLBA_o3

Target: Uyelgi2_scaled (d=.099)
34.0 RUS_Srubnaya_MLBA_o
30.6 RUS_Krasnoyarsk_BA
13.4 VK2020_NOR_North_VA_o1
9.4 DEU_MA_o
7.0 RUS_AfontovaGora3
5.6 VUT_2300BP_all

(DEU_MA_o which is the main source of wog ancestry for Uyelgi2 is similar to modern Italians.)


Yup there is minor wog blood in BOO. However the BOO outlier sample if I remembered correctly, have almost zero Neolithic contamination.

The outlier has a bit of wog too but it's just more kra001:


Target                      Distance  KAZ_Botai  NOR_N_HG  RUS_Krasnoyarsk_BA  RUS_Samara_HG  TUR_Barcin_N
RUS_Bolshoy_Oleni_Ostrov    .016      28.4       19.4      34.4                13.4           4.4
RUS_Bolshoy_Oleni_Ostrov_o  .021      26.2       13.4      47.4                10.0           3.0


The outlier gets a good fit as 20% kra001 and 80% regular BOO:

Target: RUS_Bolshoy_Oleni_Ostrov_o
d=.018 - 79% RUS_Bolshoy_Oleni_Ostrov + 21% RUS_Krasnoyarsk_BA
d=.029 - 47% RUS_Samara_HG + 53% RUS_Krasnoyarsk_BA
d=.034 - 48% RUS_Karelia_HG + 52% RUS_Krasnoyarsk_BA
d=.034 - 48% RUS_Sidelkino_HG + 52% RUS_Krasnoyarsk_BA
d=.035 - 47% RUS_Khvalynsk_En + 53% RUS_Krasnoyarsk_BA
d=.036 - 52% RUS_Sintashta_MLBA_o3 + 48% RUS_Krasnoyarsk_BA
d=.038 - 47% RUS_Volga-Kama_N + 53% RUS_Krasnoyarsk_BA
d=.040 - 46% RUS_Veretye_Meso + 54% RUS_Krasnoyarsk_BA

vbnetkhio
03-05-2021, 07:23 PM
We can get an extremely good fit for Chuvashes by modeling them as 60% Mari and 40% Mishar (d=.009) or 70% Mari and 30% Mordovian (d=.013). Doesn't it mean that even if Maris are drifted, their drift is shared by Chuvashes?

It's the opposite, Chuvashes have Mari drift.
There are 3 Uralic drifts in G25, a Khanty, Mari and a Nganasan drift. That means if you want low distances, you should use those 3 as sources, even when modelling ancient samples. It's another question if this is historically accurate.

Komintasavalta
03-08-2021, 05:59 PM
It's the opposite, Chuvashes have Mari drift.
There are 3 Uralic drifts in G25, a Khanty, Mari and a Nganasan drift. That means if you want low distances, you should use those 3 as sources, even when modelling ancient samples. It's another question if this is historically accurate.

There's also a cline with Selkups and Kets on PC4 below. If I didn't include Kets, Selkups were at the end of the cline on PC4.

https://i.ibb.co/wBSWM0Y/pc34.png

It's interesting how on PCs 1 and 2, there is a sequence of north-south clusters that progress from west to east. In the first column, there are actually two clusters, but Latvians are at the top of the more northern cluster and Hungarians are at the bottom of the more southern cluster. In the second column, Kola Saami are at the top and Mishars are at the bottom. Next non-Kola Saami are at the top and Bashkirs are at the bottom. Then there's again a column with two or three clusters, where Khanty and Mansi are at the top, Swamp Tatars are in the middle, and other Siberian Tatars and Bashkirs are at the bottom.

https://i.ibb.co/XL30pz9/pc12.png

Then on PC5 Maris are again at one extreme and Kets are at the other extreme:

https://i.ibb.co/2dBDZfT/pc56.png


library(tidyverse)
library(colorspace)

download.file("https://drive.google.com/uc?export=download&id=1HYrDwxEXv82DvDLoq736pS5ZTGJA4dn5","modernind")
t=read.csv("modernind",header=T,row.names=1)
pick=c("Bashkir","Besermyan","Chuvash","Estonian","Finnish","Finnish_East","Hungarian","Ingrian","Karelian","Ket","Khanty","Komi","Latvian","Mansi","Mari","Mordovian","Nenets","Nganassan","Norwegian","Saami","Saami_Kola","Selkup","Swedish","Tatar_Kazan","Tatar_Mishar","Tatar_Siberian","Tatar_Siberian_Zabolotniye","Udmurt","Vepsian")
t=t[sub(":.*","",row.names(t))%in%pick,]

k=cutree(hclust(dist(t)),k=12)
p=prcomp(t)
p2=as.data.frame(p$x)
p2$cluster=as.vector(k)
write.csv(k,"clusters",quote=F)
# percentage of variance explained by each PC (variance is sdev squared)
pct=paste0(colnames(p$x)," (",sprintf("%.1f",p$sdev^2/sum(p$sdev^2)*100),"%)")

ggplot(p2,aes(x=-PC1,y=-PC2))+
geom_point(aes(color=as.factor(cluster)),size=.5)+
geom_polygon(data=p2%>%group_by(cluster)%>%slice(chull(PC1,PC2)),alpha=.2,aes(color=as.factor(cluster),fill=as.factor(cluster)),size=.3)+
geom_text(label=rownames(p2),aes(color=as.factor(cluster)),size=2.2,vjust=-.7)+
theme(
aspect.ratio=3/4,
axis.text=element_text(color="black"),
axis.ticks.length=unit(0,"pt"),
axis.ticks.x=element_blank(),
axis.ticks.y=element_blank(),
legend.position="none",
panel.background=element_rect(fill="white"),
panel.grid.major=element_line(color="gray75",size=.2),
panel.grid.minor=element_line(color="white",size=.13),
plot.background=element_rect(fill="white"),
text=element_text(color="black")

)+
scale_x_continuous(breaks=seq(-1,1,.05),expand=expansion(mult=.12))+
scale_y_continuous(breaks=seq(-1,1,.05),expand=expansion(mult=.08))+
xlab(pct[1])+ylab(pct[2])+
scale_color_discrete_qualitative(palette="Set 2",c=80,l=40)

# ggsave() is not a plot layer, so it is called on its own; it saves the last plot
ggsave("output.png")

system("/usr/local/bin/mogrify -trim -bordercolor white -border 20x20 output.png")
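
For the percentages in the axis labels, the usual convention is the share of total variance per component, which is `sdev` squared divided by the sum of the squared `sdev` values. Below is a minimal sketch of that calculation on random toy data standing in for the coordinate table. The name `var_share` is made up for illustration.

```r
set.seed(1)
x=matrix(rnorm(200),ncol=4)   # toy data: 50 samples, 4 dimensions
p=prcomp(x)

# prcomp() returns standard deviations, so square them to get variances
var_share=p$sdev^2/sum(p$sdev^2)

# Axis labels in the form "PC1 (xx.x%)"
pct=paste0(colnames(p$x)," (",sprintf("%.1f",var_share*100),"%)")
print(pct)
```

The same proportions appear in the "Proportion of Variance" row of `summary(p)$importance`, which is a quick way to cross-check the labels.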