This is a weird PCA. All the averages seem to plot on the edge of the cluster? Which program did you make it in?
you can try copying this data:
https://www.theapricity.com/forum/sh...=1#post7179863
and just paste it here:
https://vahaduo.github.io/custompca/
and compare it to this one.


For PCA generation I use algorithms created by Estonian colleagues: https://biit.cs.ut.ee/
My Estonian cousins are good at mathematics.
The Central/Eastern/Southern oblast averages are close to each other because the distances between oblasts are smaller than the distances between particular individuals. The plot is based only on distances, so the oblasts can also form a cluster of their own.
Maybe I shouldn't include averages in the same PCA. I recommend viewing individuals and averages as separate entities.
Of course, it's better when the averages are superimposed on the center of the cluster.
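A minimal sketch of why averages should land at the center of their cluster when the PCA is run on the raw table rather than on distances. The data here are random toy values, and the names `ind` and `grp` are made up for illustration; they are not the oblast data from this thread. Because PCA is a linear projection, projecting a group average gives the same point as averaging the projected individuals.

```r
# Toy data: 40 hypothetical individuals x 4 hypothetical components
set.seed(3)
ind <- matrix(rnorm(40 * 4), nrow = 40)
grp <- rep(c("A", "B"), each = 20)   # hypothetical oblast labels

# PCA fitted on the individuals only (centered, no scaling)
p <- prcomp(ind)

# Option 1: average the PCA coordinates of each group
mean_of_scores <- rowsum(p$x, grp) / 20

# Option 2: average the raw rows first, then project the averages
avg <- rowsum(ind, grp) / 20
score_of_means <- predict(p, avg)

# Largest difference between the two options (floating-point noise only)
max_diff <- max(abs(mean_of_scores - score_of_means))
print(max_diff)
```

If averages plot on the edge of their own cluster instead, that suggests the input was transformed in some non-linear way before the PCA.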
Last edited by Sandis; 05-27-2021 at 07:30 PM.


Did you use a distance matrix of the samples as an input for the PCA? Because that produces a U-shaped PCA plot like your plot. You're only supposed to calculate a distance matrix for MDS but not PCA.
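A small sketch of that difference, assuming plain Euclidean distances and using random toy data (the matrix `x` is made up, not the K13 table). Classical MDS of the distance matrix reproduces the PCA coordinates of the raw table up to the sign of each axis, while a PCA run on the distance matrix itself gives a different configuration.

```r
# Toy data: 30 hypothetical samples x 5 hypothetical components
set.seed(1)
x <- matrix(rnorm(30 * 5), nrow = 30)

# PCA on the raw table (centered, no scaling)
pca_raw <- prcomp(x)$x[, 1:2]

# Classical MDS takes the distance matrix as its input
mds <- cmdscale(dist(x), k = 2)

# PCA on the distance matrix treats every column of distances as a variable
pca_dist <- prcomp(as.matrix(dist(x)))$x[, 1:2]

# Compare absolute coordinates so that axis sign flips are ignored
agree_mds <- max(abs(abs(pca_raw) - abs(mds)))
agree_dist <- max(abs(abs(pca_raw) - abs(pca_dist)))
print(c(mds = agree_mds, dist = agree_dist))
```

The first number should be floating-point noise and the second should not, which is the sense in which a distance matrix belongs to MDS and not to PCA.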
I don't know if you did it already, but in the "Data pre-processing" tab of ClustVis, you also need to change "Row scaling" from "unit variance scaling" to "no scaling". The default behavior is to convert the columns of the input table to z-scores, which is useful when the columns of the table contain data in different scales, like for example in a table of anthropological measurements. But it's not needed here.
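To illustrate what z-scoring does, here is a sketch in base R with a made-up two-column table. It assumes that `scale.=TRUE` in `prcomp` plays the same role as "unit variance scaling" in ClustVis, which I have not verified against the ClustVis source. With no scaling the high-variance column dominates PC1; with z-scores a tiny noisy column gets the same weight as the large one.

```r
# Hypothetical admixture-like table: one large-variance and one tiny-variance column
set.seed(2)
tab <- cbind(major = rnorm(50, 50, 5), minor = rnorm(50, 1, 0.2))

# No scaling: PC1 is driven almost entirely by the high-variance column
load_raw <- abs(prcomp(tab)$rotation[, 1])

# Unit variance scaling: columns become z-scores, so both weigh equally on PC1
load_z <- abs(prcomp(tab, scale. = TRUE)$rotation[, 1])

print(round(rbind(raw = load_raw, zscore = load_z), 3))
```

For calculator output, where all columns are percentages on the same scale, inflating the small components this way mostly amplifies noise.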
Anyway, below is a plot where I didn't calculate a distance matrix before the PCA. I excluded samples with the suffix "_o" because some of them stretched the PCA too much. There's so much overlap between the regions that I didn't draw hulls around the samples from each region, because there would've been too many overlapping hulls.
Code:
library(tidyverse)

# K13 percentages converted to fractions; the first column holds the sample names
t=read.csv("https://pastebin.com/raw/8JLJg8DE",F,row.names=1)/100
t=t[-grep("_o",rownames(t)),] # drop samples whose name contains "_o"
t=t[!rownames(t)%in%c("Khmelnytska:Yarmolyntsi_apricity"),]
# t=as.matrix(dist(t)) # compare to a PCA of a distance matrix

p=prcomp(t)
p2=as.data.frame(p$x)
# percentage of variance explained by each PC (variance is sdev squared)
pct=paste0(colnames(p$x)," (",sprintf("%.1f",p$sdev^2/sum(p$sdev^2)*100),"%)")
ranges=apply(p2,2,function(x)abs(max(x)-min(x)))
pop=sub(":.*","",rownames(p2)) # region name is the part before the colon
centers=data.frame(aggregate(p2,by=list(pop),mean),row.names=1) # region averages in PCA space
p2$pop=pop

# one random but reproducible color per region
set.seed(0)
color=as.factor(sample(seq(1,length(unique(p2$pop)))))
col=rbind(c(70,80),c(50,50),c(90,50))
hues=max(ceiling(length(color)/nrow(col)),2)
pal1=as.vector(apply(col,1,function(x)hcl(head(seq(15,375,length=hues+1),-1),x[1],x[2])))
pal2=as.vector(apply(col,1,function(x)hcl(rep(0,hues),0,ifelse(x[2]>=50,0,1))))

i=1 # index of the PC shown on the x axis
# link each region center to its nearest centers in the full-dimensional space
# (j=1 is the center itself, so two neighbors are drawn per region)
cen2=data.frame(aggregate(t,by=list(pop),mean),row.names=1)
dist2=as.data.frame(as.matrix(dist(cen2)))
seg0=lapply(1:3,function(j)apply(dist2,1,function(x)unlist(centers[names(sort(x)[j]),c(i,i+1)],use.names=F))%>%t%>%cbind(centers[,c(i,i+1)]))
seg=do.call(rbind,seg0)%>%setNames(paste0("V",1:4))

xpc=sym(paste0("PC",i))
ypc=sym(paste0("PC",i+1))

ggplot(p2,aes(!!xpc,!!ypc))+
  geom_segment(data=seg,aes(x=V1,y=V2,xend=V3,yend=V4),color="gray80",size=.3)+
  geom_point(aes(x=!!xpc,y=!!ypc),color=pal1[color[as.factor(p2$pop)]],size=.3)+
  geom_text(aes(x=!!xpc,y=!!ypc,label=rownames(p2)),color=pal1[color[as.factor(p2$pop)]],size=2,vjust=-.6)+
  geom_point(data=centers,aes(x=!!xpc,y=!!ypc),color=pal1[color],size=2)+
  geom_label(data=centers,aes(x=!!xpc,y=!!ypc,label=rownames(centers)),color=pal2[color],fill=pal1[color],alpha=.8,size=2.2,label.r=unit(.1,"lines"),label.padding=unit(.1,"lines"),label.size=0)+
  labs(x=pct[i],y=pct[i+1])+
  coord_fixed()+
  scale_x_continuous(breaks=seq(-1,1,.05))+
  scale_y_continuous(breaks=seq(-1,1,.05))+
  theme(
    axis.text.x=element_text(margin=margin(.2,0,0,0,"cm")),
    axis.text.y=element_text(angle=90,vjust=1,hjust=.5,margin=margin(0,.2,0,0,"cm")),
    axis.text=element_text(color="black",size=6),
    axis.ticks.length=unit(-.13,"cm"),
    axis.ticks=element_line(size=.3,color="gray80"),
    axis.title=element_text(color="black",size=8),
    legend.position="none",
    panel.background=element_rect(fill="white"),
    panel.border=element_rect(color="gray80",fill=NA,size=.6),
    panel.grid.major=element_line(color="white",size=.2),
    panel.grid.minor=element_blank(),
    plot.background=element_rect(fill="white",color=NA)
  )

ggsave("a.png",width=11,height=11)
# trim the margins and add a white border (requires ImageMagick)
system("mogrify -trim -bordercolor white -border 16 a.png")
Last edited by Komintasavalta; 05-28-2021 at 01:16 PM.




From Vinnytsia oblast (Bridok village)
Bridok,27.02,46,8.59,7.28,3.25,3.1,2.29,0,0.81,0,0.11,0.66,0.88
Distance to: Bridok
3.75664744 Russian_Smolensk
4.82831233 Ukrainian
4.84645231 Russian_Southwest
5.05186104 Polish
5.07276059 Polish_Kielce
5.57916660 Belarusian_Minsk
5.59009839 Russian_average
6.19462670 Belorussian
7.38395558 Polish_Masuria
7.39628961 South_Polish
7.68490729 Russian_Kostroma
8.00220595 Mordovian
8.01975062 Russian_Kargopol
8.55432055 Moldova_Ukrainian
8.81542398 Ukrainian_Galicia
9.48556271 Lithuanian
9.48887243 Sorb_Lusatia
10.19104999 Russian_Northern_Dvina
10.53536426 Estonian
11.56948141 Ukrainian_Carpathian
11.93811962 Slovak
12.19875403 Finnish
12.34531085 East_Finnish
12.59966666 Czech
13.86463847 Latvian
https://anthrogenica.com/showthread....ch-in-comments




Russian Old Believers (Lipovans) from Vinnytsia oblast
North_Atlantic 26.67 Pct
Baltic 51.35 Pct
West_Med 8.95 Pct
West_Asian 4.87 Pct
East_Med 3.42 Pct
Red_Sea -
South_Asian -
East_Asian -
Siberian 2.62 Pct
Amerindian 1.01 Pct
Oceanian -
Northeast_African 0.42 Pct
Sub-Saharan 0.67 Pct

North_Atlantic 22.55 Pct
Baltic 51.89 Pct
West_Med 12.59 Pct
West_Asian 3.45 Pct
East_Med 4.5 Pct
Red_Sea -
South_Asian -
East_Asian -
Siberian 2.61 Pct
Amerindian 0.88 Pct
Oceanian 0.21 Pct
Northeast_African 0.29 Pct
Sub-Saharan 1.03 Pct

North_Atlantic 26.43 Pct
Baltic 51.14 Pct
West_Med 11.11 Pct
West_Asian 5.56 Pct
East_Med -
Red_Sea -
South_Asian 1.05 Pct
East_Asian 0.32 Pct
Siberian 1.61 Pct
Amerindian 1.37 Pct
Oceanian 0.57 Pct
Northeast_African 0.84 Pct
Sub-Saharan -

North_Atlantic 30.14 Pct
Baltic 49.7 Pct
West_Med 6.47 Pct
West_Asian 1.2 Pct
East_Med 5.93 Pct
Red_Sea 1.72 Pct
South_Asian 0.81 Pct
East_Asian -
Siberian 2.04 Pct
Amerindian 0.34 Pct
Oceanian 1.65 Pct
Northeast_African -
Sub-Saharan -
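
The four results above can be compared with the Bridok row from the previous post in the same way as a "Distance to:" list. This is only a sketch under some assumptions: it uses plain Euclidean distance on the percentage values, treats "-" as 0, and the labels `Lipovan_1` to `Lipovan_4` are made up here just to tell the samples apart.

```r
components <- c("North_Atlantic", "Baltic", "West_Med", "West_Asian", "East_Med",
                "Red_Sea", "South_Asian", "East_Asian", "Siberian", "Amerindian",
                "Oceanian", "Northeast_African", "Sub-Saharan")

# K13 rows copied from the posts; "-" entered as 0 (assumption)
k13 <- rbind(
  Bridok    = c(27.02, 46,    8.59,  7.28, 3.25, 3.1,  2.29, 0,    0.81, 0,    0.11, 0.66, 0.88),
  Lipovan_1 = c(26.67, 51.35, 8.95,  4.87, 3.42, 0,    0,    0,    2.62, 1.01, 0,    0.42, 0.67),
  Lipovan_2 = c(22.55, 51.89, 12.59, 3.45, 4.5,  0,    0,    0,    2.61, 0.88, 0.21, 0.29, 1.03),
  Lipovan_3 = c(26.43, 51.14, 11.11, 5.56, 0,    0,    1.05, 0.32, 1.61, 1.37, 0.57, 0.84, 0),
  Lipovan_4 = c(30.14, 49.7,  6.47,  1.2,  5.93, 1.72, 0.81, 0,    2.04, 0.34, 1.65, 0,    0)
)
colnames(k13) <- components

# Full matrix of pairwise Euclidean distances between the rows
d <- as.matrix(dist(k13))

# Distances from Bridok to the other samples, closest first
to_bridok <- sort(d["Bridok", -1])
print(round(to_bridok, 4))
```

Individual results like these are noisy, so distances between single samples say less than distances between averages of many samples.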


If you read about the population movements in Ukraine (and everywhere else), it becomes clear that no modern population can represent one from 1500 years ago.
Around 1600, the vast majority of Ukraine, including the Kyiv region (everything except the west and southern Podolia), was completely depopulated after the Mongol invasions and occupation, and was then repopulated by people from western Ukraine, Poland, Belarus, and later Russia. Not to mention the frequent immigration from Poland into the west, or the Russian serfs (after mixing with native Russian populations) who joined the Cossacks.
The reason why we might see the north and east cluster together is perhaps that similar groups came together and blended to form a similar average, at least with regard to the parameters that are measured in tests like this.


Again, trying to establish any purity this way is a bit ignorant. The population movements have been far too significant for this sense of purity to be realistic.
There are different reasons why the southwest would plot differently than the rest of the nation.
1) Russians from the far north did not colonize the southwest like they did regions in the north, east, and south
2) Masses of Russian serfs did not move to the southwest to join the Cossacks like they did in other regions
3) The southwest was not part of the Russian Empire from the 1700s until the 1900s, so there wasn't as much northern genetic addition during that period for that region
4) The southwest has always been densely populated, even going back to the Middle Ages, so modern population movements into it from, for example, Poland, pre-Soviet Russia, or even the Soviet Union weren't going to have as much of an effect as they would in historically sparsely populated regions, where it was easy to completely displace whatever native population existed.
Galicia has actually had one of the more stable populations in all of Ukraine whereas most of the present nation is made up largely of more modern people moving there, mostly from more northern regions as Ukraine was typically at the southern periphery of whatever empire it was a part of (on that note, this probably even occurred to some extent during the Kievan Rus period as Ukraine was then also at the southern periphery of its empire). For linguistic reasons, it was also more likely for a Belarusian or Russian to move to neighbouring parts of Ukraine during the Soviet era than for a Romance speaker from the south to move north. In most of the regions of Ukraine, there was much more potential for the addition of more northern admixture than there was for more southern admixture.
These two-dimensional plots that you guys are posting are just pseudoscience because, if you take a Polish person who we presume is 1/2 Slavic + 1/2 German and mix them with a Russian who we presume is 1/2 Slavic + 1/2 Finno-Ugric, we end up with someone who plots directly in the middle and who you might assume looks 100% super-Slavic. But in this case the resulting combination would still be only 1/2 Slavic. In most of Ukraine, people from southwestern Ukraine, Poland, Belarus, and especially Russia mixed, and all of them contributed massively to the modern Ukrainian gene pool. This naturally resulted in an intermediate population that plots between the others, but this intermediate is no more Slavic than its constituent parts. K13 obviously has these same limitations, and it is rather useless if not considered with historical context at the very least. It definitely can't be used to establish proxies for populations that existed 1500 years ago, especially somewhere like Ukraine, which underwent so many population shifts and where people were constantly moving around and mixing from different areas.
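The midpoint argument can be put in numbers. This is only a sketch using the presumed fractions from the paragraph above, which are illustrative and not measured values: a 50/50 mix of the two parents projects exactly halfway between them on PC1, while its Slavic fraction stays at one half.

```r
# Presumed ancestry fractions from the argument above (illustrative only)
anc <- rbind(
  Polish  = c(Slavic = 0.5, German = 0.5, FinnoUgric = 0),
  Russian = c(Slavic = 0.5, German = 0,   FinnoUgric = 0.5)
)

# A person with one Polish and one Russian parent: the average of the two rows
mix <- colMeans(anc)

# PCA fitted on the two parents, with the mix projected into the same space
p <- prcomp(anc)
pc1 <- c(p$x[, 1], Mix = predict(p, rbind(mix))[1, 1])

print(round(pc1, 3))   # position of each of the three on PC1
print(mix)             # ancestry fractions of the mix
```

The position on the plot only says where a sample falls between the references, not which combination of sources put it there.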
Last edited by Vega7; 02-22-2022 at 05:48 PM.