
Thread: Ukrainian academic regional k13 results - more than 100

  1. #391
    Veteran Member
    Join Date
    Jul 2019
    Last Online
    03-11-2024 @ 05:25 PM
    Ethnicity
    Unknown
    Country
    Antarctica
    Gender
    Posts
    3,910
    Thumbs Up/Down
    Received: 3,464/7
    Given: 1,535/1

    Quote Originally Posted by Sandis View Post
    I created a PCA with all the samples except 7 outliers and the samples that were posted today.
    I also added oblast averages and assigned a color to each region: Western, Central, Eastern, Southern.
    Rivne actually clusters with the Central oblasts, but I left the original division:

    Could you post a larger version?

  2. #392
    The truth is somewhere out Sandis's Avatar
    Join Date
    Feb 2012
    Last Online
    08-06-2025 @ 09:49 PM
    Meta-Ethnicity
    Baltic, Finno-Ugric
    Ethnicity
    Latvian
    Country
    Latvia
    Politics
    Anti-imperialism
    Gender
    Posts
    1,436
    Thumbs Up/Down
    Received: 1,630/13
    Given: 1,115/6

    Quote Originally Posted by vbnetkhio View Post
    Could you post a larger version?
    Something went wrong, but now it's fully linked. Click on the link: https://ibb.co/M9msgPX

  3. #393
    Veteran Member
    Join Date
    Jul 2019
    Last Online
    03-11-2024 @ 05:25 PM
    Ethnicity
    Unknown
    Country
    Antarctica
    Gender
    Posts
    3,910
    Thumbs Up/Down
    Received: 3,464/7
    Given: 1,535/1

    Quote Originally Posted by Sandis View Post
    Something went wrong, but now it's fully linked. Click on the link: https://ibb.co/M9msgPX
    This is a weird PCA. All the averages seem to plot on the edge of the cluster? Which program did you make it in?

    you can try copying this data:
    https://www.theapricity.com/forum/sh...=1#post7179863

    and just paste it here:
    https://vahaduo.github.io/custompca/

    and compare it to this one.

  4. #394
    The truth is somewhere out Sandis's Avatar
    Join Date
    Feb 2012
    Last Online
    08-06-2025 @ 09:49 PM
    Meta-Ethnicity
    Baltic, Finno-Ugric
    Ethnicity
    Latvian
    Country
    Latvia
    Politics
    Anti-imperialism
    Gender
    Posts
    1,436
    Thumbs Up/Down
    Received: 1,630/13
    Given: 1,115/6

    Quote Originally Posted by vbnetkhio View Post
    This is a weird PCA. All the averages seem to plot on the edge of the cluster? Which program did you make it in?

    you can try copying this data:
    https://www.theapricity.com/forum/sh...=1#post7179863

    and just paste it here:
    https://vahaduo.github.io/custompca/

    and compare it to this one.
    For PCA generation I use algorithms created by Estonian colleagues: https://biit.cs.ut.ee/
    My Estonian cousins are good at mathematics.

    The Central/Eastern/Southern oblast averages are close to each other because the distances between oblast averages are smaller than the distances between individual samples. The plot is based only on distances, so the oblast averages can also form a cluster of their own.
    Maybe I shouldn't include the averages in the same PCA. I recommend viewing individuals and averages as separate entities.
    Of course, it's better when each average sits at the center of its cluster.
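
    One way to keep individuals and averages separate, sketched in R under assumptions: fit the PCA on the individuals only, then project the oblast averages into that space with predict(). The data below is simulated, and the names ind, oblast and avg_mat are made up for illustration; this is not the thread's data or the ClustVis workflow.

    ```r
    set.seed(1)
    # Simulated stand-in for K13 rows: 30 individuals x 4 components (not real data)
    ind <- matrix(rnorm(120), nrow = 30,
                  dimnames = list(NULL, paste0("K", 1:4)))
    oblast <- rep(c("A", "B", "C"), each = 10)   # hypothetical region labels

    # Fit the PCA on individuals only, so the averages do not influence the axes
    p <- prcomp(ind)

    # Per-region averages of the raw data
    avg <- aggregate(as.data.frame(ind), by = list(oblast = oblast), FUN = mean)
    avg_mat <- as.matrix(avg[, -1])
    rownames(avg_mat) <- avg$oblast

    # Project the averages into the individuals' PCA space
    avg_scores <- predict(p, newdata = avg_mat)

    # The projection is linear, so this matches the mean of the individual scores
    score_means <- aggregate(as.data.frame(p$x), by = list(oblast = oblast), FUN = mean)
    ```

    Done this way, each average lands at the centroid of its own samples on every pair of axes.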
    Last edited by Sandis; 05-27-2021 at 07:30 PM.

  5. #395
    Banned
    Join Date
    Sep 2020
    Last Online
    09-12-2023 @ 04:47 PM
    Location
    コミ共和国
    Meta-Ethnicity
    Finno-Permic
    Ethnicity
    Peasant
    Ancestry
    コミ
    Country
    Finland
    Taxonomy
    Karaboğa (euryprosopic, platyrrhine, dolichocephalic)
    Relationship Status
    Virgin
    Gender
    Posts
    2,150
    Thumbs Up/Down
    Received: 4,863/123
    Given: 2,945/0

    Did you use a distance matrix of the samples as an input for the PCA? Because that produces a U-shaped PCA plot like your plot. You're only supposed to calculate a distance matrix for MDS but not PCA.
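
    A minimal sketch of that difference, on simulated data rather than the thread's samples (x is a made-up table): classical MDS on Euclidean distances gives the same configuration as a PCA of the raw table, up to the sign of each axis, while a PCA run on the distance matrix itself is a different analysis.

    ```r
    set.seed(1)
    x <- matrix(rnorm(200), nrow = 40)   # simulated samples, 40 x 5 (not real data)

    pca <- prcomp(x)$x[, 1:2]            # PCA on the raw table
    mds <- cmdscale(dist(x), k = 2)      # classical MDS on Euclidean distances

    # Same configuration up to the sign of each axis
    same <- isTRUE(all.equal(abs(unname(pca)), abs(unname(mds))))

    # PCA of the distance matrix treats each column of distances as a variable,
    # so its coordinates are not comparable to the two results above
    pca_of_dist <- prcomp(as.matrix(dist(x)))$x[, 1:2]
    ```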

    I don't know if you did it already, but in the "Data pre-processing" tab of ClustVis, you also need to change "Row scaling" from "unit variance scaling" to "no scaling". The default behavior is to convert the columns of the input table to z-scores, which is useful when the columns of the table contain data in different scales, like for example in a table of anthropological measurements. But it's not needed here.
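
    In R terms the two settings roughly correspond to the scale. argument of prcomp(), assuming ClustVis's "unit variance scaling" is plain z-scoring as described above. The table below is simulated, with deliberately mismatched column scales, to show what the scaling changes.

    ```r
    set.seed(1)
    # Simulated table with columns on very different scales (not real data)
    x <- cbind(big = rnorm(50, sd = 20), small = rnorm(50, sd = 1))

    z <- scale(x)                            # z-scores: each column gets mean 0, sd 1
    p_scaled   <- prcomp(x, scale. = TRUE)   # same result as prcomp(z)
    p_unscaled <- prcomp(x)                  # "no scaling": columns are only centered

    # Without scaling, the high-variance column dominates PC1
    loading_big <- abs(p_unscaled$rotation["big", "PC1"])
    ```

    K13 components are all percentages on the same scale, so the unscaled version keeps the real differences in spread between components.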

    Anyway, below is a plot where I didn't calculate a distance matrix before the PCA. I excluded samples with the suffix "_o" because some of them stretched the PCA too much. There's so much overlap between the regions that I didn't draw hulls around the samples from each region, because there would've been too many overlapping hulls.



    Code:
    library(tidyverse)

    # Read the K13 table (no header, sample names in column 1); percentages to fractions
    t=read.csv("https://pastebin.com/raw/8JLJg8DE",F,row.names=1)/100

    # Drop outliers (names containing "_o") and one extra sample;
    # grepl avoids dropping every row when nothing matches
    t=t[!grepl("_o",rownames(t)),]
    t=t[!rownames(t)%in%c("Khmelnytska:Yarmolyntsi_apricity"),]

    # t=as.matrix(dist(t)) # compare to a PCA of a distance matrix

    # PCA on the raw table (centered, not scaled)
    p=prcomp(t)
    p2=as.data.frame(p$x)
    # Axis labels with percent of variance explained (variance is sdev squared)
    pct=paste0(colnames(p$x)," (",sprintf("%.1f",p$sdev^2/sum(p$sdev^2)*100),"%)")

    # Range of each PC (not used below)
    ranges=apply(p2,2,function(x)abs(max(x)-min(x)))

    # Population = part of the sample name before ":"; centers = mean PC scores per population
    pop=sub(":.*","",rownames(p2))
    centers=data.frame(aggregate(p2,by=list(pop),mean),row.names=1)
    p2$pop=pop

    # Shuffle color indices so neighboring populations do not get similar hues
    set.seed(0)
    color=as.factor(sample(seq(1,length(unique(p2$pop)))))
    col=rbind(c(70,80),c(50,50),c(90,50)) # chroma/luminance pairs
    hues=max(ceiling(length(color)/nrow(col)),2)
    pal1=as.vector(apply(col,1,function(x)hcl(head(seq(15,375,length=hues+1),-1),x[1],x[2]))) # point and label fill colors
    pal2=as.vector(apply(col,1,function(x)hcl(rep(0,hues),0,ifelse(x[2]>=50,0,1)))) # label text colors

    # Segments from each center to its nearest centers, with distances taken in the
    # full K13 space; j=1 is the population itself, so it gives a zero-length segment
    i=1
    cen2=data.frame(aggregate(t,by=list(pop),mean),row.names=1)
    dist2=as.data.frame(as.matrix(dist(cen2)))
    seg0=lapply(1:3,function(j)apply(dist2,1,function(x)unlist(centers[names(sort(x)[j]),c(i,i+1)],use.names=F))%>%t%>%cbind(centers[,c(i,i+1)]))
    seg=do.call(rbind,seg0)%>%setNames(paste0("V",1:4))

    # Plot PC i against PC i+1
    xpc=sym(paste0("PC",i))
    ypc=sym(paste0("PC",i+1))

    ggplot(p2,aes(!!xpc,!!ypc))+
    geom_segment(data=seg,aes(x=V1,y=V2,xend=V3,yend=V4),color="gray80",size=.3)+
    geom_point(aes(x=!!xpc,y=!!ypc),color=pal1[color[as.factor(p2$pop)]],size=.3)+
    geom_text(aes(x=!!xpc,y=!!ypc,label=rownames(p2)),color=pal1[color[as.factor(p2$pop)]],size=2,vjust=-.6)+
    geom_point(data=centers,aes(x=!!xpc,y=!!ypc),color=pal1[color],size=2)+
    geom_label(data=centers,aes(x=!!xpc,y=!!ypc,label=rownames(centers)),color=pal2[color],fill=pal1[color],alpha=.8,size=2.2,label.r=unit(.1,"lines"),label.padding=unit(.1,"lines"),label.size=0)+
    labs(x=pct[i],y=pct[i+1])+
    coord_fixed()+ # same scale on both axes so distances are not distorted
    scale_x_continuous(breaks=seq(-1,1,.05))+
    scale_y_continuous(breaks=seq(-1,1,.05))+
    theme(
      axis.text.x=element_text(margin=margin(.2,0,0,0,"cm")),
      axis.text.y=element_text(angle=90,vjust=1,hjust=.5,margin=margin(0,.2,0,0,"cm")),
      axis.text=element_text(color="black",size=6),
      axis.ticks.length=unit(-.13,"cm"),
      axis.ticks=element_line(size=.3,color="gray80"),
      axis.title=element_text(color="black",size=8),
      legend.position="none",
      panel.background=element_rect(fill="white"),
      panel.border=element_rect(color="gray80",fill=NA,size=.6),
      panel.grid.major=element_line(color="white",size=.2),
      panel.grid.minor=element_blank(),
      plot.background=element_rect(fill="white",color=NA)
    )

    ggsave("a.png",width=11,height=11)
    # Trim the margins and add a white border (needs ImageMagick installed)
    system("mogrify -trim -bordercolor white -border 16 a.png")
    Last edited by Komintasavalta; 05-28-2021 at 01:16 PM.

  6. #396
    The truth is somewhere out Sandis's Avatar
    Join Date
    Feb 2012
    Last Online
    08-06-2025 @ 09:49 PM
    Meta-Ethnicity
    Baltic, Finno-Ugric
    Ethnicity
    Latvian
    Country
    Latvia
    Politics
    Anti-imperialism
    Gender
    Posts
    1,436
    Thumbs Up/Down
    Received: 1,630/13
    Given: 1,115/6

    Quote Originally Posted by Komintasavalta View Post
    Did you use a distance matrix of the samples as an input for the PCA? Because that produces a U-shaped PCA plot like your plot. You're only supposed to calculate a distance matrix for MDS but not PCA.

    I got a rounder shape. I will remake the plot. The original titles are too long and hard to read, so I will replace them with shorter ones.


  7. #397
    Banned
    Join Date
    Jun 2017
    Last Online
    10-29-2025 @ 12:41 PM
    Ethnicity
    Healthy human being
    Country
    Moldova
    Gender
    Posts
    5,572
    Thumbs Up/Down
    Received: 5,514/44
    Given: 1,505/11

    From Vinnytsia oblast (Bridok village)

    Bridok,27.02,46,8.59,7.28,3.25,3.1,2.29,0,0.81,0,0.11,0.66,0.88

    Distance to: Bridok
    3.75664744 Russian_Smolensk
    4.82831233 Ukrainian
    4.84645231 Russian_Southwest
    5.05186104 Polish
    5.07276059 Polish_Kielce
    5.57916660 Belarusian_Minsk
    5.59009839 Russian_average
    6.19462670 Belorussian
    7.38395558 Polish_Masuria
    7.39628961 South_Polish
    7.68490729 Russian_Kostroma
    8.00220595 Mordovian
    8.01975062 Russian_Kargopol
    8.55432055 Moldova_Ukrainian
    8.81542398 Ukrainian_Galicia
    9.48556271 Lithuanian
    9.48887243 Sorb_Lusatia
    10.19104999 Russian_Northern_Dvina
    10.53536426 Estonian
    11.56948141 Ukrainian_Carpathian
    11.93811962 Slovak
    12.19875403 Finnish
    12.34531085 East_Finnish
    12.59966666 Czech
    13.86463847 Latvian

    https://anthrogenica.com/showthread....ch-in-comments

  8. #398
    Banned
    Join Date
    Jun 2017
    Last Online
    10-29-2025 @ 12:41 PM
    Ethnicity
    Healthy human being
    Country
    Moldova
    Gender
    Posts
    5,572
    Thumbs Up/Down
    Received: 5,514/44
    Given: 1,505/11

    Russian Old Believers (Lipovans) from Vinnytsia oblast

    North_Atlantic 26.67 Pct
    Baltic 51.35 Pct
    West_Med 8.95 Pct
    West_Asian 4.87 Pct
    East_Med 3.42 Pct
    Red_Sea -
    South_Asian -
    East_Asian -
    Siberian 2.62 Pct
    Amerindian 1.01 Pct
    Oceanian -
    Northeast_African 0.42 Pct
    Sub-Saharan 0.67 Pct


    North_Atlantic 22.55 Pct
    Baltic 51.89 Pct
    West_Med 12.59 Pct
    West_Asian 3.45 Pct
    East_Med 4.5 Pct
    Red_Sea -
    South_Asian -
    East_Asian -
    Siberian 2.61 Pct
    Amerindian 0.88 Pct
    Oceanian 0.21 Pct
    Northeast_African 0.29 Pct
    Sub-Saharan 1.03 Pct


    North_Atlantic 26.43 Pct
    Baltic 51.14 Pct
    West_Med 11.11 Pct
    West_Asian 5.56 Pct
    East_Med -
    Red_Sea -
    South_Asian 1.05 Pct
    East_Asian 0.32 Pct
    Siberian 1.61 Pct
    Amerindian 1.37 Pct
    Oceanian 0.57 Pct
    Northeast_African 0.84 Pct
    Sub-Saharan -


    North_Atlantic 30.14 Pct
    Baltic 49.7 Pct
    West_Med 6.47 Pct
    West_Asian 1.2 Pct
    East_Med 5.93 Pct
    Red_Sea 1.72 Pct
    South_Asian 0.81 Pct
    East_Asian -
    Siberian 2.04 Pct
    Amerindian 0.34 Pct
    Oceanian 1.65 Pct
    Northeast_African -
    Sub-Saharan -
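
    For anyone who wants to compare these four results with the Bridok sample, here is a small R sketch. It assumes the "Distance to" lists are plain Euclidean distances between K13 percentage vectors, that the Bridok line uses the same component order as the lists above, and that "-" means 0. The row names Lipovan_1 to Lipovan_4 are labels made up for this example.

    ```r
    comp <- c("North_Atlantic", "Baltic", "West_Med", "West_Asian", "East_Med",
              "Red_Sea", "South_Asian", "East_Asian", "Siberian", "Amerindian",
              "Oceanian", "Northeast_African", "Sub-Saharan")

    # Values copied from the posts; "-" entered as 0 (assumption)
    k13 <- rbind(
      Bridok    = c(27.02, 46,    8.59,  7.28, 3.25, 3.1,  2.29, 0,    0.81, 0,    0.11, 0.66, 0.88),
      Lipovan_1 = c(26.67, 51.35, 8.95,  4.87, 3.42, 0,    0,    0,    2.62, 1.01, 0,    0.42, 0.67),
      Lipovan_2 = c(22.55, 51.89, 12.59, 3.45, 4.5,  0,    0,    0,    2.61, 0.88, 0.21, 0.29, 1.03),
      Lipovan_3 = c(26.43, 51.14, 11.11, 5.56, 0,    0,    1.05, 0.32, 1.61, 1.37, 0.57, 0.84, 0),
      Lipovan_4 = c(30.14, 49.7,  6.47,  1.2,  5.93, 1.72, 0.81, 0,    2.04, 0.34, 1.65, 0,    0)
    )
    colnames(k13) <- comp

    # Pairwise Euclidean distances between all five samples
    d <- as.matrix(dist(k13))

    # Distances from Bridok to the others, rounded for reading
    round(d["Bridok", ], 2)
    ```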

  9. #399
    Member
    Join Date
    Oct 2021
    Last Online
    04-01-2024 @ 01:45 PM
    Meta-Ethnicity
    Slavic, Romance
    Ethnicity
    Ukrainian, Romanian
    Country
    Canada
    Gender
    Posts
    100
    Thumbs Up/Down
    Received: 32/0
    Given: 18/0

    Quote Originally Posted by vbnetkhio View Post
    This is my main takeaway from this data:


    from this PCA I removed the regions which seemed to be outlying because of a low sample number.

    Northern(Polesye) and Eastern Ukraine are very homogenous, and probably identical to proto-Slavs.
    Southwestern Ukraine is pulled towards a mix of Hungarians and Romanians best represented by the Hungarian-Transylvania average.

    It seems this admixture isn't just limited to the border areas, it spreads much deeper into the country than I thought.

    Data is missing from the area between Moldova and Dnipropetrovsk.
    If you read about the population movements in Ukraine (and everywhere else), it is impossible for any population to represent one from 1500 years ago.

    The vast majority of Ukraine around 1600, including the Kyiv region (everything except the west and southern Podolia), was completely unpopulated after the Mongol invasions and occupation, and was then repopulated by people from western Ukraine, Poland, Belarus, and later Russia. Not to mention the frequent immigration from Poland into the west, or of Russian serfs (after mixing with native Russian populations) into the Cossacks.

    The reason why we might see the north and east cluster together is perhaps because similar groups came together and blended to form a similar average. At least with regard to the parameters that are measured in tests like this.

  10. #400
    Member
    Join Date
    Oct 2021
    Last Online
    04-01-2024 @ 01:45 PM
    Meta-Ethnicity
    Slavic, Romance
    Ethnicity
    Ukrainian, Romanian
    Country
    Canada
    Gender
    Posts
    100
    Thumbs Up/Down
    Received: 32/0
    Given: 18/0

    Quote Originally Posted by Stearsolina View Post
    The new Ukrainian average comes closest to SW Russians, and I think it's an excellent modern proxy for early Slavic input.
    It clusters somewhere between Lithuanians and North Moldovans, so between Balts and northern Balkanites. I think that's what early Slavs were like.

    Belarusians, Smolensk Russians and Masurians are IMO Baltic-admixed, while the Polish average has slight East Germanic input (still very Slavic nevertheless).

    South Slavs strongly prefer Ukrainians as their Slavic proxy in my modeling, and Vbn did an excellent thing in opting for the average of the main Ukrainian genetic cluster instead of an all-country average, which wouldn't be so pure.
    Again, trying to establish any purity this way is a bit ignorant. The population movements have been far too significant for this sense of purity to be realistic.

    There are different reasons why the southwest would plot differently than the rest of the nation.

    1) Russians from far north did not colonize the southwest like they did regions in the north, east, and south
    2) Masses of Russian serfs did not move to the southwest to join the Cossacks like they did in other regions
    3) The southwest was not part of the Russian Empire from the 1700s until the 1900s, so there wasn't as much northern genetic addition during that period for that region
    4) The southwest has always been very population dense even going back to the middle ages, so modern population movements to there from, for example, Poland or pre-Soviet Russia or even the Soviet Union weren't going to have as much of an effect as they would in historically sparsely populated regions, where it was easy to completely displace whatever native population existed.

    Galicia has actually had one of the more stable populations in all of Ukraine whereas most of the present nation is made up largely of more modern people moving there, mostly from more northern regions as Ukraine was typically at the southern periphery of whatever empire it was a part of (on that note, this probably even occurred to some extent during the Kievan Rus period as Ukraine was then also at the southern periphery of its empire). For linguistic reasons, it was also more likely for a Belarusian or Russian to move to neighbouring parts of Ukraine during the Soviet era than for a Romance speaker from the south to move north. In most of the regions of Ukraine, there was much more potential for the addition of more northern admixture than there was for more southern admixture.

    These two-dimensional plots that you guys are posting are just pseudoscience because, if you take a Polish person who we presume is 1/2 Slavic + 1/2 German and mix them with a Russian who we presume is 1/2 Slavic + 1/2 Finno-Ugric, we end up with someone who plots directly in the middle and who you might assume is 100% super-Slavic. But, in this case, the resulting combination would still only be 1/2 Slavic. In most of Ukraine, people from southwestern Ukraine, Poland, Belarus, and especially Russia (which contributed massively to the modern Ukrainian gene pool) mixed together. This naturally resulted in an intermediate population that plots between the others, but this intermediate is no more Slavic than its constituent parts. K13 obviously has these same limitations, and it is rather useless if not considered with historical context at the very least. And it definitely can't be used to establish proxies of populations that existed 1500 years ago, especially somewhere like Ukraine, which underwent so many population shifts and where people were constantly moving around and mixing from different areas.
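
    To put numbers on that example, a small R sketch. The ancestry fractions are the hypothetical ones from the paragraph above, and the two plot axes are made up for illustration (not a real PCA).

    ```r
    # Hypothetical ancestry fractions from the example above
    polish  <- c(Slavic = 0.5, German = 0.5, FinnoUgric = 0)
    russian <- c(Slavic = 0.5, German = 0,   FinnoUgric = 0.5)

    # A 50/50 mix is the component-wise average
    mixed <- (polish + russian) / 2   # Slavic 0.5, German 0.25, FinnoUgric 0.25

    # Any linear 2D projection puts the mix exactly at the midpoint of its parents
    axes <- cbind(x = c(0, -1, 1), y = c(1, 0, 0))   # illustrative axes only
    pts  <- rbind(polish, russian, mixed) %*% axes
    ```

    The mixed point sits halfway between the two parents on the plot, while its Slavic fraction is still 0.5.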
    Last edited by Vega7; 02-22-2022 at 05:48 PM.
